
Principles of Programming Languages

Unit-1
Introduction:
A programming language is an artificial language designed to communicate instructions to a
machine, particularly a computer. Programming languages can be used to create programs that
control the behavior of a machine and/or to express algorithms precisely.
The earliest programming languages predate the invention of the computer, and were used to
direct the behavior of machines such as Jacquard looms and player pianos. Thousands of
different programming languages have been created, mainly in the computer field, with many
more being created every year. Most programming languages describe computation in an
imperative style, i.e., as a sequence of commands, although some languages, such as those that
support functional programming or logic programming, use alternative forms of description.
The description of a programming language is usually split into the two components of syntax
(form) and semantics (meaning). Some languages are defined by a specification document (for
example, the C programming language is specified by an ISO Standard), while other languages,
such as Perl 5 and earlier, have a dominant implementation that is used as a reference.

1. What is a name in programming languages?


A name is a mnemonic character string used to represent something else.
Names are a central feature of all programming languages. In the earliest programs, numbers
were used for all purposes, including machine addresses. Replacing numbers by symbolic names
was one of the first major improvements to program notation.
e.g. variables, constants, executable code (methods, procedures, subroutines, functions, even
whole programs), data types, classes, etc.
In general, names are of two different types:
1. Special symbols: +, -, *
2. Identifiers: sequences of alphanumeric characters (in most cases beginning with a letter), plus
in many cases a few special characters such as '_' or '$'.

2. What is binding?
A binding is an association between two things, such as a name and the thing it names
E.g. the binding of a class name to a class or a variables name to a variable
Static and Dynamic binding:
A binding is static if it first occurs before run time and remains unchanged throughout program
execution.
A binding is dynamic if it first occurs during execution or can change during execution of the
program.

3. What is binding time?


Binding time is the time when an association is established.
In programming, a name may have several attributes, and they may be bound at different times.
Example 1:
int n;
n = 6;
The first line binds the type int to n, and the second line binds the value 6 to n. The first binding
occurs when the program is compiled; the second binding occurs when the program is executed.
Example 2:
#include <stdio.h>

void f()
{
    int n = 7;
    printf("%d", n);
}

int main()
{
    int k;
    scanf("%d", &k);
    if (k > 0)
        f();
    return 0;
}
Here the value 7 is bound to n each time f() is called, i.e. during execution - and only if k > 0.
In FORTRAN, addresses are bound to variable names at compile time. The result is that, in the
compiled code, variables are addressed directly, without any indexing or other address
calculations. (In reality, the process is somewhat more complicated. The compiler assigns an
address relative to a compilation unit. When the program is linked, the address of the unit within
the program is added to this address. When the program is loaded, the address of the program is
added to the address. The important point is that, by the time execution begins, the absolute
address of the variable is known.)
FORTRAN is efficient, because absolute addressing is used. It is inflexible, because all
addresses are assigned at load time. This leads to wasted space, because all local variables
occupy space whether or not they are being used, and also prevents the use of direct or indirect
recursion.
Early and Late Binding:
Early binding - efficiency
Late binding - flexibility

4. Explain the different times at which decisions may be bound?

Or

Explain the different types of binding times?
In the context of programming languages, there are quite a few alternatives for binding time:
1. Language design time
2. Language implementation time
3. Program writing time
4. Compile time
5. Link time
6. Load time
7. Run time
1. Language design time:
In most languages, the control-flow constructs, the set of fundamental types, the available
constructors for creating complex types, and many other aspects of language semantics
are chosen when the language is designed.
Or
During the design of a programming language, the designers decide what symbols
should be used to represent operations.
e.g. Binding of operator symbols to operations:
( * + ) (multiplication, addition)

2. Language implementation time:
Most language manuals leave a variety of issues to the discretion of the language
implementor. Typical examples include the precision of the fundamental types, the
coupling of I/O to the operating system's notion of files, the organization and maximum
sizes of the stack and heap, and the handling of run-time exceptions such as arithmetic
overflow.
Or
During language implementation, the implementors decide what range of values a data
type should have.
Bind a data type, such as int in C, to a range of possible values.
Example: the C language does not specify the range of values for the type int. Implementations
on early microcomputers typically used 16 bits for int, yielding a range of values from -32768
to +32767. On early large computers, and on most computers today, C implementations typically
use 32 bits for int.
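The implementation's choice is visible directly through the standard limits header; a minimal C++ sketch (the helper name int_bits is ours, for illustration):

```cpp
#include <climits>   // INT_MIN, INT_MAX, CHAR_BIT: fixed at implementation time

// Number of bits this implementation chose for int: 16 on early
// microcomputers, 32 on most machines today. The C/C++ standards only
// guarantee that int can hold at least the range -32767..+32767.
int int_bits() { return (int)(sizeof(int) * CHAR_BIT); }
```

Printing int_bits(), INT_MIN, and INT_MAX on two different implementations can give two different answers - exactly the binding made at language implementation time.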

3. Program writing time:
Programmers, of course, choose algorithms, data structures, and names.
Or
While writing programs, programmers bind certain names to procedures, classes, etc.
Example: many names are bound to specific meanings when a person writes a program.
4. Compile time:
The time when a single compilation unit is compiled; while compiling, the type of a
variable can be identified.
Example: int c; [at compile time, int and c form an association]

5. Link time:
The time when all the compilation units comprising a single program are linked as the
final step in building the program
[The separate modules of a single program will be bound only at link time]
6. Load time:
Load time refers to the point at which the operating system loads the program into
memory so that it can run
7. Run time:
Run time is actually a very broad term that covers the entire span from the beginning to
the end of execution. If we give the value of a variable during run time, it is known as
run-time binding.
Ex. printf("Enter the value of X: ");
scanf("%f", &X);

5. What is Scope?
The textual region of the program in which a binding is active is its scope.
The scope of a name binding is the portion of the text of the source program in which that
binding is in effect - i.e. the name can be used to refer to the corresponding object. Using a name
outside the scope of a particular binding implies one of two things: either it is undefined, or it
refers to a different binding. In C++, for example, the scope of a local variable starts at the
declaration of the variable and ends at the end of the block in which the declaration appears.
Scope is a static property of a name that is determined by the semantics of the PL and the text of
the program.
Example:
class Foo
{
    private int n;
    void foo() {
        // 1
    }
    void bar() {
        int m, n;
        ...
        // 2
    }
}
A reference to m at point 1 is undefined.
A reference to n at point 1 refers to the instance variable of Foo.
A reference to n at point 2 refers to the local variable declared in bar(), which hides the instance variable.
6. Describe the difference between static and dynamic scoping (scope rules)


The scope rules (static, dynamic) of a language determine how references to names are
associated with variables
Static Scoping:
In a language with static scoping, the bindings between names and objects can be
determined at compile time by examining the text of the program, without consideration of the
flow of control at run time.
Scope rules are somewhat more complex in FORTRAN, though not much more. FORTRAN
distinguishes between global and local variables. The scope of a local variable is limited to the
subroutine in which it appears; it is not visible elsewhere. Variable declarations are optional. If a
variable is not declared, it is assumed to be local to the current subroutine and to be of type
integer if its name begins with the letters I-N, or real otherwise.
Global variables in FORTRAN may be partitioned into common blocks, which are then imported
by subroutines. Common blocks are designed for separate compilation: they allow a subroutine
to import only the sets of variables it needs. Unfortunately, FORTRAN requires each subroutine
to declare the names and types of the variables in each of the common blocks it uses, and there is
no standard mechanism to ensure that the declarations in different subroutines are the same.
Nested scopes- Many programming languages allow scopes to be nested inside each other.
Example: Java actually allows classes to be defined inside classes or even inside methods, which
permits multiple scopes to be nested.
class Outer
{
int v1; // 1
void methodO()
{
float v2; // 2
class Middle
{
char v3; // 3
void methodM()
{
boolean v4; // 4
class Inner
{
double v5; // 5
void methodI()
{
String v6; // 6
}
}
}
}
}
}
The scope of the binding for v1 is the whole program.
The scope of the binding for v2 is methodO and all of classes Middle and Inner, including their methods.
The scope of the binding for v3 is all of classes Middle and Inner, including their methods.
The scope of the binding for v4 is methodM and all of class Inner, including its method.
The scope of the binding for v5 is all of class Inner, including its method.
The scope of the binding for v6 is just methodI.
Some programming languages - including Pascal and its descendants (e.g. Ada) - allow
procedures to be nested inside procedures. (C and its descendants do not allow this.)

Declaration order
- A field or method declared in a Java class can be used anywhere in the class, even before its
declaration.
- A local variable declared in a method cannot be used before the point of its declaration
Example:

class Demo
{
public void method()
{
// Point 1
int y;
}
private int x;
}
The instance variable x can be used at Point 1, but not y

Example: Java
class Demo
{
public void method1()
{
...
method2();
...
}
public void method2()
{
...
method1();
...
}
}

Example: C/C++:
void method2();     // Incomplete (forward) declaration

void method1()
{
    ...
    method2();
    ...
}

void method2()      // Definition completes the above declaration
{
    ...
    method1();
    ...
}

Dynamic Scoping
In a language with dynamic scoping, the bindings between names and objects
depend on the flow of control at run time, and in particular on the order in which subroutines are
called.
Dynamic scope rules are generally quite simple: the current binding for a given name is the one
encountered most recently during execution, and not yet destroyed by returning from its scope.
Languages with dynamic scoping include APL, Snobol, Perl etc. Because the flow of control
cannot in general be predicted in advance, the binding between names and objects in a language
with dynamic scoping cannot in general be determined by a compiler. As a result, many semantic
rules in a language with dynamic scoping become a matter of dynamic semantics rather than
static semantics.
Ex.
procedure Big is
    X : integer;
    procedure Sub1 is
        X : integer;
    begin ... end;
    procedure Sub2 is
    begin
        ... X ...
    end;
begin ... end;
With dynamic scoping, the X inside Sub2 may refer either to the X in Big or to the X in Sub1,
depending on the calling sequence of the procedures.
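Since C++ itself is statically scoped, dynamic scoping can only be illustrated by simulating it. The sketch below (all names hypothetical, mirroring Big, Sub1, and Sub2 above) keeps an explicit stack of bindings for X; the most recently pushed, not-yet-popped binding is the one a reference sees:

```cpp
#include <vector>

// Simulated dynamic scope: a stack of active bindings for the name X.
static std::vector<int> x_stack;

int x() { return x_stack.back(); }   // the most recent binding wins

int sub2() { return x(); }           // which X? depends on the caller

int sub1() {                         // Sub1 declares its own local X = 99
    x_stack.push_back(99);
    int r = sub2();                  // sub2 now sees Sub1's X
    x_stack.pop_back();
    return r;
}

int big_via_sub1() {                 // Big declares X = 1, then calls Sub1
    x_stack.push_back(1);
    int r = sub1();
    x_stack.pop_back();
    return r;                        // 99: sub2 saw Sub1's X
}

int big_via_sub2() {                 // Big declares X = 1, calls Sub2 directly
    x_stack.push_back(1);
    int r = sub2();
    x_stack.pop_back();
    return r;                        // 1: sub2 saw Big's X
}
```

The same reference to X yields different variables on the two call paths - exactly why a compiler cannot resolve names in a dynamically scoped language.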

7. What is Local Scope?


In block-structured languages, names declared in a block are local to the block. In Algol 60 and
C++, local scopes can be as small as the programmer needs. Any statement context can be
instantiated by a block that contains local variable declarations. In other languages, such as
Pascal, local declarations can be used only in certain places. Although a few people do not like
the fine control offered by Algol 60 and C++, it seems best to provide the programmer with as
much freedom as possible and to keep scopes as small as possible.
Local scope in C++
{
    // x not accessible
    int x;
    // x accessible
    ....
    {   // x still accessible in inner block
        ....
    }   // x still accessible
}   // x not accessible

8. What is Global Scope?


The name is visible throughout the program. Global scope is useful for pervasive entities, such as
library functions and fundamental constants (e.g. pi = 3.1415926), but is best avoided for application
variables.
FORTRAN does not have global variables, although programmers simulate them by overusing
COMMON declarations. Subroutine names in FORTRAN are global.
Names declared at the beginning of the outermost block of Algol 60 and Pascal programs have
global scope. (There may be holes in these global scopes if the program contains local
declarations of the same names.)
9. Explain object lifetime?
The word "lifetime" is used in two slightly different ways
1. To refer to the lifetime of an object
2. The lifetime of the binding of a name to an object.
a. An object can exist before the binding of a particular name to it
Example (Java):
void something(Object o) {
// 2
}
....
Object p = new Object();
// 1
....
something(p);
....
// 3
(The object named by p exists at point 1, but the name o is not
bound to it until point 2)

b. An object can exist after the binding of a particular name to it has ceased
Example:
void something(Object o) {
// 2
}
....
Object p = new Object();
// 1
....
something(p);
....
// 3
(The object named by p continues to exist at point 3, even though the binding of the name
o to it ended when method something() completed)
c. A name can be bound to an object that does not yet exist
Example (Java)
Object o;
// 1
....
o = new Object(); // 2
(At point 1, the name o is bound to an object that does not come into existence until point
2)
d. A name can be bound to an object that has actually ceased to exist
Example (C++ - not possible in Java)
Object* o = new Object();
...
delete o;
// 1
...
// 2
At point 2, the name o is bound to an object that has ceased to exist.
(The technical name for this is a dangling reference).
e. It is also possible for an object to exist without having any name at all bound to it.
Example (Java or C++)
Object o = new Object();
// 1
...
o = new Object();
// 2
At point 2, the object that o was bound to at point 1 still exists, but o no longer is bound
to it. In the absence of any other name bindings to it between points 1 and 2, this object
now has no name referring to it. (The technical name for this is garbage).

10. What are the storage allocation mechanisms?

Static (permanent) allocation:- The object exists during the entire time the program is
running.
a. Global variables
- The precise way of declaring global variables differs from language to language
- In some languages (e.g. BASIC), all variables are global
- In FORTRAN any variable declared in the main program or in
a COMMON block
- In Java, any class field explicitly declared static
- In C/C++, any variable declared outside of a class, or (C++)
declared static inside a class
b. Static local variables - only available in some languages
C/C++ local variables explicitly declared static:
int foo() {
    static int i;   // allocated once, for the whole run of the program
    ...
}
A static local variable retains its value from one call of a routine to another.
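A minimal runnable sketch of this behavior (the function next_id is ours, for illustration):

```cpp
// Each call sees the value left behind by the previous call: the static
// local is allocated statically, not in the stack frame of next_id(), so
// it survives between calls.
int next_id() {
    static int count = 0;   // initialized once, before the first call
    return ++count;         // 1, 2, 3, ... on successive calls
}
```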
c. Many constants (but not constants that are actually read-only variables)
Advantages: efficiency (direct addressing), history-sensitive subprogram support (static
variables retain values between calls of subroutines).
Disadvantage: lack of flexibility (does not support recursion)

Stack Based Allocation:- Storage bindings are created for variables when their declaration
statements are elaborated
Typically, the local variables and parameters of a method have stack lifetime. This name comes
from the normal way of implementing routines, regardless of language.
Since routines obey a LIFO call return discipline, they can be managed by using a stack
composed of stack frames - one for each currently active routine.
Example:
void d() { /* 1 */ }
void c() { ... d() ... }
void b() { ... c() ... }
void a() { ... b() ... }
int main() { ... a() ... }
Stack at point 1:
+-----------------+
| Frame for d     |  <-- top of stack
+-----------------+
| Frame for c     |
+-----------------+
| Frame for b     |
+-----------------+
| Frame for a     |
+-----------------+
| Frame for main  |
+-----------------+
Even in a language without recursion, it can be advantageous to use a stack for local
variables, rather than allocating them statically. In most programs the pattern of potential
calls among subroutines does not permit all of those subroutines to be active at the same
time. As a result, the total space needed for the local variables of currently active subroutines
is seldom as large as the total space across all subroutines, active or not. A stack may
therefore require substantially less memory at run time than would be required for static
allocation.
Advantage: allows recursion; conserves storage
Disadvantages:
Overhead of allocation and deallocation
Subprograms cannot be history sensitive
Inefficient references (indirect addressing)
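Recursion is precisely what this flexibility buys. In the sketch below, each activation of factorial keeps its own n in its own stack frame; under FORTRAN-style static allocation there would be only one n, shared by every activation, and the recursion could not work:

```cpp
// Each recursive call pushes a fresh stack frame holding its own copy of n.
// At the deepest point of factorial(5), five frames coexist, with n = 5,
// 4, 3, 2, 1 respectively.
long factorial(long n) {
    if (n <= 1) return 1;          // base case: innermost frame
    return n * factorial(n - 1);   // each frame holds a different n
}
```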
Heap-based allocation: A heap is a region of storage in which subblocks can be allocated and
deallocated at arbitrary times. Heaps are required for the dynamically allocated pieces of linked
data structures, and for objects like fully general character strings, lists, and sets, whose size may
change as a result of an assignment statement or other update operation.

[Figure: a heap, with allocated and free blocks, receiving an allocation request]
The shaded blocks are in use, the clear blocks are free. Cross hatched space at the ends of in use
blocks represents internal fragmentation. The discontiguous free blocks indicate external
fragmentation.
Internal fragmentation occurs when a storage management algorithm allocates a block that is
larger than required to hold a given object, the extra space is then unused. External fragmentation
occurs when the blocks that have been assigned to active objects are scattered through the heap
in such a way that the remaining, unused space is composed of multiple blocks: there may be
quite a lot of free space, but no one piece of it may be large enough to satisfy some future
request.
As the program runs, Heap space grows as objects are created. However, to prevent growth
without limit, there must be some mechanism for recycling the storage used by objects that are
no longer alive.
A language implementation typically uses one of three approaches to "recycling" space used by
objects that are no longer alive:
Explicit - the program is responsible for releasing the space used by objects that are no
longer needed, using some construct such as delete (C++)

Reference counting: each heap object maintains a count of the number of external
pointers/references to it. When this count drops to zero, the space utilized by the object
can be reallocated
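In C++, reference counting is available off the shelf as std::shared_ptr; a small sketch (the helper function names are ours):

```cpp
#include <memory>

// std::shared_ptr keeps a per-object reference count: copying a pointer
// increments it, destroying a pointer decrements it, and the object is
// deallocated exactly when the count reaches zero.
long owners_inside_scope() {
    std::shared_ptr<int> p = std::make_shared<int>(42);
    std::shared_ptr<int> q = p;      // second reference: count goes to 2
    return p.use_count();            // 2
}

long owners_after_copy_dies() {
    std::shared_ptr<int> p = std::make_shared<int>(42);
    {
        std::shared_ptr<int> q = p;  // count is 2 inside this block
    }                                // q destroyed: count drops back to 1
    return p.use_count();            // 1
}
```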
Garbage collection: The process of deallocating the memory given to a variable is known
as garbage collection. The system can work in proper order only with proper garbage
collection. C and C++ do not do garbage collection implicitly, but Java has an implicit
garbage collector.
Advantage: provides for dynamic storage management
Disadvantage: inefficient (compared to static allocation) and unreliable

11. What are internal and external fragmentation?


A heap is a region of storage in which subblocks can be allocated and deallocated at arbitrary
times. Heaps are required for the dynamically allocated pieces of linked data structures, and for
objects like fully general character strings, lists, and sets, whose size may change as a result of an
assignment statement or other update operation.

[Figure: a heap, with allocated and free blocks, receiving an allocation request]
The shaded blocks are in use, the clear blocks are free. Cross hatched space at the ends of in use
blocks represents internal fragmentation. The discontiguous free blocks indicate external
fragmentation.
Internal fragmentation occurs when a storage management algorithm allocates a block that is
larger than required to hold a given object, the extra space is then unused. External fragmentation
occurs when the blocks that have been assigned to active objects are scattered through the heap
in such a way that the remaining, unused space is composed of multiple blocks: there may be
quite a lot of free space, but no one piece of it may be large enough to satisfy some future
request.

12.What is garbage collection?


The process of deallocating the memory given to a variable is known as garbage collection (GC).
The system can work in proper order only with proper garbage collection. C and C++ do not do
garbage collection implicitly, but Java has an implicit garbage collector.
The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by
objects that are no longer in use by the program. Garbage collection was invented by John
McCarthy around 1959 to solve problems in Lisp.
Garbage collection does not traditionally manage limited resources other than memory that
typical programs use, such as network sockets, database handles, user interaction windows, and
file and device descriptors. Methods used to manage such resources, particularly destructors,
may suffice as well to manage memory, leaving no need for GC. Some GC systems allow such
other resources to be associated with a region of memory that, when collected, causes the other
resource to be reclaimed; this is called finalization. Finalization may introduce complications
limiting its usability, such as intolerable latency between disuse and reclaim of especially limited
resources, or a lack of control over which thread performs the work of reclaiming.

13.What is Aliasing?
Two or more names that refer to the same object at the same point in the program are said to be
aliases.
1. Aliases can be created by assignment of pointers/references
Example: Java:
Robot karel = ...
Robot foo = karel;
foo and karel are aliases for the same Robot object
2. Aliases can be created by passing reference parameters
Example: C++
void something(int a [], int & b)
{
// 1
...
}
int x [100];
int y;
something(x, y);
During the call to something(x, y), at point 1, x and a are aliases for the same array, and y and b are aliases for the same int.

14.What is Overloading?
A name is said to be overloaded if, in some scope, it has two or more meanings, with the actual
meaning being determined by how it is used.
Example: C++
void something(char x)
...
void something(double x)
...
// 1
At point 1, something can refer to one or the other of the two methods, depending
on its parameter.

15.What is Polymorphism?
Polymorphism is the concept that supports the capability of an object of a class to behave
differently in response to a message or action.
1. Compile-time (static) polymorphism: the meaning of a name is determined at compile-time
from the declared types of what it uses
2. Run-time (dynamic) polymorphism: the meaning of a name is determined when the program is
running from the actual types of what it uses.
Example: Java has both types in different contexts:
a. When a name is overloaded in a single class, static polymorphism is used to determine which
declaration is meant.
b. When a name is overridden in a subclass, dynamic polymorphism is used to determine which
version to use.
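Both flavors can be sketched in C++ as well: overloading is resolved statically from declared types, while virtual dispatch is resolved dynamically from the object's actual class (all names below are illustrative):

```cpp
#include <string>

// Static polymorphism: which describe() is called is decided at compile
// time from the declared type of the argument.
std::string describe(int)    { return "int"; }
std::string describe(double) { return "double"; }

// Dynamic polymorphism: which speak() runs is decided at run time from
// the actual class of the object behind the base-class reference.
struct Animal {
    virtual ~Animal() {}
    virtual std::string speak() const { return "..."; }
};
struct Dog : Animal {
    std::string speak() const override { return "woof"; }
};

// The compiler only sees "some Animal" here; the choice of override
// happens when the program runs.
std::string speak_via_base(const Animal& a) { return a.speak(); }
```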

16. What is control flow?

The order in which operations are executed in a program.
e.g. in a C++-like language:
a = 1;
b = a + 1;
if (a > 100) b = a - 1; else b = a + 1;
a - b + c

17.Name eight major categories of control flow mechanisms?


a. Sequencing: - Statements are to be executed in a certain specified order- usually the order
in which they appear in the program text.
b. Selection: - Depending on some run-time condition, a choice is to be made among two or
more statements or expressions. The most common selection constructs are if and case
statements. Selection is also sometimes referred to as alternation.
c. Iteration: - A given fragment of code is to be executed repeatedly, either a certain number
of times, or until a certain run- time condition is true. Iteration constructs include for/do,
while, and repeat loops.
d. Procedural abstraction: - A potentially complex collection of control constructs is
encapsulated in a way that allows it to be treated as a single unit, usually subject to
parameterization.
e. Recursion: - An expression is defined in terms of itself, either directly or indirectly; the
computational model requires a stack on which to save information about partially
evaluated instances of the expression. Recursion is usually defined by means of self-referential subroutines.
f. Concurrency:- Two or more program fragments are to be executed/evaluated at the same
time, either in parallel on separate processors, or interleaved on a single processor in a
way that achieves the same effect.
g. Exception handling and speculation:- A program fragment is executed optimistically, on
the assumption that some expected condition will be true. If that condition turns out to be
false, execution branches to a handler that executes in place of the remainder of the
protected fragment or in place of the entire protected fragment. For speculation, the
language implementation must be able to undo, or roll back any visible effects of the
protected code.
h. Nondeterminacy: - The ordering or choice among statements or expressions is
deliberately left unspecified, implying that any alternative will lead to correct results.
Some languages require the choice to be random, or fair, in some formal sense of the
word.
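The first three categories can be seen together in one small C++ fragment (illustrative code, not from the text):

```cpp
// Sequencing: statements run in textual order.
// Selection:  the if chooses per element.
// Iteration:  the for loop repeats the body n times.
int sum_of_positives(const int* a, int n) {
    int s = 0;                        // sequencing: executed first
    for (int i = 0; i < n; ++i) {     // iteration
        if (a[i] > 0)                 // selection
            s += a[i];
    }
    return s;
}

int sample[] = {1, -2, 3, -4};
int demo() { return sum_of_positives(sample, 4); }   // 1 + 3 = 4
```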

18. What distinguishes operators from other sorts of functions?


An expression generally consists of either a simple object or an operator or function applied to a
collection of operands or arguments, each of which in turn is an expression. It is conventional to
use the term operator for built in functions that use special, simple syntax, and to use the term
operand for an argument of an operator. In most imperative languages, function call consists of a
function name followed by a parenthesized, comma-separated list of arguments, as in
my_func (A, B, C)
Operators are typically simpler, taking only one or two arguments, and dispensing with the
parentheses and commas:
a+b
-c
In general, a language may specify that function calls employ prefix, infix, or postfix notation.
These terms indicate, respectively, whether the function name appears before, among, or after its
several arguments:
Prefix: op a b
Infix: a op b
Postfix: a b op

19. Explain the difference between prefix, infix, and postfix notation. What is Cambridge Polish notation?
Most imperative languages use infix notation for binary operators and prefix notation for unary
operators and other functions. Lisp uses prefix notation for all functions.
Cambridge Polish notation places the function name inside the parentheses:
( * ( + 1 3) 2)
(append a b c my_list)

20.What is an L- value? An r-value?


Consider the following assignments in C:
d = a;
a = b + c;
In the first statement, the right- hand side of the assignment refers to the value of a, which we
wish to place into d. In the second statement, the left hand side refers to the location of a, where
we want to put the sum of b and c. Both interpretations-value and location-are possible because a
variable in C is a named container for a value. We sometimes say that languages like C use a
value model of variables. Because of their use on the left-hand side of assignment statements,
expressions that denote locations are referred to as L-values. Expressions that denote values are
referred to as r-values. Under a value model of variables, a given expression can be either an L-value or an r-value, depending on the context in which it appears.
Of course, not all expressions can be L-values, because not all values have a location, and not all
names are variables. In most languages it makes no sense to say 2+3 =a, or even a= 2+3, if a is
the name of a constant. By the same token, not all L-values are simple names; both L-values and
r-values can be complicated expressions. In C one may write
(f(a) + 3)->b[c] = 2;
In this expression f(a) returns a pointer to some element of an array of pointers to structures. The
assignment places the value 2 into the c-th element of field b of the structure pointed at by the
third array element after the one to which f's return value points.
In C++ it is even possible for a function to return a reference to a structure, rather than a pointer
to it, allowing one to write
g(a).b[c] = 2;
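The C++ point can be made with a small self-contained sketch (the names here are ours, for illustration): a function returning a reference yields an l-value, so the call itself can be assigned to.

```cpp
int storage[4] = {0, 0, 0, 0};

// slot() returns a reference: the call denotes a location, not just a value,
// so it can appear on the left-hand side of an assignment.
int& slot(int i) { return storage[i]; }

int demo_lvalue_call() {
    slot(2) = 99;        // assigns into storage[2]
    return storage[2];   // 99
}
```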

21.Define orthogonality in the context of programming language design?


One of the principal design goals of Algol 68 was to make the various features of the language
as orthogonal as possible. Orthogonality means that features can be used in any combination, that the
combinations all make sense, and that the meaning of a given feature is consistent, regardless of the
other features with which it is combined.
Algol 68 was one of the first languages to make orthogonality a principal design goal, and in fact
few languages since have given the goal such weight. Among other things, Algol 68 is said to be
expression oriented: it has no separate notion of statement. Arbitrary expressions can appear in
contexts that would call for a statement in a language like Pascal, and constructs that are
considered to be statements in other languages can appear within expressions. The following, for
example is valid in Algol 68:
begin
    a := if b < c then d else e;
    a := begin f(b); g(c) end;
    g(d);
    2 + 3
end

22. Expression Evaluation

An expression consists of:
  o A simple object, e.g. a number or a variable
  o An operator applied to a collection of operands or arguments, which are themselves
    expressions
Common syntactic forms for operators:
  o Function call notation, e.g. somefunc(A, B, C), where A, B, and C are expressions
  o Infix notation for binary operators, e.g. A + B
  o Prefix notation for unary operators, e.g. -A
  o Postfix notation for unary operators, e.g. i++
  o Cambridge Polish notation, e.g. (* (+ 1 3) 2) in Lisp = (1+3)*2 = 8

Expression Evaluation Ordering: Precedence and Associativity:
Precedence rules specify that certain operators, in the absence of parentheses, group more
tightly than other operators. In most languages multiplication and division group more tightly
than addition and subtraction, so 2+3*4 is 14 and not 20.
In Java all binary operators except assignments are left associative
3-3+5
x = y = f()
(assignments evaluate to the value being assigned)
In C++ arithmetic operators (+, -, *, ...) have higher precedence than relational operators (<, ...)
x+y<z+w
x + y == z
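These rules are easy to check directly in C++ (the helper functions are illustrative):

```cpp
// Precedence: * groups more tightly than +, so 2 + 3 * 4 is 14, not 20.
int precedence_demo()    { return 2 + 3 * 4; }

// Associativity: binary - is left associative, so 10 - 3 - 4 is
// (10 - 3) - 4 = 3, not 10 - (3 - 4) = 11.
int associativity_demo() { return 10 - 3 - 4; }
```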

The use of infix, prefix, and postfix notation leads to ambiguity as to what is an operand
of what, e.g. a+b*c**d**e/f in Fortran.
The choice among alternative evaluation orders depends on:
  o Operator precedence: higher operator precedence means that a (collection of)
    operator(s) groups more tightly in an expression than operators of lower
    precedence
  o Operator associativity: determines the evaluation order of operators of the same
    precedence
    - left associative: operators are evaluated left-to-right (most common)
    - right associative: operators are evaluated right-to-left (Fortran power
      operator **, C assignment operator = and unary minus)
    - non-associative: requires parentheses when composed (Ada power
      operator **)

Evaluation Order in Expressions:

Precedence and associativity define rules for structuring expressions


But do not define operand evaluation order
o Expression a-f(b)-c*d is structured as (a-f(b))-(c*d) by the compiler, but either (a-f(b)) or (c*d) can be evaluated first at run time
Knowing the operand evaluation order is important
o Side effects: e.g. if f(b) above modifies d (i.e. f(b) has a side effect) the expression
value will depend on the operand evaluation order
o Code improvement: compilers rearrange expressions to maximize efficiency
Improve memory loads:
a:=B[i]; load a from memory
c:=2*a+3*d; compute 3*d first, while waiting for a to arrive in the
processor
Common subexpression elimination:
a:=b+c;
d:=c+e+b; rearranged as d:=b+c+e, it can be rewritten into d:=a+e

Expression Reordering Problems

Rearranging expressions may lead to arithmetic overflow or different floating point
results
o Assume b, d, and c are very large positive integers, then if b-c+d is rearranged
into (b+d)-c arithmetic overflow occurs
o Floating point value of b-c+d may differ from b+d-c
o Most programming languages will not rearrange expressions when parentheses are
used, e.g. write (b-c)+d to avoid problems
Java: expression evaluation is always left to right, and integer overflow is not detected (values wrap around)
Pascal: expression evaluation order is unspecified and overflows are always detected
C and C++: expression evaluation order is unspecified and overflow detection is
implementation dependent

Short-Circuit Evaluation

Short-circuit evaluation of Boolean expressions means that computations are skipped
when the logical result of a Boolean operator can be determined from the evaluation of one
operand
C, C++, and Java use conditional and/or operators: && and ||
o If a in a&&b evaluates to false, b is not evaluated
o If a in a||b evaluates to true, b is not evaluated
o Avoids the Pascal problem
o Useful to increase program efficiency, e.g.
if (unlikely_condition && expensive_condition()) ...
Pascal does not use short-circuit evaluation
o The program fragment below has the problem that element a[11] can be accessed
resulting in a dynamic semantic error:
o var a:array [1..10] of integer;
...
i:=1;
while i<=10 and a[i]<>0 do
i:=i+1
Ada conditional and/or uses then keyword, e.g.: cond1 and then cond2

Ada, C, and C++ also have regular Boolean operators


Assignments:

Assignment a:=b
o Left-hand side a of the assignment is a location, called l-value which is an
expression that should denote a location
o Right-hand side b of the assignment is a value, called r-value which is an
expression
o Languages that adopt value model of variables copy values: Ada, Pascal, C, C++
copy the value of b into the location of a
o Languages that adopt reference model of variables copy references: Clu copies
the reference of b into a and both a and b refer to the same object

Variable initialization
o Implicit: e.g. 0 or NaN (not a number) is assigned by default
o Explicit: by the programmer (more efficient than a separate assignment statement, e.g. int
i=1; declares i and initializes it to 1 in C)
o Use of uninitialized variable is source of many problems, sometimes compilers
are able to detect this but cannot be detected in general
Combination of assignment operators
o In C/C++ a+=b is equivalent to a=a+b (but a[i++]+=b is different from
a[i++]=a[i++]+b)



o Compiler produces better code, because the address of a variable is only
calculated once
Multiway assignments in Clu, ML, and Perl
o a,b := c,d assigns c to a and d to b simultaneously, e.g. a,b := b,a swaps a with b
a,b := 1 assigns 1 to both a and b

23.What is short-circuit Boolean evaluation? Why is it useful?


Short-circuit evaluation of Boolean expressions means that computations are skipped when
logical result of a Boolean operator can be determined from the evaluation of one operand
C, C++, and Java use conditional and/or operators: && and ||
a. If a in a&&b evaluates to false, b is not evaluated
b. If a in a||b evaluates to true, b is not evaluated
c. Avoids the Pascal problem
d. Useful to increase program efficiency, e.g.
if (unlikely_condition && expensive_condition()) ...

Pascal does not use short-circuit evaluation


e. The program fragment below has the problem that element a[11] can be accessed
resulting in a dynamic semantic error:
f. var a:array [1..10] of integer;
...
i:=1;
while i<=10 and a[i]<>0 do
i:=i+1
Ada conditional and/or uses then keyword, e.g.: cond1 and then cond2
Ada, C, and C++ also have regular Boolean operators

24.Structured and Unstructured Flow


Unstructured flow: the use of goto statements and statement labels to obtain control flow
a. Useful for jumping out of nested loops and for handling errors and
exceptions
b. Java has no goto statement
Structured flow:

Sequencing: the subsequent execution of a list of statements in that order


Selection: if-then-else statements and switch or case-statements

Iteration: for and while loop statements


Subroutine calls and recursion

Sequencing
One statement appearing after another
- A list of statements in a program text is executed in top-down order
- A compound statement is a delimited list of statements
o A compound statement is a block when it includes variable declarations
o C, C++, and Java use { and } to delimit a block
o Pascal and Modula use begin ... end
o Ada uses declare ... begin ... end
C, C++, and Java: expressions can be used where statements can appear

In pure functional languages, sequencing is impossible (and not desired!)

Selection
Selects which statements to execute next

Forms of if-then-else selection statements:


o C and C++ EBNF syntax:
if (<expr>) <stmt> [else <stmt>]
The condition is an integer-valued expression. When it evaluates to 0, the else-clause
statement is executed; otherwise the then-clause statement is executed. If more
than one statement is used in a clause, grouping with { and } is required
o Java syntax is like C/C++, but condition is Boolean type
o Ada syntax allows use of multiple elsif's to define nested conditions:

if <cond> then
<statements>
elsif <cond> then
<statements>
elsif <cond> then
<statements>
...
else
<statements>
end if


Case/switch statements are different from if-then-else statements in that an expression
can be tested against multiple constants to select statement(s) in one of the arms of the
case statement:
o C, C++, and Java syntax:
switch (<expr>)
{ case <const>: <statements> break;
case <const>: <statements> break;
...
default: <statements>
}
o break is necessary to transfer control at the end of an arm to the end of the switch
statement

A switch statement can be much more efficient than nested if-then-else statements, because the compiler can implement it with a jump table

Iteration
Iteration means the act of repeating a process usually with the aim of approaching a desired goal
or target or result. Each repetition of the process is also called an "iteration," and the results of
one iteration are used as the starting point for the next iteration.
A conditional that keeps executing as long as the condition is true
e.g: while, for, loop, repeat-until, ...
Iteration and recursion are the two mechanisms that allow a computer to perform similar
operations repeatedly. Without at least one of these mechanisms, the running time of a program
would be a linear function of the size of the program text. In a very real sense, it is iteration and
recursion that make computers useful.
Enumeration-Controlled Loops
Enumeration controlled iteration originated with the do loop of Fortran I. Similar mechanisms have been
adopted in some form by almost every subsequent language, but syntax and semantics vary widely.

Fortran-IV:
DO 20 i = 1, 10, 2
...
20 CONTINUE
which is defined to be equivalent to
i=1
20 ...
i=i+2
IF (i .LE. 10) GOTO 20

Algol-60 combines enumeration with logical conditions:




o for <id> := <forlist> do <stmt>
where the EBNF syntax of <forlist> is
<forlist> -> <enumerator> [, <enumerator>]*
<enumerator> -> <expr>
| <expr> step <expr> until <expr>
| <expr> while <cond>

Difficult to understand and too many forms that behave the same:
for i := 1, 3, 5, 7, 9 do ...
for i := 1 step 2 until 10 do ...
for i := 1, i+2 while i < 10 do ...

Pascal has a simple design:


o for <id> := <expr> to <expr> do <stmt>
for <id> := <expr> downto <expr> do <stmt>
o Can iterate over any discrete type, e.g. integers, chars, elements of a set
o Index variable cannot be assigned and its terminal value is undefined

Ada for loop is much like Pascal's:


o for <id> in <expr>..<expr> loop
<statements>
end loop
for <id> in reverse <expr>..<expr> loop
<statements>
end loop
o Index variable has a local scope in loop body, cannot be assigned, and is not
accessible outside of the loop

C, C++, and Java do not have enumeration-controlled loops, although the logically-controlled for statement can be used to create an enumeration-controlled loop:
o for (i = 1; i <= n; i++) ...
Iterates i from 1 to n by testing i <= n before each iteration and updating i by 1
after each iteration
o It is the programmer's responsibility to modify, or not to modify, i and n in the loop body

C++ and Java also allow local scope for index variable, for example
for (int i = 1; i <= n; i++) ...


Problems with Enumeration-Controlled Loops:

C/C++:
o This C program never terminates:
#include <limits.h>
main()
{ int i;
for (i = 0; i <= INT_MAX; i++)
...
}
because the value of i overflows (INT_MAX is the maximum positive value int
can hold) after the iteration with i==INT_MAX and i becomes a large negative
integer
o In C/C++ it is easy to make a mistake by placing a ; at the end of a while or for
statement, e.g. the following loop never terminates:
i = 0;
while (i < 10);
{ i++; }

Fortran-77
o The C/C++ overflow problem is avoided by calculating the number of iterations
in advance
o However, for REAL typed index variables an exception is raised when overflow
occurs

Logically-Controlled Pretest Loops:

Logically-controlled pretest loops test an exit condition before each loop iteration
Not available in Fortran-77 (!)
Pascal:
o while <cond> do <stmt>
where the condition is a Boolean expression and the loop will terminate when the
condition is false. Multiple statements need to be enclosed in begin and end
C, C++:
o while (<expr>) <stmt>
where the loop will terminate when expression evaluates to 0 and multiple
statements need to be enclosed in { and }

Java is like C++, but condition is Boolean expression


Logically-Controlled Post-test Loops:

Logically-controlled post-test loops test an exit condition after each loop iteration
Not available in Fortran-77 (!)
Pascal:
o repeat <stmt> [; <stmt>]* until <cond>
where the condition is a Boolean expression and the loop will terminate when the
condition is true
C, C++:
o do <stmt> while (<expr>)
where the loop will terminate when the expression evaluates to 0 and multiple
statements need to be enclosed in { and }

Java is like C++, but condition is a Boolean expression

Logically-Controlled Mid-test Loops:

Logically-controlled mid-test loops test exit conditions within the loop
Ada:
o loop
<statements>
exit when <cond>;
<statements>
exit when <cond>;
<statements>
...
end loop
o Also allows exit of outer loops using labels:
outer: loop
...
for i in 1..n loop
...
exit outer when cond;
...
end loop;
end loop outer;
C, C++:
o Use break statement to exit loops
o Use continue to jump to beginning of loop to start next iteration

Java is like C++, but combines Ada's loop label idea to allow jumps to outer loops


Recursion
When a function may directly or indirectly call itself
Can be used instead of loops
Functional languages frequently have no loops but only recursion

Iteration and recursion are equally powerful: iteration can be expressed by recursion and
vice versa
Recursion can be less efficient, but most compilers for functional languages will optimize
recursion and are often able to replace it with iterations
Recursion can be more elegant to use to solve a problem that is recursively defined

Tail Recursive Functions

Tail recursive functions are functions in which no computations follow a recursive call in
the function
A recursive call could in principle reuse the subroutine's frame on the run-time stack and
avoid deallocation of old frame and allocation of new frame
This observation is the key idea to tail-recursion optimization in which a compiler
replaces recursive calls by jumps to the beginning of the function

For the gcd example, a good compiler will optimize the function into:
int gcd(int a, int b)
{ start:
if (a==b) return a;
else if (a>b) { a = a-b; goto start; }
else { b = b-a; goto start; }
}
which is just as efficient as the iterative implementation of gcd:
int gcd(int a, int b)
{ while (a!=b)
if (a>b) a = a-b;
else b = b-a;
return a;
}

Continuation-Passing-Style:

Even functions that are not tail-recursive can be optimized by compilers for functional
languages by using continuation-passing style:
o With each recursive call an argument is included in the call that is a reference
(continuation function) to the remaining work


The remaining work will be done by the recursively called function, not after the call, so the
function appears to be tail-recursive

Other Recursive Function Optimizations

Another function optimization that can be applied by hand is to remove the work after the
recursive call and include it in some other form as an argument to the recursive call
For example:
typedef int (*int_func)(int);
int summation(int_func f, int low, int high)
{ if (low==high) return f(low);
else return f(low)+summation(f, low+1, high);
}
can be rewritten into the tail-recursive form:
int summation(int_func f, int low, int high, int subtotal)
{ if (low==high) return subtotal+f(low);
else return summation(f, low+1, high, subtotal+f(low));
}

This example in Scheme:


(define summation (lambda (f low high)
(if (= low high)                                ; condition
(f low)                                         ; then-part
(+ (f low) (summation f (+ low 1) high)))))     ; else-part
rewritten:
(define summation (lambda (f low high subtotal)
(if (= low high)
(+ subtotal (f low))
(summation f (+ low 1) high (+ subtotal (f low))))))

Nondeterminacy:
Our final category of control flow is nondeterminacy. A nondeterministic construct is one in
which the choice between alternatives is deliberately unspecified. Some languages, notably
Algol 68 and various concurrent languages, provide more extensive nondeterministic
mechanisms, which cover statements as well.

25.Data Types
Most programming languages require the programmer to declare the data type of every data
object, and most database systems require the user to specify the type of each data field. The
available data types vary from one programming language to another, and from one database
application to another, but the following usually exist in one form or another:


integer: In more common parlance, a whole number; a number that has no fractional
part.
floating-point: A number with a decimal point. For example, 3 is an integer, but 3.5
is a floating-point number.
character (text): Readable text.

26.What purpose do types serve in a programming language?


a. Types provide implicit context for many operations, so that the programmer does
not have to specify that context explicitly. In C, for instance, the expression a+b
will use integer addition if a and b are of integer type, and floating-point
addition if a and b are of double type.
b. Types limit the set of operations that may be performed in a semantically valid
program. They prevent the programmer from adding a character and a record, for
example, or from taking the arctangent of a set, or passing a file as a parameter to
a subroutine that expects an integer.

27.Discuss about Type Systems


A type system consists of (1) a mechanism to define types and associate them with certain
language constructs, and (2) a set of rules for type equivalence, type compatibility, and type
inference. The constructs that must have types are precisely those that have values or that can
refer to objects that have values. These constructs include named constants, variables, record
fields, parameters, sometimes subroutines, and literal constants. Type equivalence rules determine
when the types of two values are the same. Type compatibility rules determine when a
value of a given type can be used in a given context. Type inference rules define the type of an
expression based on the types of its constituent parts or the surrounding context.
Type checking:
Type checking is the process of ensuring that a program obeys the language's type compatibility
rules. A violation of the rules is known as a type clash. A language is said to be strongly typed if
it prohibits, in a way that the language implementation can enforce, the application of any
operation to any object that is not intended to support that operation. A language is said to be
statically typed if it is strongly typed and type checking can be performed at compile time.
Ex: Ada is strongly typed and for the most part statically typed. A Pascal implementation can
also do most of its type checking at compile time, though the language is not quite strongly
typed: untagged variant records are its only loophole.
Dynamic type checking is a form of late binding, and tends to be found in languages that delay
other issues until run time as well. Lisp and Smalltalk are dynamically typed. Most scripting
languages are also dynamically typed; some (Python, Ruby) are strongly typed.


Classification of Types:
The terminology for types varies somewhat from one language to another. Most languages provide
built-in types similar to those supported in hardware by most processors: integers, characters,
Booleans, and real (floating point) numbers.
Booleans are typically implemented as single byte quantities with 1 representing true and 0
representing false.
Characters have traditionally been implemented as one byte quantities as well, typically using the
ASCII encoding. More recent languages use a two byte representation designed to accommodate
the Unicode character set.
Numeric Types:
A few languages (C, Fortran) distinguish between different lengths of integers and real numbers;
most do not, and leave the choice of precision to the implementation. Unfortunately, differences
in precision across language implementations lead to a lack of portability: programs that run
correctly on one system may produce run-time errors or erroneous results on another.
A few languages, including C, C++, C#, and Modula-2, provide both signed and unsigned integers.
A few languages (Fortran, C99, Common Lisp) provide a built-in complex type, usually
implemented as a pair of floating point numbers that represent the real and imaginary Cartesian
coordinates; other languages support these as a standard library class. Most scripting languages
support integers of arbitrary precision. Ada supports fixed point types, which are represented
internally by integers. Integers, Booleans, and characters are examples of discrete types.
Enumeration Types:
Enumerations were introduced by Wirth in the design of Pascal. They facilitate the creation of
readable programs, and allow the compiler to catch certain kinds of programming errors. An
enumeration type consists of a set of named elements. In Pascal one can write:
type weekday = (sun, mon, tue, wed, thu, fri, sat);
The values of an enumeration type are ordered, so comparisons are generally valid (mon < tue), and
there is usually a mechanism to determine the predecessor or successor of an enumeration
value (in Pascal, tomorrow := succ(today)).
Values of an enumeration type are typically represented by small integers, usually a consecutive
range of small integers starting at zero. In many languages these ordinal values are semantically
significant, because built in functions can be used to convert an enumeration value to its ordinal
value, and sometimes vice versa.
Several languages allow the programmer to specify the ordinal values of enumeration types, if
the default assignment is undesirable.
Subrange Types:
Like enumerations, subranges were first introduced in Pascal, and are found in many subsequent
languages. A subrange is a type whose values compose a contiguous subset of the values of some
discrete base type. In Pascal subranges look like this:
type test_score = 0..100;
workday = mon..fri;

In Ada one would write
type test_score is new integer range 0..100;
subtype workday is weekday range mon..fri;
The range portion of the definition in Ada is called a type constraint.
test_score is a derived type, incompatible with integers.
Values of the subtype workday can be more or less freely intermixed with values of weekday.
Composite Types:
Nonscalar types are usually called composite, or constructed, types. They are generally created by
applying a type constructor to one or more simpler types. Common composite types include
records, variant records, arrays, sets, pointers, lists, and files.
Records-Introduced by Cobol, and have been supported by most languages since
the 1960s.A record consists of collection of fields, each of which belongs to a
simpler type.
Variant records-Differ from normal records in that only one of a variant record's
fields is valid at any given time.
Arrays-Are the most commonly used composite types. An array can be thought of
as a function that maps members of an index type to members of a component
type.
Sets-Introduced by Pascal. A set type is the mathematical powerset of its base
type, which must often be discrete.
Pointers-A pointer value is a reference to an object of the pointer's base type.
Pointers are often but not always implemented as addresses. They are most often
used to implement recursive data types
Lists-Contain a sequence of elements, but there is no notion of mapping or
indexing. Rather, a list is defined recursively as either an empty list or a pair
consisting of a head element and a reference to a sublist. To find a given element
of a list, a program must examine all previous elements, recursively or iteratively,
starting at the head. Because of their recursive definition, lists are fundamental to
programming in most functional languages.
Files-Are intended to represent data on mass storage devices, outside the memory
in which other program objects reside.

28.Discuss about Type checking


In most statically typed languages, every definition of an object must specify the object's type.
Type equivalence:
In a language in which the user can define new types, there are two principal ways of defining
type equivalence. Structural equivalence is based on the content of type definitions. Name
equivalence is based on the lexical occurrence of type definitions. Structural equivalence is used
in Algol-68, Modula-3, and C. Name equivalence is the more popular approach in recent languages. It
is used in Java, C#, standard Pascal, and Ada.

The exact definition of structural equivalence varies from one language to another.
Structural equivalence in Pascal:
type R2 = record
a,b : integer
end;
should probably be considered the same as
type R3 = record
a : integer;
b : integer
end;
But what about
type R4 = record
b : integer;
a : integer
end;
should the reversal of the order of the fields change the type? Most languages say yes.
In a similar vein, consider the following arrays, again in a Pascal-like notation:
type str = array [1..10] of char;
type str = array [0..9] of char;
Here the length of the array is the same in both cases, but the index values are different. Should
these be considered equivalent? Most languages say no, but some (Fortran, Ada) consider them
compatible.
Type compatibility:
Most languages do not require equivalence of types in every context. Instead, they merely say
that a value's type must be compatible with that of the context in which it appears. In an
assignment statement, the type of the right-hand side must be compatible with that of the
left-hand side. The types of the operands of + must both be compatible with some common type that
supports addition. In a subroutine call, the types of any arguments passed into the subroutine
must be compatible with the types of the corresponding formal parameters, and the types of any
formal parameters passed back to the caller must be compatible with the types of the
corresponding arguments.
Coercion:
Whenever a language allows a value of one type to be used in a context that expects another, the
language implementation must perform an automatic, implicit conversion to the expected type.
This conversion is called a type coercion.


29.Discuss about Records (Structures) and Variants (Unions)


Record types allow related data of heterogeneous types to be stored and manipulated together.
Some languages (Algol 68, C, C++, Common Lisp) use the term structure instead of record.
Fortran 90 simply calls its records "types". Structures in C++ are defined as a special form of
class.
Syntax and Operations:
In C a simple record might be defined as follows.
struct element {
char name[2];
int atomic_number;
double atomic_weight;
_Bool metallic;
};
In Pascal the corresponding declarations would be
type two_chars = packed array [1..2] of char;
type element = record
name : two_chars;
atomic_number : integer;
atomic_weight : real;
metallic : Boolean
end;
Memory layout and its impact:
The fields of a record are usually stored in adjacent locations in memory. In its symbol table, the
compiler keeps track of the offset of each field within each record type. When it needs to access
a field, the compiler typically generates a load or store instruction with displacement addressing.
For a local object, the base register is the frame pointer; the displacement is the sum of the
record's offset from the register and the field's offset within the record.
Variant Records (Unions):
Programming languages of the 1960s and 1970s were designed in an era of severe memory
constraints. Many allowed the programmer to specify that certain variables should be allocated
on top of one another, sharing the same bytes in memory. C's syntax was heavily influenced by
Algol 68:
union {
int i;
double d;
_Bool b;
};


In practice, unions have been used for two main purposes. The first arises in systems programs,
where unions allow the same set of bytes to be interpreted in different ways at different times.
The canonical example occurs in memory management, where storage may sometimes be treated
as unallocated space, sometimes as bookkeeping information, and sometimes as user allocated
data of arbitrary type.
The second common purpose for unions is to represent alternative sets of fields within a record.
A record representing an employee, for example, might have several common fields (name,
address, phone, department) and various other fields such as salaried, hourly or consulting basis.

30.Briefly describe two purposes for unions/variant records


Programming languages of the 1960s and 1970s were designed in an era of severe memory
constraints. Many allowed the programmer to specify that certain variables should be allocated
on top of one another, sharing the same bytes in memory. C's syntax was heavily influenced by
Algol 68:
union {
int i;
double d;
_Bool b;
};
In practice, unions have been used for two main purposes. The first arises in systems programs,
where unions allow the same set of bytes to be interpreted in different ways at different times.
The canonical example occurs in memory management, where storage may sometimes be treated
as unallocated space, sometimes as bookkeeping information, and sometimes as user allocated
data of arbitrary type.
The second common purpose for unions is to represent alternative sets of fields within a record.
A record representing an employee, for example, might have several common fields (name,
address, phone, department) and various other fields such as salaried, hourly or consulting basis.

31.What is an Array?
Arrays are the most common and important composite data types. They have been a fundamental
part of almost every high-level language, beginning with Fortran I. Unlike records, which group
related fields of disparate types, arrays are usually homogeneous. Semantically, they can be
thought of as a mapping from an index type to a component or element type.
Syntax and operations:
Most languages refer to an element of an array by appending a subscript, delimited by
parentheses or square brackets, to the name of the array. In Fortran and Ada, one says A(3); in
Pascal and C, one says A[3]. Since parentheses are generally used to delimit the arguments to a
subroutine call, square bracket subscript notation has the advantage of distinguishing between
the two. The difference in notation makes a program easier to compile and arguably easier to
read.
Declarations:
In some languages one declares an array by appending subscript notation to the syntax that
would be used to declare a scalar. In C:

char upper[26];
In Fortran:
character, dimension (1:26) :: upper
character (26) upper ! shorthand notation
In C the lower bound of an index range is always zero; the indices of an n-element array are
0..n-1. In Fortran the lower bound of the index range is one by default.

32.What are Array Slices?


A slice or section is a rectangular portion of an array. Fortran 90 and Single Assignment C
provide extensive facilities for slicing, as do many scripting languages including Perl, Python,
Ruby, and R. A slice is simply a contiguous range of elements in a one-dimensional array.
Fortran 90 has a very rich set of array operations: built-in operations that take entire arrays as
arguments. Because Fortran uses structural type equivalence, the operands of an array operator
need only have the same element type and shape. In particular, slices of the same shape can be
intermixed in array operations, even if the arrays from which they were sliced have very different
shapes. Any of the built-in arithmetic operators will take arrays as operands; the result is an
array, of the same shape as the operands, whose elements are the result of applying the operator
to corresponding elements.
[Figure: Array slices (sections) in Fortran 90. a:b:c in a subscript indicates positions a, a+c, ...
through b. If a or b is omitted, the corresponding bound is assumed. If c is omitted, 1 is
assumed. If c is negative, positions are selected in reverse order. The slashes in the second
subscript of the lower-right example delimit an explicit list of positions.]

33.How are arrays allocated?


Storage management is more complex for arrays whose shape is not known until elaboration
time, or whose shape may change during execution. For these the compiler must arrange not only
to allocate space, but also to make shape information available at run time. Some dynamically
typed languages allow run time binding of both the number and bounds of dimensions. Compiled
languages may allow the bounds to be dynamic, but typically require the number of dimensions
to be static. A local array whose shape is known at elaboration time may still be allocated in the
stack. An array whose size may change during execution must generally be allocated in the heap.

Global lifetime, static shape: allocate space for the array in static global memory
Local lifetime, static shape: space can be allocated in the subroutine's stack frame at run
time
Local lifetime, shape bound at elaboration time: an extra level of indirection is required
to place the space for the array in the stack frame of its subroutine (Ada, C)
Arbitrary lifetime, shape bound at elaboration time: at elaboration time either space is
allocated or a preexistent reference from another array is assigned (Java, C#)
Arbitrary lifetime, dynamic shape: must generally be allocated from the heap. A pointer
to the array still resides in the fixed-size portion of the stack frame (if local lifetime).

Allocation in Ada and C99 of local arrays whose shape is bound at elaboration time:

-- Ada:
procedure foo (size : integer) is
    M : array (1..size, 1..size) of real;
    ...
begin
    ...
end foo;

/* C99: */
void foo(int size)
{
    double M[size][size];
    ...
}


[The compiler arranges for a pointer to M to reside at a static offset from the frame pointer. M
cannot be placed among the other local variables because it would prevent those higher in the
frame from having static offsets.]
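To make the "arbitrary lifetime, dynamic shape" case concrete, here is a minimal C sketch
(helper names such as make_matrix are invented for illustration): the element data live in a
single heap block, and only a fixed-size pointer need reside in the stack frame.

```c
#include <stdlib.h>

/* Hypothetical sketch: a "dynamic shape" 2-D array allocated in the heap.
   The fixed-size pointer (the analogue of the slot in the stack frame)
   refers to one contiguous block of size*size doubles. */
double *make_matrix(int size) {
    return malloc((size_t)size * (size_t)size * sizeof(double));
}

/* Row-major addressing: element (i, j) lives at offset i*size + j. */
void matrix_set(double *m, int size, int i, int j, double v) {
    m[i * size + j] = v;
}

double matrix_get(const double *m, int size, int i, int j) {
    return m[i * size + j];
}
```

Because the shape is carried as an ordinary parameter, the same code works for any size chosen
at run time, which is exactly what a compiler cannot arrange for a statically shaped stack array.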

34. Discuss the Memory Layout of Arrays


Arrays in most language implementations are stored in contiguous locations in memory. In a one-
dimensional array the second element of the array is stored immediately after the first; the third
is stored immediately after the second, and so forth. For arrays of records, it is common for each
subsequent element to be aligned at an address appropriate for any type; small holes between
consecutive records may result.
For multidimensional arrays, there are two common layouts: row-major order and column-major order.
- In row-major order, consecutive locations in memory hold elements that differ by one in the
  final subscript (except at the ends of rows).
- In column-major order, consecutive locations hold elements that differ by one in the initial
  subscript.


[In row-major order, the elements of a row are contiguous in memory; in column-major order,
the elements of a column are contiguous. The second cache line of each array is shaded, on the
assumption that each element is an eight-byte floating-point number, that cache lines are 32 bytes
long, and that the array begins at a cache line boundary. If the array is indexed from A[0,0] to
A[9,9], then in the row-major case elements A[0,4] through A[0,7] share a cache line; in the
column-major case elements A[4,0] through A[7,0] share a cache line.]
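The two layouts differ only in the formula that maps a subscript pair to an offset. A small
sketch in C (function names invented; zero-based indices assumed):

```c
/* Offsets (in elements) of A[i][j] in a rows x cols array, assuming
   zero-based indices. Row-major is what C itself uses; column-major
   is what Fortran uses. */
int offset_row_major(int i, int j, int rows, int cols) {
    (void)rows;               /* the row count is not needed in row-major order */
    return i * cols + j;
}

int offset_col_major(int i, int j, int rows, int cols) {
    (void)cols;               /* the column count is not needed in column-major order */
    return j * rows + i;
}
```

For the 10 x 10 array in the caption, A[0][4] sits at offset 4 in row-major order, while A[4][0]
sits at offset 4 in column-major order, which is why different groups of elements share a cache line.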
Row-Pointer Layout:
- Allow the rows of an array to lie anywhere in memory, and create an auxiliary array of
  pointers to the rows.
- Technically speaking, only the contiguous layout is a true multidimensional array.
- The row-pointer memory layout requires more space in most cases but has three potential
  advantages:
  1. It sometimes allows individual elements of the array to be accessed more quickly,
     especially on CISC machines with slow multiplication instructions.
  2. It allows the rows to have different lengths, without devoting space to holes at the
     ends of the rows; the lack of holes may sometimes offset the increased space for pointers.
  3. It allows a program to construct an array from preexisting rows (possibly scattered
     throughout memory) without copying.
- C, C++, and C# provide both contiguous and row-pointer organizations for multidimensional
  arrays.
- Java uses the row-pointer layout for all arrays.
Row-Pointer Layout in C:

char days[][10] = {
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
};
...
days[2][3] == 's';   /* 's' in "Tuesday" */

[This is a true two-dimensional array. The slashed boxes are NUL bytes; the shaded areas are
holes.]

char *days[] = {
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
};
...
days[2][3] == 's';   /* 's' in "Tuesday" */

[This is a ragged array of pointers to arrays of characters.]
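The space tradeoff between the two declarations can be checked directly; this sketch repeats
both with their sizes made explicit:

```c
/* Contiguous layout: 7 rows of 10 chars each; names shorter than 9
   characters leave NUL-padded holes at the ends of their rows. */
static char days_contig[7][10] = {
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
};

/* Row-pointer (ragged) layout: seven pointers, each to a string of
   exactly the needed length, with no holes. */
static const char *days_ragged[7] = {
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
};
```

The contiguous array always occupies 7 x 10 = 70 bytes, holes included; the ragged version
stores each name at its exact length but adds seven row pointers.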



35. What is a dope vector? What purposes does it serve?


A dope vector will contain the lower bound of each dimension and the size of each dimension
other than the last. If the language implementation performs dynamic semantic checks for out of
bounds subscripts in array references, then the dope vector may contain upper bounds as well.
Given lower bounds and sizes, the upper bound information is redundant, but it is usually
included anyway, to avoid computing it repeatedly at run time.
The contents of the dope vector are initialized at elaboration time, or whenever the number or
bounds of dimensions change. In a language like Fortran 90, whose notion of shape includes
dimension sizes but not lower bounds, an assignment statement may need to copy not only the
data of an array, but dope vector contents as well.
In a language that provides both a value model of variables and arrays of dynamic shape, we
must consider the possibility that a record will contain a field whose size is not statically known.
In this case the compiler may use dope vectors not only for dynamic shape arrays, but also for
dynamic shape records. The dope vector for a record typically indicates the offset of each field
from the beginning of the record.
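As an illustrative sketch (the struct layout is invented, not any particular compiler's), a dope
vector for a two-dimensional array and the run-time subscript computation it supports might look
like this in C:

```c
/* Hypothetical dope vector for a 2-D array: lower bound and size of
   each dimension. The element offset is computed from the dope vector
   at run time, so the same code works for any bounds. */
struct dope2 {
    int lo1, size1;   /* first dimension  */
    int lo2, size2;   /* second dimension */
};

/* Row-major offset (in elements) of A[i, j], given the dope vector. */
int dope_offset(const struct dope2 *d, int i, int j) {
    return (i - d->lo1) * d->size2 + (j - d->lo2);
}

/* The dynamic semantic check for out-of-bounds subscripts that the
   text mentions; upper bounds are recomputed here, which is exactly
   the repeated work that storing them in the dope vector avoids. */
int dope_in_bounds(const struct dope2 *d, int i, int j) {
    return i >= d->lo1 && i < d->lo1 + d->size1 &&
           j >= d->lo2 && j < d->lo2 + d->size2;
}
```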

36. Discuss string representations in programming languages


- In many languages, a string is simply an array of characters. In other languages, strings
  have special status, with operations that are not available for arrays of other sorts.
- It is easier to provide special features for strings than for arrays in general because
  strings are one-dimensional.
- Manipulation of variable-length strings is fundamental to a huge number of computer
  applications.
- Particularly powerful string facilities are found in various scripting languages such as
  Perl, Python, and Ruby.
- C, Pascal, and Ada require that the length of a string-valued variable be bound no later
  than elaboration time, allowing contiguous space allocation in the current stack frame.
- Lisp, Icon, ML, Java, and C# allow the length of a string-valued variable to change over its
  lifetime, requiring that space be allocated by a block or chain of blocks in the heap.
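A minimal C sketch of the second strategy, a string whose length may change over its lifetime,
backed by a heap block that is regrown on demand (the type and function names are invented):

```c
#include <stdlib.h>
#include <string.h>

/* A string whose length can change: the characters live in a heap
   block that is reallocated whenever the string grows. */
struct gstr { char *data; size_t len; };

void gstr_init(struct gstr *s) {
    s->data = malloc(1);
    s->data[0] = '\0';
    s->len = 0;
}

void gstr_append(struct gstr *s, const char *suffix) {
    size_t extra = strlen(suffix);
    s->data = realloc(s->data, s->len + extra + 1);  /* regrow the heap block */
    memcpy(s->data + s->len, suffix, extra + 1);     /* copy suffix + NUL */
    s->len += extra;
}
```

A stack-allocated Pascal- or Ada-style string could never grow this way, because its space is
fixed in the frame at elaboration time.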

37. Sets

A set is an unordered collection of an arbitrary number of distinct values of a common type.
Sets were introduced by Pascal, and are found in many more recent languages as well. Pascal
supports sets of any discrete type, and provides union, intersection, and difference operations:

var A, B, C : set of char;
    D, E : set of weekday;
...
A := B + C;   { union }
A := B * C;   { intersection }
A := B - C;   { difference }

There are many ways to implement sets, including arrays, hash tables, and various forms of
trees. The most common implementation employs a bit vector whose length (in bits) is the
number of distinct values of the base type. Operations on bit-vector sets can make use of fast
logical instructions on most machines: union is bit-wise or; intersection is bit-wise and;
difference is bit-wise not, followed by bit-wise and.
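The bit-vector representation can be sketched in C for a set of char (256 possible values; the
helper names are invented). Each set operation is one logical instruction per word:

```c
#include <limits.h>

/* One bit per possible char value, packed into unsigned words. */
#define SET_WORDS (256 / (int)(sizeof(unsigned) * CHAR_BIT))

typedef struct { unsigned w[SET_WORDS]; } charset;

void set_add(charset *s, unsigned char c) {
    s->w[c / (sizeof(unsigned) * CHAR_BIT)] |=
        1u << (c % (sizeof(unsigned) * CHAR_BIT));
}

int set_has(const charset *s, unsigned char c) {
    return (s->w[c / (sizeof(unsigned) * CHAR_BIT)] >>
            (c % (sizeof(unsigned) * CHAR_BIT))) & 1u;
}

/* A := B + C : union is bit-wise or */
void set_union(charset *d, const charset *a, const charset *b) {
    for (int i = 0; i < SET_WORDS; i++) d->w[i] = a->w[i] | b->w[i];
}

/* A := B * C : intersection is bit-wise and */
void set_inter(charset *d, const charset *a, const charset *b) {
    for (int i = 0; i < SET_WORDS; i++) d->w[i] = a->w[i] & b->w[i];
}

/* A := B - C : difference is bit-wise not followed by bit-wise and */
void set_diff(charset *d, const charset *a, const charset *b) {
    for (int i = 0; i < SET_WORDS; i++) d->w[i] = a->w[i] & ~b->w[i];
}
```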

38. Discuss the tradeoffs between pointers and the recursive types that arise naturally in a
language with a reference model of variables.
A recursive type is one whose objects may contain one or more references to other objects of the
type. Most recursive types are records, since they need to contain something in addition to the
reference, implying the existence of heterogeneous fields. Recursive types are used to build a
wide variety of linked data structures, including lists and trees.
In some languages (Pascal, Ada 83, Modula-3) pointers are restricted to point only to objects in
the heap. The only way to create a new pointer value is to call a built-in function that allocates a
new object in the heap and returns a pointer to it. In other languages (PL/I, Algol 68, C, C++,
Ada 95) one can create a pointer to a nonheap object by using an address-of operator.
Syntax and Operations:
Operations on pointers include allocation and deallocation of objects in the heap, dereferencing
of pointers to access the objects to which they point, and assignment of one pointer into another.
The behavior of these operations depends heavily on whether the language is functional or
imperative, and on whether it employs a reference or value model for variables.
Functional languages generally employ a reference model for names. Objects in a functional
language tend to be allocated automatically as needed, with a structure determined by the
language implementation. Variables in an imperative language may use either a value or a
reference model, or some combination of the two. In C, Pascal, or Ada, which employ a value
model, the assignment A := B puts the value of B into A. If we want B to refer to an object and
we want A := B to make A refer to the object to which B refers, then A and B must be pointers.
Reference Model:
In Lisp, which uses a reference model of variables but is not statically typed, the tree could be
specified textually as (#\R (#\X () ()) (#\Y (#\Z () ()) (#\W () ()))).


[Implementation of a tree in Lisp. A diagonal slash through a box indicates a null pointer. The C
and A tags serve to distinguish the two kinds of memory blocks: cons cells and blocks containing
atoms.]
In Pascal tree data types would be declared as follows:
type chr_tree_ptr = ^chr_tree;
chr_tree = record
left,right : chr_tree_ptr;
val : char
end;
In Ada:
type chr_tree;
type chr_tree_ptr is access chr_tree;
type chr_tree is record
left,right : chr_tree_ptr;
val : character;
end record;
In C:
struct chr_tree
{
struct chr_tree * left, *right;
char val;
};
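Continuing the C declaration above, here is a small sketch of how such a tree is built and
traversed (the helper functions node and tree_size are invented for illustration):

```c
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *left, *right;
    char val;
};

/* Allocate a node in the heap. In Pascal or Ada 83 this is the only
   way to obtain a new pointer value; in C we simply call malloc. */
struct chr_tree *node(char v, struct chr_tree *l, struct chr_tree *r) {
    struct chr_tree *t = malloc(sizeof *t);
    t->val = v;
    t->left = l;
    t->right = r;
    return t;
}

/* Count nodes by following the recursive structure of the type. */
int tree_size(const struct chr_tree *t) {
    return t == NULL ? 0 : 1 + tree_size(t->left) + tree_size(t->right);
}
```

The test below builds the same R/X/Y/Z/W tree that the Lisp expression earlier in this answer
describes.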

39. What are dangling references? How are they created?


When a heap-allocated object is no longer live, a long-running program needs to reclaim the
object's space. Stack objects are reclaimed automatically as part of the subroutine calling
sequence. There are two alternatives for reclaiming heap objects: explicit reclamation by the
programmer, or automatic garbage collection. Languages like Pascal, C, and C++ require the
programmer to reclaim an object explicitly.
In Pascal:
dispose(my_ptr);
In C:
free(my_ptr);
In C++:
delete my_ptr;
A dangling reference is a live pointer that no longer points to a valid object.
Dangling reference to a stack variable in C++:

int i = 3;
int *p = &i;
...
void foo()
{
    int n = 5;
    p = &n;
}
...
cout << *p;   // prints 3
foo();
...
cout << *p;   // undefined behavior: n is no longer live
In a language with explicit reclamation of heap objects, a dangling reference is created whenever
the programmer reclaims an object to which pointers still refer:

int *p = new int;
*p = 3;
...
cout << *p;   // prints 3
delete p;
...
cout << *p;   // undefined behavior: *p has been reclaimed

40. Garbage Collection
Explicit reclamation of heap objects is a serious burden on the programmer and a major source of
bugs. The code required to keep track of object lifetimes makes programs more difficult to
design, implement and maintain.
Automatic garbage collection has become popular for imperative languages as well. Automatic
collection is difficult to implement, but the difficulty pales in comparison to the convenience
enjoyed by programmers once the implementation exists. Automatic collection also tends to be
slower than manual reclamation.
The simplest garbage collection technique places a counter in each object that keeps track
of the number of pointers that refer to the object. When the object is created, this reference count
is set to 1, to represent the pointer returned by the new operation. When one pointer is assigned
into another, the run-time system decrements the reference count of the object formerly referred
to by the assignment's left-hand side, and increments the count of the object referred to by the
right-hand side. On subroutine return, the calling sequence epilogue must decrement the
reference count of any object referred to by a local pointer that is about to be destroyed. When a
reference count reaches zero, its object can be reclaimed. Recursively, the run-time system must
decrement counts for any objects referred to by pointers within the object being reclaimed, and
reclaim those objects if their counts reach zero. To prevent the collector from following garbage
addresses, each pointer must be initialized to null at elaboration time.
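The bookkeeping described above can be sketched in C (names invented; in a real implementation
the compiler and run-time system generate these operations implicitly at every pointer
assignment and scope exit):

```c
#include <stdlib.h>

struct obj { int refcount; int payload; };

int live_objects = 0;   /* for demonstration only */

/* "new": the returned pointer accounts for the initial count of 1. */
struct obj *rc_new(int payload) {
    struct obj *o = malloc(sizeof *o);
    o->refcount = 1;
    o->payload = payload;
    live_objects++;
    return o;
}

/* Drop one reference; reclaim the object when its count reaches zero. */
void rc_release(struct obj *o) {
    if (o != NULL && --o->refcount == 0) {
        live_objects--;
        free(o);
    }
}

/* Perform "*dst := src": increment the count of the object referred to
   by the right-hand side, decrement the count of the object formerly
   referred to by the left-hand side. */
void rc_assign(struct obj **dst, struct obj *src) {
    if (src != NULL) src->refcount++;
    rc_release(*dst);
    *dst = src;
}
```

Note that this scheme cannot reclaim circular structures, which is exactly the weakness the
figure below illustrates.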


[ Reference counts and circular lists]

41. Summarize the differences among mark-and-sweep, stop-and-copy, pointer reversal, and
generational garbage collection.
1) Mark-and-Sweep: the classic mechanism for identifying useless blocks. It proceeds in three
main steps, executed by the garbage collector when the amount of free space remaining in the
heap falls below some minimum threshold:
a) The collector walks through the heap, tentatively marking every block as useless.
b) Beginning with all pointers outside the heap, the collector recursively explores all linked
data structures in the program, marking each newly discovered block as useful.
c) The collector again walks through the heap, moving every block that is still marked useless
to the free list.
2) Pointer Reversal: when the collector explores the path to a given block, it reverses the
pointers it follows, so that each points back to the previous block instead of forward to the next.
As it explores, the collector keeps track of the current block and the block from which it came.
3) Stop-and-Copy: in a language with variable-size heap blocks, the garbage collector can
reduce external fragmentation by performing storage compaction. Many garbage collectors
employ a technique known as stop-and-copy that achieves compaction. Specifically, they divide
the heap into two regions of equal size. All allocation happens in the first half. When this half is
full, the collector begins its exploration of reachable data structures. Each reachable block is
copied into the second half of the heap; the old version of the block is overwritten with a flag
and a pointer to the block's new location. Any other pointer that refers to the same block is then
set to point to the new location. When the collector finishes its exploration, all useful objects
have been moved into the second half of the heap, and nothing in the first half is needed
anymore. The collector can therefore swap its notion of first and second halves, and the program
can continue.
4) Generational Collection: the heap is divided into multiple regions. When space runs low, the
collector first examines the youngest region, which it assumes is likely to have the highest
proportion of garbage. Only if it is unable to reclaim sufficient space in this region does the
collector examine the next older region. To avoid leaking storage in long-running systems, the
collector must be prepared, if necessary, to examine the entire heap. In most cases, however, the
overhead of collection will be proportional to the size of the youngest region only.
Any object that survives some small number of collections in its current region is promoted to
the next older region, in a manner reminiscent of stop-and-copy. Promotion requires, of course,
that pointers from old objects to new objects be updated to reflect the new locations. While such
old-space-to-new-space pointers tend to be rare, a generational collector must be able to find
them all quickly. At each pointer assignment, the compiler generates code to check whether the
new value is an old-to-new pointer; if so, it adds the pointer to a hidden list accessible to the
collector.
5) Conservative Collection: when space runs low, the collector tentatively marks all blocks in
the heap as useless. It then scans all word-aligned quantities in the stack and in global storage. If
any of these words appears to contain the address of something in the heap, the collector marks
the block that contains that address as useful. Recursively, the collector then scans all word-aligned
quantities in the block, and marks as useful any other blocks whose addresses are found
therein. Finally, the collector reclaims any blocks that are still marked useless.
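As a toy illustration of the three mark-and-sweep steps, here is a sketch over a fixed pool of
blocks in which each block holds at most one pointer (a real collector handles arbitrary object
graphs; only the structure of the algorithm is shown):

```c
#define NBLOCKS 8
struct block { int in_use; int marked; int next; };   /* next: index, or -1 for none */
struct block heap[NBLOCKS];

/* Step (b) for one root: follow the chain of pointers, marking each
   newly discovered block as useful. */
void mark_from(int i) {
    while (i != -1 && !heap[i].marked) {
        heap[i].marked = 1;
        i = heap[i].next;
    }
}

/* Returns the number of blocks reclaimed. */
int collect(const int *roots, int nroots) {
    int reclaimed = 0;
    for (int i = 0; i < NBLOCKS; i++)                 /* step (a): all useless */
        heap[i].marked = 0;
    for (int r = 0; r < nroots; r++)                  /* step (b): mark from roots */
        mark_from(roots[r]);
    for (int i = 0; i < NBLOCKS; i++)                 /* step (c): sweep */
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            reclaimed++;
        }
    return reclaimed;
}
```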

42. Why are lists so heavily used in functional programming languages?

A list is defined recursively as either the empty list or a pair consisting of an object and another
list. Lists are ideally suited to programming in functional and logic languages, which do most of
their work via recursion and higher-order functions. In Lisp, in fact, a program is a list, and can
extend itself at run time by constructing a list and executing it.

Lists in ML and Lisp:

- Lists in ML are homogeneous: every element of the list must have the same type. Lisp lists,
  by contrast, are heterogeneous: any object may be placed in a list, so long as it is never
  used in an inconsistent fashion.
- An ML list is usually a chain of blocks, each of which contains an element and a pointer to
  the next block. A Lisp list is a chain of cons cells, each of which contains two pointers,
  one to the element and one to the next cons cell.
- An ML list is enclosed in square brackets, with elements separated by commas: [a, b, c, d].
  A Lisp list is enclosed in parentheses, with elements separated by white space: (a b c d).
- In both cases, the notation represents a proper list: one whose innermost pair consists of
  the final element and the empty list. In Lisp it is also possible to construct an improper
  list, whose final pair contains two elements.
The most fundamental operations on lists are those that construct them from their components or
extract their components from them.
In Lisp:

(cons 'a '(b))         => (a b)
(car '(a b))           => a
(car nil)              => ??
(cdr '(a b c))         => (b c)
(cdr '(a))             => nil
(cdr nil)              => ??
(append '(a b) '(c d)) => (a b c d)

Here we have used => to mean "evaluates to." The car and cdr of the empty list (nil) are
defined to be nil in Common Lisp.
In ML the equivalent operations are written as follows:

a :: [b]        => [a, b]
hd [a, b]       => a
hd []           => run-time exception
tl [a, b, c]    => [b, c]
tl [a]          => nil
tl []           => run-time exception
[a, b] @ [c, d] => [a, b, c, d]
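The cons-cell representation itself can be sketched in C (the names mirror the Lisp operations;
NULL plays the role of nil, and elements are simplified to single chars):

```c
#include <stdlib.h>

/* A cons cell: its car is the element, its cdr the rest of the list. */
struct cons { char car; struct cons *cdr; };

struct cons *cons(char car, struct cons *cdr) {
    struct cons *c = malloc(sizeof *c);
    c->car = car;
    c->cdr = cdr;
    return c;
}

int length(const struct cons *l) {
    return l == NULL ? 0 : 1 + length(l->cdr);
}

/* Like Lisp's append: copies the cells of the first list, and shares
   the second list's cells with the result. */
struct cons *append(const struct cons *a, struct cons *b) {
    return a == NULL ? b : cons(a->car, append(a->cdr, b));
}
```

The sharing behavior of append is worth noting: the result's tail is the second list itself, which
is safe in Lisp only because programs rarely modify existing cells.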

43. Discuss Files and Input/Output


Input/output facilities allow a program to communicate with the outside world. Interactive I/O
generally implies communication with human users or physical devices, which work in parallel
with the running program, and whose input to the program may depend on earlier output from
the program. Files generally refer to off-line storage implemented by the operating system. Files
may be further categorized into those that are temporary and those that are persistent. Temporary
files exist for the duration of a single program run; their purpose is to store information that is
too large to fit in the memory available to the program. Persistent files allow a program to read
data that existed before the program began running, and to write data that will continue to exist
after the program has ended. Some languages provide built in file data types and special syntactic
constructs for I/O. Others relegate I/O entirely to library packages, which export a file type and a
variety of input and output subroutines. The principal advantage of language integration is the
ability to employ non-subroutine call syntax, and to perform operations that may not otherwise
be available to library routines.


Unit-2

44. Discuss Static and Dynamic Links


In a language with nested subroutines and static scoping, objects that lie in surrounding
subroutines, and that are thus neither local nor global, can be found by maintaining a static chain.
Each stack frame contains a reference to the frame of the lexically surrounding subroutine. This
reference is called the static link. By analogy, the saved value of the frame pointer, which will be
restored on subroutine return, is called the dynamic link. The static and dynamic links may or
may not be the same, depending on whether the current routine was called by its lexically
surrounding routine, or by some other routine nested in that surrounding routine.
Whether or not a subroutine is called directly by the lexically surrounding routine, we can be
sure that the surrounding routine is active; there is no other way that the current routine could
have been visible, allowing it to be called.

If subroutine D is called directly from B, then clearly B's frame will already be on the stack. D is
nested inside B, so it is only when control enters B that D comes into view. D can therefore be
called by C, or by any other routine that is nested inside C or D, but only because these are also
within B.


45. Calling Sequences
Maintenance of the subroutine call stack is the responsibility of the calling sequence. Sometimes
the term calling sequence is used to refer to the combined operations of the caller, the prologue,
and the epilogue.
Tasks that must be accomplished on the way into a subroutine include passing parameters,
saving the return address, changing the program counter, changing the stack pointer to allocate
space, saving registers that contain important values and that may be overwritten by the callee,
changing the frame pointer to refer to the new frame, and executing initialization code for any
objects in the new frame that require it. Tasks that must be accomplished on the way out include
passing return parameters or function values, executing finalization code for any local objects
that require it, deallocating the stack frame, restoring other saved registers, and restoring the
program counter. Some of these tasks must be performed by the caller, because they differ from
call to call. Most of the tasks, however, can be performed either by the caller or the callee.
A Typical Calling Sequence:

The stack pointer (sp) points to the first unused location on the stack. The frame pointer (fp)
points to a location near the bottom of the frame. Space for all arguments is reserved in the stack,
even if the compiler passes some of them in registers.
To maintain this stack layout, the calling sequence might operate as follows.


The caller
1. Saves any caller-saves registers whose values will be needed after the call
2. Computes the values of arguments and moves them into the stack or
registers
3. Computes the static link, and passes it as an extra, hidden argument
4. Uses a special subroutine call instruction to jump to the subroutine,
simultaneously passing the return address on the stack or in a register
In its prologue, the callee
1. Allocates a frame by subtracting an appropriate constant from the sp
2. Saves the old frame pointer into the stack, and assigns it an appropriate new value
3. Saves any callee-saves registers that may be overwritten by the current routine
After the subroutine has completed, the epilogue
1. Moves the return value into a register or a reserved location in the stack
2. Restores callee-saves registers if needed
3. Restores the fp and the sp
4. Jumps back to the return address

Finally, the caller
1. Moves the return value to wherever it is needed
2. Restores caller-saves registers if needed
Displays: One disadvantage of static chains is that access to an object in a scope k levels out
requires that the static chain be dereferenced k times. If a local object can be loaded into a
register with a single memory access, an object k levels out will require k+1 memory accesses.
This number can be reduced to a constant by use of a display.
Register Windows: As an alternative to saving and restoring registers on subroutine calls and
returns, the original Berkeley RISC machines introduced a hardware mechanism known as
register windows. The basic idea is to map the ISA's limited set of register names onto some
subset of a much larger collection of physical registers, and to change the mapping when making
subroutine calls. The old and new mappings overlap a bit, allowing arguments to be passed in the
intersection.
In-Line Expansion: As an alternative to stack-based calling conventions, many language
implementations allow certain subroutines to be expanded in line at the point of call. A copy of
the called routine becomes a part of the caller; no actual subroutine call occurs. In-line
expansion avoids a variety of overheads, including space allocation, branch delays from the call
and return, maintaining the static chain or display, and saving and restoring registers. It also
allows the compiler to perform code improvements such as global register allocation, instruction
scheduling, and common subexpression elimination across the boundaries between subroutines,
something that most compilers can't do otherwise.
In many implementations, the compiler chooses which subroutines to expand in line and which
to compile conventionally. In some languages, the programmer can suggest that particular
routines be in-lined. In C++ and C99, the keyword inline can be prefixed to a function
declaration:
inline int max(int a, int b) { return a > b ? a : b; }
In Ada, the programmer can request in line expansion with a significant comment, or pragma:
function max (a, b : integer) return integer is
begin
If a>b then return a; else return b; end if;
end max;
pragma inline(max);

46. Parameter Passing
In computer programming, a parameter is a special kind of variable, used in a subroutine to
refer to one of the pieces of data provided as input to the subroutine. These pieces of data are
called arguments. An ordered list of parameters is usually included in the definition of a
subroutine, so that, each time the subroutine is called, its arguments for that call can be assigned
to the corresponding parameters. Parameter names that appear in the declaration of a subroutine
are known as formal parameters. Variables and expressions that are passed to a subroutine in a
particular call are known as actual parameters. Most languages use a prefix notation for calls to
user-defined subroutines, with the subroutine name followed by a parenthesized argument list.
Lisp places the function name inside the parentheses, as in (max a b).
The following program in C defines a function that is named "sales_tax" and has one parameter
named "price". The type of price is "double" (i.e. a double-precision floating point number). The
function's return type is also a double.
double sales_tax(double price)
{
return 0.05 * price;
}
After the function has been defined, it can be invoked as follows:
sales_tax(10.00);


In this example, the function has been invoked with the number 10.00. When this happens, 10.00
will be assigned to price, and the function begins calculating its result. The steps for producing
the result are specified inside the braces: "0.05 * price" indicates that the first thing to do is
multiply 0.05 by the value of price, which gives 0.50; "return" means the function will produce
the result of "0.05 * price". Therefore, the final result is 0.50.
Parameter Modes: Some languages, including C, Fortran, ML, and Lisp, define a single set of
rules that apply to all parameters. Other languages, including Pascal, Modula, and Ada, provide
two or more sets of rules, corresponding to different parameter-passing modes. As in many
aspects of language design, the semantic details are heavily influenced by implementation issues.
Suppose for the moment that X is a global variable in a language with a value model of
variables, and that we wish to pass X as a parameter to subroutine P:
P(X);
From an implementation point of view, we have two principal alternatives: we may provide P
with a copy of X's value, or we may provide it with X's address. The two most common
parameter-passing modes, called call-by-value and call-by-reference, are designed to reflect
these implementations.
[call-by-value: a parameter acts within the subroutine as a local (isolated) copy of the argument.
call-by-reference: the argument supplied by the caller can be affected by actions within the
called subroutine]
With value parameters, each actual parameter is assigned into the corresponding formal
parameter when a subroutine is called; from then on, the two are independent. With reference
parameters, each formal parameter introduces, within the body of subroutine, a new name for the
corresponding actual parameter. If the actual parameter is also visible within the subroutine
under its original name, then the two names are aliases for the same object, and changes made
through one will be visible through the other.
Variations on Value and Reference Parameters: If the purpose of call-by-reference is to allow
the called routine to modify the actual parameter, we can achieve a similar effect using
call-by-value/result. Like call-by-value, call-by-value/result copies the actual parameter into the
formal parameter at the beginning of subroutine execution. Unlike call-by-value, it also copies
the formal parameter back into the actual parameter when the subroutine returns.
x : integer            -- global
procedure foo (y : integer)
    y := 3
    print x
...
x := 2
foo(x)
print x


Here value/result would copy x into y at the beginning of foo, and y into x at the end of foo.
Because foo accesses x directly in between, the program's visible behavior would be different
than it was with call-by-reference: the assignment of 3 into y would not affect x until after the
inner print statement, so the program would print 2 and then 3.
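This behavior can be simulated in C, writing the copy-in/copy-out of call-by-value/result
explicitly at the call site (helper names are invented; the trace array stands in for the print
statements):

```c
int x;                            /* the global from the example */

int trace[4]; int ntrace = 0;     /* records what "print x" would print */
void print_x(void) { trace[ntrace++] = x; }

/* Call-by-reference: the assignment to y is immediately visible
   through x, so the inner print sees 3. */
void foo_by_reference(int *y) { *y = 3; print_x(); }

/* Call-by-value/result: the body works on a copy of the actual... */
void foo_body(int *y) { *y = 3; print_x(); }

void call_foo_value_result(void) {
    int y = x;      /* copy actual into formal on entry */
    foo_body(&y);
    x = y;          /* copy formal back into actual on return */
}
```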
In Pascal, parameters are passed by value by default; they are passed by reference if preceded by
the keyword var in their subroutine header's formal parameter list. Parameters in C are always
passed by value, though the effect for arrays is unusual, because of the interoperability of arrays
and pointers in C.
Call-by-Sharing: Call-by-value and call-by-reference make the most sense in a language with
a value model of variables: they determine whether we copy the variable or pass an alias for it.
Neither option really makes sense in a language like Smalltalk, Lisp, ML, or Clu, in which a
variable is already a reference. Here it is most natural simply to pass the reference itself, and let
the actual and formal parameters refer to the same object. Clu calls this mode call-by-sharing. It
is different from call-by-value because, although we do copy the actual parameter into the formal
parameter, both of them are references; if we modify the object to which the formal parameter
refers, the program will be able to see those changes through the actual parameter after the
subroutine returns. Call-by-sharing is also different from call-by-reference, because although the
called routine can change the value of the object to which the actual parameter refers, it cannot
change the identity of that object.
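Call-by-sharing can be approximated in C by passing a pointer: the callee can mutate the shared
object, but assigning to the formal parameter cannot change which object the caller's variable
refers to (a sketch with invented names):

```c
struct box { int value; };

/* The callee can modify the shared object... */
void mutate(struct box *p) { p->value = 42; }

/* ...but assigning to the formal parameter only changes the callee's
   copy of the reference; the caller's variable still refers to the
   original object. */
void try_rebind(struct box *p) {
    static struct box other = { 99 };   /* hypothetical other object */
    p = &other;
    (void)p;    /* the rebinding is invisible to the caller */
}
```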
The Purpose of Call-by-Reference: In a language that provides both value and reference
parameters, there are two principal reasons why the programmer might choose one over the
other. First, if the called routine is supposed to change the value of an actual parameter, then the
programmer must pass the parameter by reference. Conversely, to ensure that the called routine
cannot modify the argument, the programmer can pass the parameter by value. Second, the
implementation of value parameters requires copying actuals to formals, a potentially
time-consuming operation when arguments are large. Reference parameters can be implemented
simply by passing an address.
Read-Only Parameters:- To combine the efficiency of reference parameters and the safety of
value parameters, Modula-3 provides a READONLY parameter mode. Any formal parameter
whose declaration is preceded by READONLY cannot be changed by the called routine: the
compiler prevents the programmer from using that formal parameter on the left-hand side of any
assignment statement, reading it from a file, or passing it by reference to any other subroutine.
Small READONLY parameters are generally implemented by passing a value; larger
READONLY parameters are implemented by passing an address.
The equivalent of READONLY parameters is also available in C, which allows any variable or
parameter declaration to be preceded by the keyword const.
void append_to_log(const huge_record *r) { ...
...
append_to_log(&my_record);
Reference parameters in C++:
void swap(int &a, int &b) { int t = a; a = b; b = t; }
Parameters and Arguments:- These two terms are sometimes loosely used interchangeably; in
particular, "argument" is sometimes used in place of "parameter". Nevertheless, there is a
difference. Properly, parameters appear in procedure definitions; arguments appear in procedure
calls.
Although parameters are also commonly referred to as arguments, arguments are more properly
thought of as the actual values or references assigned to the parameter variables when the
subroutine is called at run-time. When discussing code that is calling into a subroutine, any
values or references passed into the subroutine are the arguments, and the place in the code
where these values or references are given is the parameter list. When discussing the code inside
the subroutine definition, the variables in the subroutine's parameter list are the parameters, while
the values of the parameters at runtime are the arguments. For example in C, when dealing with
threads it's common to pass in an argument of type void* and cast it to an expected type:

void ThreadFunction(void* pThreadArgument)
{
    // Naming the first parameter 'pThreadArgument' is correct, rather than
    // 'pThreadParameter'. At run time the value we use is an argument. As
    // mentioned above, reserve the term parameter for subroutine definitions.
}
Many programmers use parameter and argument interchangeably, depending on context to
distinguish the meaning. The term formal parameter refers to the variable as found in the
function definition (parameter), while actual parameter refers to the actual value passed
(argument).
To better understand the difference, consider the following function written in C:
int sum(int addend1, int addend2)
{
return addend1 + addend2;
}
The function sum has two parameters, named addend1 and addend2. It adds the values passed
into the parameters, and returns the result to the subroutine's caller (using a technique
automatically supplied by the C compiler).
The code which calls the sum function might look like this:
int sumValue;
int value1 = 40;
int value2 = 2;
sumValue = sum(value1, value2);
The variables value1 and value2 are initialized with values. value1 and value2 are both
arguments to the sum function in this context.
At runtime, the values assigned to these variables are passed to the function sum as arguments. In
the sum function, the parameters addend1 and addend2 are initialized with the argument values
40 and 2, respectively. The values are added, and the result is returned to the caller, where it is
assigned to the variable sumValue.

47. Discuss Generic Subroutines and Modules
Subroutines provide a natural way to perform an operation for a variety of different object
values. In larger programs, the need also often arises to perform an operation for a variety of
different object types. An operating system, for example, tends to make heavy use of queues, to
hold processes, memory descriptors, file buffers, device control blocks, and a host of other
objects. The characteristics of the queue data structure are independent of the characteristics of
the items placed in the queue. Unfortunately, the standard mechanisms for declaring enqueue and
dequeue subroutines in most languages require that the type of the items be declared statically.
Generic modules or classes are particularly valuable for creating containers: data abstractions
that hold a collection of objects, but whose operations are generally oblivious to the type of those
objects. Examples of containers include stacks, queues, heaps, and sets.
Generic subroutines are needed in generic modules, and may also be useful in their own right.
Generics can be implemented several ways. In most implementations of Ada and C++ they are a
purely static mechanism: all the work required to create and use multiple instances of the generic
code takes place at compile time. In the usual case, the compiler creates a separate copy of the
code for every instance. If several queues are instantiated with the same set of arguments, then
the compiler may share the code of the enqueue and dequeue routines among them. A clever
compiler may arrange to share the code for a queue of integers with the code for a queue of
single-precision floating point numbers, if the two types have the same size, but this sort of
optimization is not required, and the programmer should not be surprised if it doesn't occur.
Java 5, by contrast, guarantees that all instances of a given generic will share the same code at run
time. In effect, if T is a generic type parameter in Java, then objects of class T are treated as
instances of the standard base class Object, except that the programmer does not have to insert
explicit casts to use them as objects of class T, and the compiler guarantees, statically, that the
elided casts will never fail.
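This shared-code guarantee rests on type erasure: at run time, every instantiation of a generic is represented by the same class. A small check using the standard ArrayList container makes the point:

```java
import java.util.ArrayList;

public class ErasureDemo {
    public static void main(String[] args) {
        ArrayList<Integer> ints = new ArrayList<>();
        ArrayList<String> strs = new ArrayList<>();
        // The type arguments are erased during compilation, so both
        // instantiations share a single runtime class (and its code).
        System.out.println(ints.getClass() == strs.getClass());  // prints "true"
    }
}
```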
Generic Parameter Constraints:- Because a generic is an abstraction, it is important that its
interface provide all the information that must be known by a user of the abstraction. Several
languages, including Clu, Ada, Java, and C#, attempt to enforce this rule by constraining generic
parameters. Specifically, they require that the operations permitted on a generic parameter type
be explicitly declared. In Ada, for example, one might write:
type item is private;
A private type in Ada is one for which the only permissible operations are assignment, testing for
equality and inequality, and accessing a few standard attributes. To prohibit testing for equality
and inequality, the programmer can declare the parameter to be limited private. To allow
additional operations, the programmer must provide additional information. In simple cases, it
may be possible to specify a type pattern such as
type item is (<>);
Here the parentheses indicate that item is a discrete type, and will thus support such operations
as comparison for ordering (<, >, etc.) and the attributes first and last.
In more complex cases, the Ada programmer can specify the operations of a generic type
parameter by means of a trailing with clause:
generic
    type T is private;
    type T_array is array (integer range <>) of T;
    with function "<" (a1, a2 : T) return boolean;
procedure sort (A : in out T_array);
Without the with clause, procedure sort would be unable to compare elements of A for
ordering, because type T is private.
Generic sorting routine in Java:
public static <T extends Comparable<T>> void sort(T A[]) {
    ...
    if (A[i].compareTo(A[j]) >= 0) ...
    ...
}
...
Integer[] myArray = new Integer[50];
...
sort(myArray);
Generic sorting routine in C#:
static void sort<T>(T[] A) where T : IComparable {
    ...
    if (A[i].CompareTo(A[j]) >= 0) ...
    ...
}
...
int[] myArray = new int[50];
...
sort(myArray);

Implicit Instantiation:- Because a class is a type, one must generally create an instance of a
generic class before the generic can be used.
Generic class instance in C++:
queue<int, 50> *my_queue = new queue<int, 50>();
Generic subroutine instance in Ada:
procedure int_sort is new sort(integer, int_array, "<");
...
int_sort(my_array);

Other languages do not require explicit instantiation; instead, they treat generic subroutines as a
form of overloading. Suppose we have the following C++ declarations:
int ints[10];
double reals[50];
string strings[30];
We can perform the following calls without instantiating anything explicitly:
sort(ints, 10);
sort(reals, 50);
sort(strings, 30);
In each case, the compiler will implicitly instantiate an appropriate version of the sort routine.
Java and C# have similar conventions. To keep the language manageable, the rules for implicit
instantiation in C++ are more restrictive than the rules for resolving overloaded subroutines in
general.
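In Java, the same effect follows from type inference on generic methods. The hypothetical max method below is never instantiated explicitly; the compiler infers the type argument at each call site:

```java
public class InferDemo {
    // Hypothetical generic method; T is inferred at each call site.
    public static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }
    public static void main(String[] args) {
        System.out.println(max(3, 7));          // T inferred as Integer; prints 7
        System.out.println(max("ant", "bee"));  // T inferred as String; prints bee
    }
}
```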

48. Exception Handling
An exception can be defined as an unexpected (or at least unusual) condition that arises during
program execution and that cannot easily be handled in the local context. It may be detected
automatically by the language implementation, or the program may raise it explicitly. The most
common exceptions are various sorts of run-time errors.
Exception handling in C++:
To catch exceptions we must place a portion of code under exception inspection. This is done by
enclosing that portion of code in a try block. When an exceptional circumstance arises within that
block, an exception is thrown that transfers the control to the exception handler. If no exception
is thrown, the code continues normally and all handlers are ignored.
An exception is thrown by using the throw keyword from inside the try block. Exception
handlers are declared with the keyword catch, which must be placed immediately after the try
block:
// exceptions
#include <iostream>
using namespace std;
int main () {
try
{
throw 20;
}
catch (int e)
{
cout << "An exception occurred. Exception Nr. " << e << endl;
}
return 0;
}
The code under exception handling is enclosed in a try block. In this example this code simply
throws an exception:
throw 20;

A throw expression accepts one parameter (in this case the integer value 20), which is passed as
an argument to the exception handler.
The exception handler is declared with the catch keyword. As you can see, it immediately
follows the closing brace of the try block. The catch format is similar to a regular function
that always has at least one parameter. The type of this parameter is very important, since the
type of the argument passed by the throw expression is checked against it, and only in the case
they match, the exception is caught.
We can chain multiple handlers (catch expressions), each one with a different parameter type.
Only the handler that matches its type with the argument specified in the throw statement is
executed.
If we use an ellipsis (...) as the parameter of catch, that handler will catch any exception no
matter what the type of the thrown exception is. This can be used as a default handler that catches
all exceptions not caught by other handlers, if it is specified last:
try {
// code here
}
catch (int param) { cout << "int exception"; }
catch (char param) { cout << "char exception"; }
catch (...) { cout << "default exception"; }
In this case the last handler would catch any exception thrown with any parameter that is neither
an int nor a char.
After an exception has been handled, program execution resumes after the try-catch block, not
after the throw statement.
It is also possible to nest try-catch blocks within more external try blocks. In these cases, we
have the possibility that an internal catch block forwards the exception to its external level. This
is done with the expression throw; with no arguments. For example:
try {
try {
// code here
}
catch (int n) {
throw;
}
}
catch (...) {
cout << "Exception occurred";
}
Exception handling in Java:
To understand how exception handling works in Java, you need to understand the three
categories of exceptions:

Checked exceptions: A checked exception is an exception that is typically a user error or
a problem that cannot be foreseen by the programmer. For example, if a file is to be
opened, but the file cannot be found, an exception occurs. These exceptions cannot
simply be ignored at the time of compilation.
Runtime exceptions: A runtime exception is an exception that occurs that probably
could have been avoided by the programmer. As opposed to checked exceptions, runtime
exceptions are ignored at the time of compilation.
Errors: These are not exceptions at all, but problems that arise beyond the control of the
user or the programmer. Errors are typically ignored in your code because you can rarely
do anything about an error. For example, if a stack overflow occurs, an error will arise.
They are also ignored at the time of compilation.
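The first two categories can be sketched as follows (the class, method, and file names are ours, for illustration). The compiler insists that the checked IOException be caught or declared, while the runtime ArrayIndexOutOfBoundsException would compile even without a handler:

```java
import java.io.FileInputStream;
import java.io.IOException;

public class ExceptionKinds {
    // Checked: omitting this handler (or a throws clause) is a compile error.
    public static String openMissingFile() {
        try {
            new FileInputStream("no_such_file_for_demo.txt");
            return "opened";
        } catch (IOException e) {
            return "checked: " + e.getClass().getSimpleName();
        }
    }
    // Runtime: the handler is optional; we catch it here only to observe it.
    public static String badIndex() {
        int[] a = new int[2];
        try {
            return "value: " + a[3];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "runtime: caught anyway";
        }
    }
    public static void main(String[] args) {
        System.out.println(openMissingFile());
        System.out.println(badIndex());
    }
}
```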

Catching Exceptions:
A method catches an exception using a combination of the try and catch keywords. A try/catch
block is placed around the code that might generate an exception. Code within a try/catch block
is referred to as protected code, and the syntax for using try/catch looks like the following:
try
{
//Protected code
}catch(ExceptionName e1)
{
//Catch block
}
A catch statement involves declaring the type of exception you are trying to catch. If an
exception occurs in protected code, the catch block (or blocks) that follows the try is checked. If
the type of exception that occurred is listed in a catch block, the exception is passed to the catch
block much as an argument is passed into a method parameter.
Example:
In the following example, an array is declared with 2 elements. The code then tries to access the
3rd element of the array, which throws an exception.
// File Name : ExcepTest.java
import java.io.*;
public class ExcepTest{
public static void main(String args[]){
try{
int a[] = new int[2];
System.out.println("Access element three :" + a[3]);
}catch(ArrayIndexOutOfBoundsException e){
System.out.println("Exception thrown :" + e);
}
System.out.println("Out of the block");
}
}
This would produce the following result:
Exception thrown :java.lang.ArrayIndexOutOfBoundsException: 3
Out of the block
Multiple catch Blocks:
A try block can be followed by multiple catch blocks. The syntax for multiple catch blocks looks
like the following:
try
{
//Protected code
}catch(ExceptionType1 e1)
{
//Catch block
}catch(ExceptionType2 e2)
{
//Catch block
}catch(ExceptionType3 e3)
{
//Catch block
}
The previous statements demonstrate three catch blocks, but you can have any number of them
after a single try. If an exception occurs in the protected code, the exception is thrown to the first
catch block in the list. If the data type of the exception thrown matches ExceptionType1, it gets
caught there. If not, the exception passes down to the second catch statement. This continues
until the exception either is caught or falls through all catches, in which case the current method
stops execution and the exception propagates to the calling method on the call stack.

Example:
Here is a code segment showing how to use multiple catch blocks.
try
{
file = new FileInputStream(fileName);
x = (byte) file.read();
}catch(IOException i)
{
i.printStackTrace();
return -1;
}catch(FileNotFoundException f) //Not valid!
{
f.printStackTrace();
return -1;
}
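The second handler above is marked "Not valid!" because FileNotFoundException is a subclass of IOException: the first block would already catch it, making the second unreachable, and the Java compiler rejects unreachable catch blocks. A corrected sketch (names are ours) lists the more specific type first:

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class MultiCatch {
    // The more specific FileNotFoundException must precede its
    // superclass IOException, or the second handler is unreachable.
    public static int firstByte(String fileName) {
        try {
            FileInputStream file = new FileInputStream(fileName);
            return file.read();
        } catch (FileNotFoundException f) {
            return -1;
        } catch (IOException i) {
            return -2;
        }
    }
    public static void main(String[] args) {
        System.out.println(firstByte("no_such_file_for_demo.txt"));  // prints -1
    }
}
```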

49. Object-Oriented Programming
OBJECT-ORIENTED PROGRAMMING (OOP) represents an attempt to make programs more
closely model the way people think about and deal with the world. In the older styles of
programming, a programmer who is faced with some problem must identify a computing task
that needs to be performed in order to solve the problem. Programming then consists of finding a
sequence of instructions that will accomplish that task. But at the heart of object-oriented
programming, instead of tasks we find objects: entities that have behaviors, that hold
information, and that can interact with one another. Programming consists of designing a set of
objects that model the problem at hand. Software objects in the program can represent real or
abstract entities in the problem domain. This is supposed to make the design of the program
more natural and hence easier to get right and easier to understand.
With the development of ever-more complicated computer applications, data abstraction has
become essential to software engineering. The abstraction provided by modules and module
types has at least three important benefits:
1. It reduces conceptual load by minimizing the amount of detail that the programmer must
think about at one time.
2. It provides fault containment by preventing the programmer from using a program
component in inappropriate ways, and by limiting the portion of a program's text in
which a given component can be used, thereby limiting the portion that must be
considered when searching for the cause of a bug.
3. It provides a significant degree of independence among program components, making it
easier to assign their construction to separate individuals, to modify their internal
implementations without changing external code that uses them, or to install them in a
library where they can be used by other programs.
Object-oriented programming can be seen as an attempt to enhance opportunities for code reuse
by making it easy to define new abstractions as extensions or refinements of existing
abstractions.
In contrast to the older, procedural approach, the object-oriented approach encourages the programmer to place data where it is not
directly accessible by the rest of the program. Instead, the data is accessed by calling specially
written functions, commonly called methods, which are either bundled in with the data or
inherited from "class objects." These act as the intermediaries for retrieving or modifying the
data they control. The programming construct that combines data with a set of methods for
accessing and managing those data is called an object. The practice of using subroutines to
examine or modify certain kinds of data, however, was also quite commonly used in non-OOP
modular programming, well before the widespread use of object-oriented programming.
An object-oriented program will usually contain different types of objects, each type
corresponding to a particular kind of complex data to be managed or perhaps to a real-world
object or concept such as a bank account, a hockey player, or a bulldozer. A program might well
contain multiple copies of each type of object, one for each of the real-world objects the program
is dealing with. For instance, there could be one bank account object for each real-world account
at a particular bank. Each copy of the bank account object would be alike in the methods it offers
for manipulating or reading its data, but the data inside each object would differ reflecting the
different history of each account.
Objects can be thought of as wrapping their data within a set of functions designed to ensure that
the data are used appropriately, and to assist in that use. The object's methods will typically
include checks and safeguards that are specific to the types of data the object contains. An object
can also offer simple-to-use, standardized methods for performing particular operations on its
data, while concealing the specifics of how those tasks are accomplished. In this way alterations
can be made to the internal structure or methods of an object without requiring that the rest of the
program be modified. This approach can also be used to offer standardized methods across
different types of objects. As an example, several different types of objects might offer print
methods. Each type of object might implement that print method in a different way, reflecting
the different kinds of data each contains, but all the different print methods might be called in the
same standardized manner from elsewhere in the program. These features become especially
useful when more than one programmer is contributing code to a project or when the goal is to
reuse code between projects.
An object-oriented program may thus be viewed as a collection of interacting objects, as opposed
to the conventional model, in which a program is seen as a list of tasks (subroutines) to perform.
In OOP, each object is capable of receiving messages, processing data, and sending messages to
other objects. Each object can be viewed as an independent "machine" with a distinct role or
responsibility. The actions (or "methods") on these objects are closely associated with the object.
For example, OOP data structures tend to "carry their own operators around with them" (or at
least "inherit" them from a similar object or class) - except when they have to be serialized.
Objects as a formal concept in programming were introduced in the 1960s in Simula 67, a major
revision of Simula I, a programming language designed for discrete event simulation, created by
Ole-Johan Dahl and Kristen Nygaard of the Norwegian Computing Center in Oslo. Simula 67
was influenced by SIMSCRIPT and C.A.R. "Tony" Hoare's proposed "record classes". Simula
introduced the notion of classes and instances or objects (as well as subclasses, virtual methods,
coroutines, and discrete event simulation) as part of an explicit programming paradigm. The
language also used automatic garbage collection that had been invented earlier for the functional
programming language Lisp. Simula was used for physical modeling, such as models to study
and improve the movement of ships and their content through cargo ports. The ideas of Simula
67 influenced many later languages, including Smalltalk, derivatives of LISP (CLOS), Object
Pascal, and C++.
The Smalltalk language, which was developed at Xerox PARC (by Alan Kay and others) in the
1970s, introduced the term object-oriented programming to represent the pervasive use of
objects and messages as the basis for computation. Smalltalk's creators were influenced by the
ideas introduced in Simula 67, but Smalltalk was designed to be a fully dynamic system in which
classes could be created and modified dynamically, rather than statically as in Simula 67.
Smalltalk and with it OOP were introduced to a wider audience by the August 1981 issue of Byte
Magazine.
In the 1970s, Kay's Smalltalk work had influenced the Lisp community to incorporate object-based techniques that were introduced to developers via the Lisp machine. Experimentation with
various extensions to Lisp (like LOOPS and Flavors introducing multiple inheritance and
mixins), eventually led to the Common Lisp Object System (CLOS, a part of the first
standardized object-oriented programming language, ANSI Common Lisp), which integrates
functional programming and object-oriented programming and allows extension via a metaobject protocol. In the 1980s, there were a few attempts to design processor architectures that
included hardware support for objects in memory but these were not successful. Examples
include the Intel iAPX 432 and the Linn Smart Rekursiv.
Object-oriented programming developed as the dominant programming methodology in the early
and mid 1990s when programming languages supporting the techniques became widely
available. These included Visual FoxPro 3.0, C++, and Delphi. Its dominance was further
enhanced by the rising popularity of graphical user interfaces, which rely heavily upon object-oriented programming techniques. An example of a closely related dynamic GUI library and
OOP language can be found in the Cocoa frameworks on Mac OS X, written in Objective-C, an
object-oriented, dynamic messaging extension to C based on Smalltalk. OOP toolkits also
enhanced the popularity of event-driven programming (although this concept is not limited to
OOP). Some feel that association with GUIs (real or perceived) was what propelled OOP into the
programming mainstream.
Object-oriented features have been added to many existing languages during that time, including
Ada, BASIC, Fortran, Pascal, and others. Adding these features to languages that were not
initially designed for them often led to problems with compatibility and maintainability of code.
More recently, a number of languages have emerged that are primarily object-oriented yet
compatible with procedural methodology, such as Python and Ruby. Probably the most
commercially important recent object-oriented languages are Visual Basic.NET (VB.NET) and
C#, both designed for Microsoft's .NET platform, and Java, developed by Sun Microsystems.
Both frameworks show the benefit of using OOP by creating an abstraction from implementation
in their own way. VB.NET and C# support cross-language inheritance, allowing classes defined
in one language to subclass classes defined in the other language. Developers usually compile
Java to bytecode, allowing Java to run on any operating system for which a Java virtual machine
is available. VB.NET and C# make use of the Strategy pattern to accomplish cross-language
inheritance, whereas Java makes use of the Adapter pattern.

50. Encapsulation and Inheritance
Encapsulation mechanism enable the programmer to group data and the sub-routines that
operate on them together in one place, and to hide irrelevant details from the users of an
abstraction.
As information hiding mechanism:
Under this definition, encapsulation means that the internal representation of an object is
generally hidden from view outside of the object's definition. Typically, only the object's own
methods can directly inspect or manipulate its fields. Some languages like Smalltalk and Ruby
only allow access via object methods, but most others (e.g. C++ or Java) offer the programmer a
degree of control over what is hidden, typically via keywords like public and private. It should be
noted that the ISO C++ standard refers to private and public as "access specifiers" and that they
do not "hide any information". Information hiding is accomplished by furnishing a compiled
version of the source code that is interfaced via a header file.
Hiding the internals of the object protects its integrity by preventing users from setting the
internal data of the component into an invalid or inconsistent state. A benefit of encapsulation is
that it can reduce system complexity, and thus increases robustness, by allowing the developer to
limit the interdependencies between software components.
Almost always, there is a way to override such protection - usually via reflection API (Ruby,
Java, C#, etc.), sometimes by mechanism like name mangling (Python), or special keyword
usage like friend in C++.
General Definition:
In general, Encapsulation is one of the 4 fundamentals of OOP (Object Oriented Programming).
Encapsulation means hiding the variables or other internal state inside a class, preventing
unauthorized parties from using them directly. Public methods such as getters and setters provide
access to the state, and other classes call these methods to read or modify it.
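A minimal Java sketch of this idea (the BankAccount class and its invariant are our own example): the field is private, so the public methods are the only gateway, and the mutator can reject updates that would leave the object in an invalid state:

```java
public class BankAccount {
    private int balance;  // hidden: not directly accessible from outside

    public int getBalance() { return balance; }

    // The mutator guards the invariant: only positive deposits are allowed.
    public void deposit(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount;
    }
}
```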
This mechanism is not unique to object-oriented programming. Implementations of abstract data
types, e.g. modules, offer a similar form of encapsulation.
Modules: Scope rules for data hiding were one of the principal innovations of Clu, Modula,
Euclid, and other module- based languages of the 1970s. In Clu and Euclid, the declaration and
definition of a module always appear together. The header clearly states which of the module's
names are to be exported. If a Euclid module M exports a type T, by default the remainder of the
program can do nothing with objects of type T other than pass them to subroutines exported from
M. T is said to be an opaque type.
In Modula-2, programmers have the option of separating the header and body of a module. The
only concession to data hiding is that a type may be made opaque by listing only its name in the
header:
TYPE T;
In this case variables of type T can only be assigned, compared for equality, and passed to the
module's subroutines.
Ada, which also allows the header and bodies of modules to be separated, eliminates the
problems of Modula-2 by allowing the header of a package to be divided into public and private
parts. A type can be exported opaquely by putting its definition in the private part of the header
and simply naming it in the public part:
package foo is                 -- header
    type T is private;
private                        -- definitions below here are inaccessible to users
    type T is ...              -- full definition
    ...
end foo;
When the header and body of a module appear in separate files, a change to a module body never
requires us to recompile any of the module's users. A change to the private part of a module
header may require us to recompile the module's users, but never requires us to change their
code. A change to the public part of a header is a change to the module's interface: it will often
require us to change the code of users.
Because they affect only the visibility of names, static, manager-style modules introduce no
special code generation issues. Storage for variables and other data inside a module is managed
in precisely the same way as storage for data immediately outside the module. If the module
appears in a global scope, then its data can be allocated statically. If the module appears within a
subroutine, then its data can be allocated on the stack, at known offsets, when a subroutine is
called, and reclaimed when it returns.
Inheritance:
In object-oriented programming (OOP), inheritance is a way to reuse code of existing objects,
establish a subtype from an existing object, or both, depending upon programming language
support. In classical inheritance where objects are defined by classes, classes can inherit
attributes and behavior (i.e., previously coded algorithms associated with a class) from pre-existing classes called base classes, superclasses, parent classes, or ancestor classes. The
new classes are known as derived classes or subclasses or child classes. The relationship of
classes through inheritance gives rise to a hierarchy. In prototype-based programming, objects
can be defined directly from other objects without the need to define any classes, in which case
this feature is called differential inheritance.
Subclasses and superclasses:
A subclass, or child class, is a modular, derivative class that inherits one or more properties from
another class (called the superclass, base class, or parent class). The superclass establishes a
common interface and foundational functionality, which specialized subclasses can inherit,
modify, and supplement. Because it offloads specialized operations to subclasses, a superclass is
more reusable. A subclass may customize or redefine a method inherited from the superclass; a
method that can be redefined in this way is called a virtual method.
Applications:
Inheritance is used to relate two or more classes to each other. With the use of inheritance, we
can use the methods and the instance variables of one class in other classes.
Overriding:
Many object-oriented programming languages permit a class or object to replace the
implementation of an aspect (typically a behavior) that it has inherited. This process is usually
called overriding. Overriding introduces a complication: which version of the behavior does an
instance of the inherited class use: the one that is part of its own class, or the one from the
parent (base) class? The answer varies between programming languages, and some languages
provide the ability to indicate that a particular behavior is not to be overridden and should behave
according to the base class. For instance, in C#, the overriding of a method must be specified
explicitly by the program. An alternative to overriding is hiding the inherited code.
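A minimal Java sketch (class names are ours): the subclass replaces an inherited method, dynamic dispatch selects the replacement even through a base-class reference, and the final keyword marks a behavior that may not be overridden:

```java
class Base {
    String greet() { return "base"; }
    final String id() { return "fixed"; }  // subclasses may not override this
}

class Derived extends Base {
    @Override
    String greet() { return "derived"; }  // replaces the inherited behavior
}

public class OverrideDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.greet());  // dynamic dispatch prints "derived"
    }
}
```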
Code reuse:
One of the earliest motivations for using inheritance was the reuse of code that already existed in
another class. This practice is usually called implementation inheritance. Before the object-oriented paradigm was in use, one had to write similar functionality over and over again. With
inheritance, the behaviour of a superclass can be inherited by subclasses. It is not only possible to
call the overridden behaviour (method) of the ancestor (superclass) before adding other
functionality; one can also override the behaviour of the ancestor completely.
For instance, when programming animal behaviour, there may be a class of bird, from which all
birds are derived. All birds may use the functionality of flying, but some may fly with different
techniques (for example, soaring on thermal winds, as albatrosses do). So a flying bird may use
all the behaviour of birds, or call it and add some other behaviour for its species. And a bird that
cannot fly anymore, like the kiwi, would override it with a method having no behaviour at all.
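The bird example can be sketched in Java (class names and the returned strings are ours, purely for illustration): Albatross reuses the inherited behaviour and adds to it, while Kiwi overrides it completely:

```java
class Bird {
    String fly() { return "flaps its wings"; }
}

class Albatross extends Bird {
    @Override
    String fly() {
        // Reuse the inherited behaviour, then extend it.
        return super.fly() + " and soars on thermal winds";
    }
}

class Kiwi extends Bird {
    @Override
    String fly() { return "cannot fly"; }  // replace the behaviour entirely
}

public class BirdDemo {
    public static void main(String[] args) {
        System.out.println(new Albatross().fly());
        System.out.println(new Kiwi().fly());
    }
}
```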
In most quarters, class inheritance for the sole purpose of code reuse has fallen out of favor. The
primary concern is that implementation inheritance does not provide any assurance of
polymorphic substitutability: an instance of the reusing class cannot necessarily be substituted
for an instance of the inherited class. An alternative technique, delegation, requires more
programming effort, but avoids the substitutability issue. In C++, private inheritance can be used
as a form of implementation inheritance without substitutability. Whereas public inheritance
represents an "is-a" relationship and delegation represents a "has-a" relationship, private (and
protected) inheritance can be thought of as an "is implemented in terms of" relationship.
Inheritance vs subtyping:
Subtyping enables a given type to be substituted for another type or abstraction. Subtyping is
said to establish an is-a relationship with some existing abstraction, either implicitly or
explicitly, depending on language support. The relationship can be expressed explicitly via
inheritance in languages that support inheritance as a subtyping mechanism. For example, the
following C++ code establishes an explicit inheritance relationship between classes B and A
where B is both a subclass and a subtype of A, and can be used as an A wherever a reference
(i.e., a reference or a pointer) to an A is specified.
class A
{ public:
void DoSomethingALike() const {}
};

class B : public A
{ public:
void DoSomethingBLike() const {}
};

void UseAnA(A const& some_A)
{
some_A.DoSomethingALike();
}

void SomeFunc()
{
B b;
UseAnA(b); // b can be substituted for an A.
}
In programming languages that do not support inheritance as a subtyping mechanism, the
relationship between a base class and a derived class is only a relationship between
implementations (i.e., a mechanism for code reuse), as compared to a relationship between types.
Inheritance, even in programming languages that support inheritance as a subtyping mechanism,
does not necessarily entail behavioral subtyping. It is entirely possible to derive a class whose
objects will behave incorrectly when used in a context where the parent class is expected. In
some, but not all OOP languages, the notions of code reuse and subtyping coincide because the
only way to declare a subtype is to define a new class that inherits the implementation of another.
Limitations and alternatives:
When using inheritance extensively in designing a program, one should note certain constraints
that it imposes.
For example, consider a class Person that contains a person's name, address, phone number, age,
gender, and race. We can define a subclass of Person called Student that contains the person's
grade point average and classes taken, and another subclass of Person called Employee that
contains the person's job-title, employer, and salary.
Design constraints:
Singleness: using single inheritance, a subclass can inherit from only one superclass.
Continuing the example given above, Person can be either a Student or an Employee, but
not both. Using multiple inheritance partially solves this problem, as one can then define
a StudentEmployee class that inherits from both Student and Employee. However, it can
still inherit from each superclass only once; this scheme does not support cases in which
a student has two jobs or attends two institutions.
Static: the inheritance hierarchy of an object is fixed at instantiation when the object's
type is selected and does not change with time. For example, the inheritance graph does
not allow a Student object to become an Employee object while retaining the state of its
Person superclass. (Although similar behavior can be achieved with the decorator
pattern.) Some have criticized inheritance, contending that it locks developers into their
original design standards.[4]
Visibility: whenever client code has access to an object, it generally has access to all the
object's superclass data. Even if the superclass has not been declared public, the client can
still cast the object to its superclass type. For example, there is no way to give a function
a pointer to a Student's grade point average and transcript without also giving that
function access to all of the personal data stored in the student's Person superclass. Many
modern languages, including C++ and Java, provide a "protected" access modifier that
allows subclasses to access the data, without allowing any code outside the chain of
inheritance to access it. This largely mitigates this issue.
The Composite reuse principle is an alternative to inheritance. This technique supports
polymorphism and code reuse by separating behaviors from the primary class hierarchy and
including specific behavior classes as required in any business domain class. This approach
avoids the static nature of a class hierarchy by allowing behavior modifications at run time and
allows a single class to implement behaviors buffet-style, instead of being restricted to the
behaviors of its ancestor classes.
Nesting (Inner Classes): Many languages allow class declarations to nest. C++ and C# allow a
nested class to access only the static members of the outer class, since these have only a single instance. In
effect, nesting serves simply as a means of information hiding. Java takes a more sophisticated
approach. It allows a nested class to access arbitrary members of its surrounding class. Each
instance of the inner class must therefore belong to an instance of the outer class.
51.Constructors
In object-oriented programming, a constructor (sometimes shortened to ctor) in a class is a
special type of subroutine called at the creation of an object. It prepares the new object for use,
often accepting parameters which the constructor uses to set any member variables required
when the object is first created. It is called a constructor because it constructs the values of data
members of the class.
A constructor resembles an instance method, but it differs from a method in that it never has an
explicit return-type, it is not inherited (though many languages provide access to the superclass's
constructor, for example through the super keyword in Java), and it usually has different rules for
scope modifiers. Constructors are often distinguished by having the same name as the declaring
class. They have the task of initializing the object's data members and of establishing the
invariant of the class, failing if the invariant isn't valid. A properly written constructor will leave
the object in a valid state. Immutable objects must be initialized in a constructor.
Types of constructors:
1. Parameterized constructors
Constructors that can take arguments are termed parameterized constructors. The number of
arguments can be greater than or equal to one. For example:
class example
{
int p, q;
public:
example(int a, int b);
};
example :: example(int a, int b) // parameterized constructor
{
p = a;
q = b;
}
When an object is declared in a parameterized constructor, the initial values have to be passed as
arguments to the constructor function. The normal way of object declaration may not work. The
constructors can be called explicitly or implicitly. The method of calling the constructor
implicitly is also called the shorthand method.
example e = example(0, 50); // explicit call
example e(0, 50); // implicit call
2. Default constructors
Default constructors define the actions to be performed by the compiler when a class object is
instantiated without actual parameters. Their sole purpose is to initialize the data members with
default values, so that they are not left holding garbage values.
3. Copy constructors
Copy constructors define the actions performed by the compiler when copying class objects. A
copy constructor has one formal parameter that is the type of the class (the parameter may be a
reference to an object).
It is used to create a copy of an existing object of the same class. Even though both objects are of
the same class, a copy constructor still counts as a form of conversion constructor.
4. Dynamic constructors
Allocation of memory to objects at the time of their construction is known as dynamic
construction of objects, and such constructors are called dynamic constructors. This results in
saving of memory, as it enables the system to allocate the right amount of memory for each
object when the objects are not all of the same size. The memory is allocated with the help of the
new operator. For example,
class String
{
char *name;
int length;
public:
String()                          // constructor - 1
{
length = 0;
name = new char[length + 1];
}
String(char *e)                   // constructor - 2
{
length = strlen(e);
name = new char[length + 1];
strcpy(name, e);
}
void join(String &a, String &b);
};
void String :: join(String &a, String &b)
{
length = a.length + b.length;
delete[] name;
name = new char[length + 1];      // dynamic allocation
strcpy(name, a.name);
strcat(name, b.name);
}
5. Conversion constructors
Conversion constructors provide a means for a compiler to implicitly create an object of a class
from an object of another type. This type of constructor differs from a copy constructor in that it
creates an object from an object of another class, whereas a copy constructor creates one from an
object of the same class.
Syntax:
Java, C++, C#, ActionScript, and PHP 4 have a naming convention in which
constructors have the same name as the class with which they are associated.
In PHP 5, the recommended name for a constructor is __construct. For backwards
compatibility, a method with the same name as the class will be called if the __construct
method cannot be found. Since PHP 5.3.3, this works only for non-namespaced classes.
In Perl, constructors are, by convention, named "new" and have to perform a fair amount
of the object creation themselves.
In Visual Basic .NET, the constructor is called "New".
In Python, the constructor is called "__init__" and is always passed the newly created
instance as its first argument, conventionally named "self".
Java:
In Java, some of the differences between other methods and constructors are:
Constructors never have an explicit return type.
Constructors cannot be directly invoked (the keyword new must be used).
Constructors cannot be synchronized, final, abstract, native, or static.
Constructors are always executed by the same thread.
In Java, C#, and VB .NET, for reference types the constructor creates objects in a special part of
memory called the heap, while value types (such as int, double, etc.) are created in a sequential
memory area called the stack. VB .NET and C# allow the use of new to create objects of value
types, but even then the objects are created on the stack. In C++, when a constructor is invoked
without new the object is created on the stack; when an object is created using new it is created
on the heap and must later be destroyed, either explicitly by a call to operator delete or
implicitly, for example by the destructor of a smart pointer that owns it.
Most languages provide a default constructor if the programmer provides no constructor.
However, this language-provided constructor is taken away as soon as the programmer provides
any constructor in the class code. In C++ a default constructor is REQUIRED if an array of class
objects is to be created. Other languages (Java, C#, VB .NET) have no such restriction.
In C++ the copy constructor is called implicitly when class objects are returned from a method
by the return mechanism or when class objects are passed by value to a function. C++ provides a
copy constructor if the programmer does not declare one; unlike the default constructor, it is not
taken away when other constructors are declared, but it is suppressed once the programmer
declares a copy constructor. The compiler-provided copy constructor ONLY makes member-wise
(shallow) copies. For deep copies, a programmer-written copy constructor that makes deep
copies is required. Generally a rule of three is observed: for a class that should have a copy
constructor to make deep copies, the three below must be provided. 1. Copy constructor. 2.
Overloading of the assignment operator. 3. A destructor. The above is called the rule of three in
C++. If cloning of objects is not desired in C++ then the copy constructor must be declared
private.
public class Example
{
//definition of the constructor.
public Example()
{
this(1);
}
//overloading a constructor
public Example(int input)
{
data = input; //This is an assignment
}
//declaration of instance variable(s).
private int data;
}
//code somewhere else
//instantiating an object with the above constructor
Example e = new Example(42);
Visual Basic .NET:
In Visual Basic .NET, constructors use a method declaration with the name "New".
Class Foobar
Private strData As String
' Constructor
Public Sub New(ByVal someParam As String)
strData = someParam
End Sub
End Class
' code somewhere else
' instantiating an object with the above constructor
Dim foo As New Foobar(".NET")
C#:
In C#, a constructor is written as follows.
public class MyClass
{
private int a;
private string b;
//constructor
public MyClass() : this(42, "string")
{
}
//overloading a constructor
public MyClass(int a, string b)
{
this.a = a;
this.b = b;
}
}
//code somewhere
//instantiating an object with the constructor above
MyClass c = new MyClass(42, "string");
C++:
In C++, the name of the constructor is the name of the class. It does not return anything. It can
have parameters, like any member function (method). Constructor functions should be declared
in the public section.
The constructor has two parts. First is the initializer list, which comes after the parameter list
and before the opening curly bracket of the method's body. It starts with a colon, and its entries
are separated by commas. You are not always required to have an initializer list, but it gives the
opportunity to construct data members directly from parameters, saving work (one construction
instead of a construction and an assignment). Sometimes you must have an initializer list, for
example if you have const or reference type data members, or members that cannot be default
constructed (they have no parameterless constructor). The order of the list should be the order of the declaration
of the data members, because that is the order in which they are executed. The second part is the body, which is a
normal method body surrounded by curly brackets.
C++ allows more than one constructor. The other constructors must have different parameter
lists; parameters may also be given default values. The constructor of a base class (or base
classes) can also be called by a derived class. Constructor functions cannot be inherited and their
addresses cannot be taken. When memory allocation is required, the operators new and delete are used implicitly.
A copy constructor has a parameter of the same type passed as const reference, for example
Vector(const Vector& rhs). If it is not implemented by hand the compiler gives a default
implementation which uses the copy constructor for each member variable or simply copies
values in case of primitive types. The default implementation is not efficient if the class has
dynamically allocated members (or handles to other resources), because it can lead to double
calls to delete (or double release of resources) upon destruction.
class Foobar {
public:
Foobar(double r = 1.0, double alpha = 0.0) // Constructor, parameters with default values.
: x(r*cos(alpha)) // <- Initializer list
{
y = r*sin(alpha); // <- Normal assignment
}
// Other member functions
private:
double x; // Data members, they should be private
double y;
};
Example invocations:
Foobar a,
b(3),
c(5, M_PI/4);
You can declare private data members in a section at the top of the class, before the public
specifier.
Eiffel:
In Eiffel, the routines which initialize new objects are called creation procedures. They are
similar to constructors in some ways and different in others. Creation procedures have the
following traits:
Creation procedures never have an explicit return type (by definition of procedure).
Creation procedures are named. Names are restricted only to valid identifiers.
Creation procedures are designated by name as creation procedures in the text of the
class.
Creation procedures can be directly invoked to re-initialize existing objects.
Every effective (i.e., concrete or non-abstract) class must designate at least one creation
procedure.
Creation procedures must leave the newly initialized object in a state that satisfies the
class invariant.
The keyword create introduces a list of procedures which can be used to initialize instances. In
this case the list includes default_create, a procedure with an empty implementation inherited
from class ANY, and the make procedure coded within the class.
class
POINT
create
default_create, make
feature
make (a_x_value: REAL; a_y_value: REAL)
do
x := a_x_value
y := a_y_value
end
x: REAL
-- X coordinate
y: REAL
-- Y coordinate
...
ColdFusion:
ColdFusion has no constructor method. Developers using it commonly create an 'init' method
that acts as a pseudo-constructor.
<cfcomponent displayname="Cheese">
<!--- properties --->
<cfset variables.cheeseName = "" />
<!--- pseudo-constructor --->
<cffunction name="init" returntype="Cheese">
<cfargument name="cheeseName" type="string" required="true" />
<cfset variables.cheeseName = arguments.cheeseName />
<cfreturn this />
</cffunction>
</cfcomponent>
Page 76

Principles of Programming Languages

Pascal:
In Object Pascal, the constructor is similar to a factory method. The only syntactic difference to
regular methods is the keyword constructor in front of the name (instead of procedure or
function). It can have any name, though the convention is to have Create as prefix, such as in
CreateWithFormatting. Creating an instance of a class works like calling a static method of a
class: TPerson.Create('Peter').
program Program;
interface
type
TPerson = class
private
FName: string;
public
property Name: string read FName;
constructor Create(AName: string);
end;
implementation
constructor TPerson.Create(AName: string);
begin
FName := AName;
end;
var
Person: TPerson;
begin
Person := TPerson.Create('Peter'); // allocates an instance of TPerson and then calls
// TPerson.Create with the parameter AName = 'Peter'
end;
Perl:
In Perl version 5, by default, constructors must provide code to create the object (a reference,
usually a hash reference, but sometimes an array reference, scalar reference or code reference)
and bless it into the correct class. By convention the constructor is named new, but it is not
required to use that name, nor is it required to be the only constructor. For example, a Person class may have a constructor
named new as well as a constructor new_from_file which reads a file for Person attributes, and
new_from_person which uses another Person object as a template.
package Person;
use strict;
use warnings;
# constructor
sub new {
# class name is passed in as 0th
# argument
my $class = shift;
# check if the arguments to the
# constructor are key => value pairs
die "$class needs arguments as key => value pairs"
unless (@_ % 2 == 0);
# default arguments
my %defaults;
# create object as combination of default
# values and arguments passed
my $obj = {
%defaults,
@_,
};
# check for required arguments
die "Need first_name and last_name for Person"
unless ($obj->{first_name} and $obj->{last_name});
# any custom checks of data
if ($obj->{age} && $obj->{age} < 18) { # no under-18s
die "No under-18 Persons";
}
# return object blessed into Person class
bless $obj, $class;
}
1;
PHP:
In PHP (version 5 and above), the constructor is a method named __construct(), which the
keyword new automatically calls after creating the object. It is usually used to automatically
perform various initializations such as property initializations. Constructors can also accept
arguments, in which case, when the new statement is written, you also need to send the
constructor the function parameters in between the parentheses.
class Person
{
private $name;
function __construct($name)
{
$this->name = $name;
}
function getName()
{
return $this->name;
}
}
52.Destructors
In object-oriented programming, a destructor (sometimes shortened to dtor) is a method which
is automatically invoked when the object is destroyed. Its main purpose is to clean up and to free
the resources (which include closing database connections, releasing network resources,
relinquishing resource locks, etc.) which were acquired by the object during its life cycle, and to
unlink it from other objects or resources, invalidating any references in the process.
In binary programs compiled with the GNU C Compiler, special table sections called .dtors are
made for destructors. This table contains an array of addresses to the relevant functions that are
called when the main() function exits.[1] A function can be declared as a destructor function by
defining the destructor attribute __attribute__ ((destructor)) for a static function.
In a language with an automatic garbage collection mechanism, it would be difficult to
deterministically ensure the invocation of a destructor, and hence these languages are generally
considered unsuitable for RAII. In such languages, unlinking an object from existing resources
must be done by an explicit call of an appropriate function (usually called Dispose()).
Destructor syntax:
C++ has a naming convention in which destructors have the same name as the class with
which they are associated, but prefixed with a tilde (~).
In Object Pascal, destructors have the keyword "destructor" and can have user-defined
names (but are mostly called "Destroy").
In Perl, the destructor method is named DESTROY.
In Objective-C, the destructor method is named "dealloc".
In PHP 5, the destructor method is named "__destruct". There were no destructors in
previous versions of PHP.
In C++:
The destructor has the same name as the class, but with a tilde (~) in front of it. If the object was
created as an automatic variable, its destructor is automatically called when it goes out of scope.
If the object was created with a new expression, then its destructor is called when the delete
operator is applied to a pointer to the object. Usually that operation occurs within another
destructor, typically the destructor of a smart pointer object.
In inheritance hierarchies, the declaration of a virtual destructor in the base class ensures that the
destructors of derived classes are invoked properly when an object is deleted through a
pointer-to-base-class. Objects that may be deleted in this way need to inherit a virtual destructor.
A destructor should never throw an exception.
Example:
#include <iostream>
#include <string>
class foo
{
public:
foo( void )
{
print( "foo()" );
}
~foo( void )
{
print( "~foo()" );
}
void print( std::string const& text )
{
std::cout << static_cast< void* >( this ) << " : " << text << std::endl;
}
/*
Disable copy-constructor and assignment operator by making them private
*/
private:
foo( foo const& );
foo& operator = ( foo const& );
};
int main( void )
{
foo array[ 3 ];
/*
When 'main' terminates, the destructor is invoked for each element
in 'array' (the first object created is last to be destroyed)
*/
}
In C with GCC extensions:
The GNU Compiler Collection's C compiler comes with two extensions that allow destructors to
be implemented:
the "destructor" function attribute allows defining global prioritized destructor functions:
when main() returns, these functions are called in priority order before the process
terminates;
the "cleanup" variable attribute allows attaching a destructor function to a variable: the
function is called when the variable goes out of scope.
REALbasic:
Destructors in REALbasic can be in one of two forms. Each form uses a regular method
declaration with a special name (with no parameters and no return value). The older form uses
the same name as the Class itself with a ~ (tilde) prefix. The newer form uses the name
"Destructor". The newer form is the preferred one because it makes refactoring the class easier.
Class Foobar
// Old form
Sub ~Foobar()
End Sub
// New form
Sub Destructor()
End Sub
End Class
53.Dynamic Method Binding
In computer science, dynamic dispatch (also known as dynamic binding) is the process of
mapping a message to a specific sequence of code (method) at runtime. This is done to support
the cases where the appropriate method can't be determined at compile-time (i.e. statically).
Dynamic dispatch is only used for code invocation and not for other binding processes (such as
for global variables) and the name is normally only used to describe a language feature where a
runtime decision is required to determine which code to invoke.
This Object-Oriented feature allows substituting a particular implementation using the same
interface, and therefore it enables polymorphism.
Single and multiple dispatch:
Dynamic dispatch is needed when multiple classes contain different implementations of the
same method (for example foo()). If the class of an object x is not known at compile-time, then
when x.foo() is called, the program must decide at runtime which implementation of foo() to
invoke, based on the runtime type of object x. This case is known as single dispatch because an
implementation is chosen based on a single type: that of the instance. Single
dispatch is supported by many object-oriented languages, including statically typed languages
such as C++ and Java, and dynamically typed languages such as Smalltalk and Objective-C.
In a small number of languages such as Common Lisp and Dylan, methods or functions can also
be dynamically dispatched based on the type of arguments. Expressed in pseudocode, the code
manager.handle(y) could call different implementations depending on the type of object y. This
is known as multiple dispatch.
Dynamic dispatch mechanisms:
A language may be implemented with different dynamic dispatch mechanisms. The choices of
the dynamic dispatch mechanism offered by a language to a large extent alter the programming
paradigms that are available or are most natural to use within a given language.
Normally, in a typed language, the dispatch mechanism will be performed based on the type of
the arguments (most commonly based on the type of the receiver of a message). This might be
dubbed 'per type dynamic dispatch'. Languages with weak or no typing systems often carry a
dispatch table as part of the object data for each object. This allows per-instance behaviour, as
each instance may map a given message to a separate method.
Some languages offer a hybrid approach.
Dynamic dispatch will always incur an overhead so some languages offer the option to turn
dynamic dispatch off for particular methods.
C++ Implementation:
C++ uses a virtual table that defines the message to method mapping for a given class. Instances
of that type will then store a pointer to this table as part of their instance data. This is
complicated when multiple inheritance is used. The virtual table in a C++ object cannot be
modified at run-time, which limits the potential set of dispatch targets to a finite set chosen at
compile-time.
Although the overhead involved in this dispatch mechanism is low, it may still be significant for
some application areas that the language was designed to target. For this reason, Bjarne
Stroustrup, the designer of C++, elected to make dynamic dispatch optional and non-default.
Only functions declared with the virtual keyword will be dispatched based on the runtime type of
the object; other functions will be dispatched based on the object's static type.
Type overloading does not produce dynamic dispatch in C++ as the language considers the types
of the message parameters part of the formal message name. This means that the message name
the programmer sees is not the formal name used for binding.
JavaScript Implementation:
JavaScript stores methods as part of the normal instance data. Methods may be added to an
object at any point after it is defined, and may be changed at any time. This allows for great
flexibility in the object models used to implement systems with the object definitions being
determined at runtime. JavaScript is in the family of prototype-based languages, and as such
method lookup must search the method dictionaries of parent prototype objects. Caching
strategies allow this to perform well.
There is a separate mechanism that allows a class template to be created with a single constructor
and to have default methods attached to it. These default methods will then be available to every
instance of that type.
Smalltalk Implementation:
Smalltalk uses a type based message dispatcher. Each instance has a single type whose definition
contains the methods. When an instance receives a message, the dispatcher looks up the
corresponding method in the message-to-method map for the type and then invokes the method.
A naive implementation of Smalltalk's mechanism would seem to have a significantly higher
overhead than that of C++ and this overhead would be incurred for each and every message that
an object receives.
In real Smalltalk implementations, a technique known as inline caching is often used that makes
method dispatch very fast. Inline caching basically stores the previous destination method
address and object class of this call site (or multiple pairs for multi-way caching). The cache
method is initialized with the most common target method (or just the cache miss handler), based
on the method selector. When the method call site is reached during execution, it just calls the
address in the cache (in a dynamic code generator, this call is a direct call as the direct address is
back patched by cache miss logic). Prologue code in the called method then compares the cached
class with the actual object class, and if they don't match, execution branches to a cache miss
handler to find the correct method in the class. A fast implementation may have multiple cache
entries and it often only takes a couple instructions to get execution to the correct method on an
initial cache miss. The common case will be a cached class match, and execution will just
continue in the method.
54.Multiple Inheritance
Multiple inheritance is a feature of some object-oriented computer programming languages in
which a class can inherit behaviors and features from more than one superclass.
Languages that support multiple inheritance include: C++, Common Lisp (via CLOS), EuLisp
(via The EuLisp Object System TELOS), Curl, Dylan, Eiffel, Logtalk, Object REXX, Scala (via
the use of mixin classes), OCaml, Perl, Perl 6, Python, and Tcl (via Incremental Tcl).
Some object-oriented languages, such as C#, Java, and Ruby, implement single inheritance,
although protocols, or "interfaces", provide some of the functionality of true multiple inheritance.
Other object-oriented languages, such as PHP, implement traits, which can be used to bring in
multiple sets of functionality.
Overview
In object-oriented programming (OOP), inheritance describes a relationship between two types,
or classes, of objects in which one is said to be a "subtype" or "child" of the other. The child
inherits features of the parent, allowing for shared functionality. For example, one might create a
class "Mammal" with features such as eating, reproducing, etc.; then define a subtype
"Cat" that inherits those features without having to explicitly program them, while adding new
features like "chasing mice".
If, however, one wants to use more than one totally orthogonal hierarchy simultaneously, such as
allowing "Cat" to inherit from "Cartoon character" and "Pet" as well as "Mammal", lack of
multiple inheritance often results in a very awkwardly mixed hierarchy, or forces functionality to
be rewritten in more than one place (with the attendant maintenance problems).
Languages have different ways of dealing with problems of repeated inheritance.

C++ requires the programmer to state which parent class the feature to be used is invoked
from i.e. "Worker::Human.Age". C++ does not support explicit repeated inheritance since
there would be no way to qualify which superclass to use (see criticisms). C++ also
allows a single instance of the multiply inherited class to be created via the virtual
inheritance mechanism (i.e. "Worker::Human" and "Musician::Human" will reference the
same object).
The Common Lisp Object System allows full programmer control of method
combination, and if this is not enough, the Metaobject Protocol gives the programmer a
means to modify the inheritance, method dispatch, class instantiation, and other internal
mechanisms without affecting the stability of the system.
Curl allows only classes that are explicitly marked as shared to be inherited repeatedly.
Shared classes must define a secondary constructor for each regular constructor in the
class. The regular constructor is called the first time the state for the shared class is
initialized through a subclass constructor, and the secondary constructor will be invoked
for all other subclasses.
Eiffel allows the programmer to explicitly join or separate features that are being
inherited from superclasses. Eiffel will automatically join features together, if they have
the same name and implementation. The class writer has the option to rename the
inherited features to separate them. Eiffel also allows explicit repeated inheritance such
as A: B, B.
Perl uses the list of classes to inherit from as an ordered list. The compiler uses the first
method it finds by depth-first searching of the superclass list or using the C3 linearization
of the class hierarchy. Various extensions provide alternative class composition schemes.
The order of inheritance affects the class semantics (see criticisms).
Python has the same structure as Perl, but unlike Perl includes it in the syntax of the
language. The order of inheritance affects the class semantics (see criticisms).
Tcl allows multiple parent classes; their order affects the name resolution for class
members.
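The linearization strategy the text attributes to Python can be observed directly. The following is a minimal illustrative sketch, not from the original notes; the class names echo the Mammal/Cat example above, and C3 linearization decides which inherited method wins in the diamond:

```python
# A diamond: Cat inherits from Pet and CartoonCharacter, which share Mammal.
class Mammal:
    def describe(self):
        return "mammal"

class Pet(Mammal):
    def describe(self):
        return "pet"

class CartoonCharacter(Mammal):
    def describe(self):
        return "cartoon character"

class Cat(Pet, CartoonCharacter):
    pass  # describe() is resolved via the method resolution order (MRO)

# C3 linearization orders the superclasses; each class appears exactly once.
print([c.__name__ for c in Cat.__mro__])
# Pet precedes CartoonCharacter in the base list, so its describe() is chosen.
print(Cat().describe())
```

Because the base-class list is ordered, swapping Pet and CartoonCharacter in the definition of Cat changes the result, which is exactly the "order of inheritance changes class semantics" criticism discussed below.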

Criticisms:
Multiple inheritance has been criticized by some for its unnecessary complexity and being
difficult to implement efficiently, though some projects have certainly benefited from its use.
Java, for example, has no multiple inheritance, as its designers felt that it would add unnecessary
complexity.
Multiple inheritance in languages with C++-/Java-style constructors exacerbates the inheritance
problem of constructors and constructor chaining, thereby creating maintenance and extensibility
problems in these languages. Objects in inheritance relationship with greatly varying
construction methods are hard to implement under the constructor-chaining paradigm.
Criticisms for the problems that it causes in certain languages, in particular C++, are:

Semantic ambiguity, often summarized as the diamond problem (although solvable by
using virtual inheritance or 'using' declarations).
Not being able to explicitly inherit multiple times from a single class (on the other hand,
this feature is criticized as non-object-oriented).
Order of inheritance changing class semantics (although it is the same with the order of
field declarations).

55.Functional Programming Concepts
Functional programming defines the outputs of a program as a mathematical function of the
inputs, with no notion of internal state, and thus no side effects.
Features of functional programming languages:
1. First class function values and higher order functions
2. Extensive polymorphism
3. List types and operations
4. Structured function returns
5. Constructors for structured objects
6. Garbage collection
A first-class value is one that can be passed as a parameter, returned from a subroutine, or
assigned into a variable. Under a strict interpretation of the term, first-class status also requires
the ability to create new values at run time. In the case of subroutines, this notion of first-class
status requires nested lambda expressions that can capture values defined in surrounding scopes.
A higher order function takes a function as an argument, or returns a function as a result.
Polymorphism is important in functional languages because it allows a function to be used on as
general a class of arguments as possible.
Lists are important in functional languages because they have a natural recursive definition, and
are easily manipulated by operating on their first element and the remainder of the list.
A pure functional language must provide completely general aggregates: because there is no way
to update existing objects, newly created ones must be initialized all at once.
Functional languages employ a heap for all dynamically allocated data.
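The first two features, first-class function values and higher-order functions, can be sketched in a few lines of Python. The compose helper and the names below are illustrative assumptions, not part of the original notes:

```python
# A higher-order function: takes two functions, returns a new function.
def compose(f, g):
    """Return the composition f after g."""
    return lambda x: f(g(x))

# Function values are first class: they can be bound to names...
square = lambda x: x * x
inc = lambda x: x + 1

# ...passed as arguments, and returned as results.
square_then_inc = compose(inc, square)
print(square_then_inc(3))   # inc(square(3)) = 9 + 1 = 10
```

Note that square_then_inc is itself a new function value created at run time, satisfying the strict interpretation of first-class status described above.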
Comparison of functional and imperative languages (Fortran, Ada, COBOL):-
1. Functional languages can have a very simple syntactic structure. Imperative languages
have a more complex syntax
2. The semantics of functional languages can also be simple compared to that of imperative
languages
3. Understanding concurrent programs in imperative languages is much more difficult
4. Programs written in functional programming languages were far slower than equivalent
programs in an imperative language
5. The closeness of functional programming to mathematics makes it less accessible
to many programmers

56.Overview of Scheme
Most Scheme implementations employ an interpreter that runs a read-eval-print loop. The
interpreter repeatedly reads an expression from standard input, evaluates that expression, and
prints the resulting value. If the user types
(+ 3 4)
The interpreter will print 7
If the user types 7
The interpreter will also print 7
Most Scheme implementations provide a load function that reads input from a file:
(load "my_Scheme_program")
Scheme uses Cambridge Polish notation for expressions. Parentheses indicate a function
application. The first expression inside the left parenthesis indicates the function; the remaining
expressions are its arguments. Suppose the user types: ((+ 3 4))
When it sees the inner set of parentheses, the interpreter will call the function +, passing 3 and 4
as arguments. Because of the outer set of parentheses, it will then attempt to call 7 as a
zero-argument function, a run-time error.
Extra parentheses change the semantics of Scheme programs:
(+ 3 4) => 7
((+ 3 4)) => error
Here the => means evaluate to. This symbol is not a part of the syntax of Scheme itself.
One can prevent the Scheme interpreter from evaluating a parenthesized expression by quoting
it:
(quote (+ 3 4)) => (+ 3 4)
Here the result is a three-element list. More commonly, quoting is specified with a special
shorthand notation consisting of a leading single quote mark:
'(+ 3 4) => (+ 3 4)
A symbol in Scheme is comparable to what other languages call an identifier. The lexical rules
for identifiers vary among Scheme implementations, but are in general much looser than they are
in other languages.
(symbol? 'x$_%:&=*!) => #t
The symbol #t represents the Boolean value true. False is represented by #f.
Lambda Expressions:
Lambda provides a method for defining nameless functions:
(lambda (X) (* X X)) => function

The first argument to lambda is a list of formal parameters for the function. The remaining
arguments constitute the body of the function.
Eg: (( lambda (X) (* X X )) 3) => 9

Bindings:- Names can be bound to values by introducing a nested scope:


(let ((a 3)
      (b 4)
      (square (lambda (X) (* X X)))
      (plus +))
  (sqrt (plus (square a) (square b)))) => 5.0
The special form let takes two or more arguments. The first of these is a list of pairs. In each
pair, the first element is a name and the second is the value that the name is to represent within
the remaining arguments to let. Remaining arguments are then evaluated in order; the value of
the construct as a whole is the value of the final argument.
The scope of the bindings produced by let is let's second argument only:
(let ((a 3))
  (let ((a 4)
        (b a))
    (+ a b))) => 7
Here b takes the value of the outer a.
The way in which names become visible all at once at the end of the declaration list precludes
the definition of recursive functions. For these one employs letrec:
(letrec ((fact
          (lambda (n)
            (if (= n 1) 1
                (* n (fact (- n 1)))))))
  (fact 5)) => 120
There is also a let* construct in which names become visible one at a time so that later ones can
make use of earlier ones, but not vice versa.
Lists and numbers:- Scheme provides a wealth of functions to manipulate lists. The three most
important are:
1. car
2. cdr
3. cons
car: the car function returns the first element of a given list.
Ex: (car '(2 3 4)) => 2
(car '((a b) c d)) => (a b)
(car '(a)) => a

cdr: the cdr function returns the remainder of a given list, after its car has been removed.
Ex: (cdr '(a b c)) => (b c)
(cdr '((a b) c d)) => (c d)
cons: cons is a primitive list constructor. It builds a list from its two arguments.
(cons 2 '(3 4)) => (2 3 4)
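The car/cdr/cons trio can be mimicked in Python by representing a cons cell as a two-element tuple, with None as the empty list. This is an illustrative sketch (the helper names are assumptions, not standard library functions):

```python
# A cons cell is a (head, tail) pair; None marks the end of the list.
def cons(head, tail):
    return (head, tail)

def car(cell):
    return cell[0]   # first element

def cdr(cell):
    return cell[1]   # remainder of the list

def from_list(items):
    """Build a cons-cell chain from a Python list, back to front."""
    result = None
    for item in reversed(items):
        result = cons(item, result)
    return result

lst = from_list([2, 3, 4])
print(car(lst))        # 2, like (car '(2 3 4))
print(car(cdr(lst)))   # 3
print(cons(2, from_list([3, 4])) == from_list([2, 3, 4]))  # True
```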

57.Features of LISP
Functional programming was introduced in 1958 in the form of LISP by John McCarthy.
Applications of LISP:
1. LISP is a versatile and a powerful language
2. LISP was developed for symbolic computation and list-processing applications,
which lie mainly in the AI field of computing
3. In many AI applications, LISP and its derivatives are still the standard language
4. Within AI a number of areas have been developed primarily through the use of
LISP. LISP also dominates in the areas of knowledge representation, machine
learning, intelligent training systems, and the modeling of speech
Features of LISP:

Provides list processing (which already existed in languages such as Information
Processing Language, IPL).
Uses a prefix notation (emphasizing the operator rather than the operands of an
expression); a + b is written in LISP as (+ a b). All expressions are enclosed in
parentheses and can be nested to arbitrary depth.
Uses the concept of function as widely as possible (cons for list construction; car and
cdr for extracting the components of a list).
There is a simple relationship between the text of an expression and its representation in
memory. An atom is a simple object such as a name or a number. A list is a data
structure composed of cons-cells (so called because they are constructed by the function
cons); each cons-cell has two pointers and each pointer points either to another cons-cell
or to an atom.
The names car and cdr originated in IBM 704 hardware; they are abbreviations for
contents of address register (the top 18 bits of a 36-bit word) and contents of
decrement register (the bottom 18 bits).
It is easy to translate between list expressions and the corresponding data structures.
There is a function eval (mentioned in the quotation above) that evaluates a stored list
expression. Consequently, it is straightforward to build languages and systems on top
of LISP and LISP is often used in this way.
It is interesting to note that the close relationship between code and data in LISP mimics
the von Neumann architecture at a higher level of abstraction.

Names. In procedural PLs, a name denotes a storage location (value semantics). In LISP,
a name is a reference to an object, not a location (reference semantics). In the Algol
sequence:
int n;
n := 2;
n := 3;
The declaration int n; assigns a name to a location, or box, that can contain an integer.
The next two statements put different values, first 2 then 3, into that box. In the LISP
Sequence:
(progn
(setq x (car structure))
(setq x (cdr structure)))
x becomes a reference first to (car structure) and then to (cdr structure). The two objects
have different memory addresses. A consequence of the use of names as references to
objects is that eventually there will be objects for which there are no references: these
objects are garbage and must be automatically reclaimed if the interpreter is not to run
out of memory. The alternative requiring the programmer to explicitly deallocate old
cells would add considerable complexity to the task of writing LISP programs.
Nevertheless, the decision to include automatic garbage collection (in 1958!) was
courageous and influential.
Lambda. LISP uses lambda expressions, based on Church's lambda calculus, to denote
functions.
For example, the function that squares its argument is written as
(lambda (x) (* x x))
For example, the expression ((lambda (x) (* x x)) 4)
yields the value 16.
However, the lambda expression itself cannot be evaluated. Consequently, LISP had to
resort to programming tricks to make higher order functions work. For example, if we
want to pass the squaring function as an argument to another function, we must wrap it
up in a special form called function:
(f (function (lambda (x) (* x x))) . . . .)
Dynamic Scoping. Dynamic scoping was an accidental feature of LISP: it arose as a
side-effect of the implementation of the look-up table for variable values used by the
interpreter. The C-like program in illustrates the difference between static and dynamic
scoping.
int x = 4; //1
void f()
{
printf("%d", x);
}
void main ()
{
int x = 7; //2
f();
}
The variable x in the body of the function f is a use of the global variable x defined in the
first line of the program. Since the value of this variable is 4, the program prints 4.
A LISP interpreter constructs its environment as it interprets. The environment behaves
like a stack (last in, first out). The initial environment is empty, which we denote by ⟨⟩.
After interpreting the LISP equivalent of the line commented with 1, the environment
contains the global binding for x: ⟨x = 4⟩. When the interpreter evaluates the function
main, it inserts the local x into the environment, obtaining ⟨x = 7, x = 4⟩. The interpreter
then evaluates the call f(); when it encounters x in the body of f, it uses the first value of x
in the environment and prints 7.

Although dynamic scoping is natural for an interpreter, it is inefficient for a compiler.


Interpreters are slow anyway, and the overhead of searching a linear list for a variable
value just makes them slightly slower still. A compiler, however, has more efficient ways
of accessing variables, and forcing it to maintain a linear list would be unacceptably
inefficient. Consequently, early LISP systems had an unfortunate discrepancy: the
interpreters used dynamic scoping and the compilers used static scoping.
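The stack-like environment described above can be sketched in a few lines of Python. The bind/lookup helpers are illustrative assumptions; the point is that searching the most recent bindings first is exactly what produces dynamic scoping:

```python
# The interpreter's environment: a last-in, first-out list of (name, value) pairs.
env = []

def bind(name, value):
    env.append((name, value))

def unbind():
    env.pop()

def lookup(name):
    # Search from the most recent binding to the oldest: dynamic scoping.
    for n, v in reversed(env):
        if n == name:
            return v
    raise NameError(name)

bind("x", 4)            # the global x = 4 from the C-like program above

def f():
    return lookup("x")  # which x? decided by the environment at call time

def main():
    bind("x", 7)        # main's local x = 7 shadows the global dynamically
    result = f()
    unbind()
    return result

print(main())  # 7 under dynamic scoping; static scoping would give 4
```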

Interpretation. LISP was the first major language to be interpreted. Originally, the LISP
interpreter behaved as a calculator: it evaluated expressions entered by the user, but its
internal state did not change. It was not long before a form for defining functions was
introduced to enable users to add their own functions to the list of built-in functions.
A LISP program has no real structure. On paper, a program is a list of function
definitions; the functions may invoke one another with either direct or indirect recursion.
At run-time, a program is the same list of functions, translated into internal form, added
to the interpreter.

Basic LISP Functions


The principal and only data structure in classical LISP is the list. Lists are written
(a b c). The following functions are provided for lists:
cons builds a list, given its head and tail.
first (originally called car) returns the head (first component) of a list.
rest (originally called cdr) returns the tail of a list.
second (originally called cadr, short for car of cdr) returns the second element of a list.
null is a predicate that returns true if its argument is the empty list.
atom is a predicate that returns true if its argument is not a list and must therefore be
an atom, that is, a variable name or other primitive object.
Dynamic Binding
The classical LISP interpreter was implemented while McCarthy was still designing the
language. It contains an important defect that was not discovered for some time. James
Slagle defined a function like this
(def testr (lambda (x p f u)
  (cond
    ((p x) (f x))
    ((atom x) (u))
    (t (testr (rest x) p f (lambda () (testr (first x) p f u)))))))
and it did not give the correct result.
We use a simpler example to explain the problem. Suppose we define:
(def show (lambda () x))
Calling this function generates an error because x is not defined. (The interpreter above
does not incorporate error detection and would simply fail with this function.) However,
we can wrap show inside another function:
(def try (lambda (x) (show)))
Now (try 3) yields 3: under the interpreter's dynamic binding, show finds the binding of x
created by the call to try. Under static binding, show would still fail, because no x is
visible where show is defined.
Correcting the LISP interpreter to provide static binding is not difficult: it requires a
slightly more complicated data structure for environments. However, the dynamic
binding problem was not discovered until LISP was well-entrenched: it would be almost
20 years before Guy Steele introduced Scheme, with static binding, in 1978.

58.Evaluation Order Revisited


The subcomponents of many expressions can be evaluated in more than one order. In particular,
one can choose to evaluate function arguments before passing them to a function, or to pass them
unevaluated. The former option is called applicative-order evaluation; the latter is called
normal-order evaluation. Like most imperative languages, Scheme uses applicative order in most
cases.
Applicative and normal-order evaluation:
(define double (lambda (X) (+ X X)))
Evaluating the expression (double (* 3 4)) in applicative order, we have
(double (* 3 4))
(double 12)
(+ 12 12)
24
Under normal-order evaluation we would have
(double (* 3 4))
(+ (* 3 4) (* 3 4))
(+ 12 (* 3 4))
(+ 12 12)
24
Evaluating the expression (switch -1 (+ 1 2) (+ 2 3) (+ 3 4)) in applicative order, we have
(switch -1 (+ 1 2) (+ 2 3) (+ 3 4))
(switch -1 3 (+ 2 3) (+ 3 4))
(switch -1 3 5 (+ 3 4))
(switch -1 3 5 7)
(cond ((< -1 0) 3)
      ((= -1 0) 5)
      ((> -1 0) 7))
(cond (#t 3)
      ((= -1 0) 5)
      ((> -1 0) 7))
=> 3
Under normal-order evaluation we would have
(switch -1 (+ 1 2) (+ 2 3) (+ 3 4))
(cond ((< -1 0) (+ 1 2))
      ((= -1 0) (+ 2 3))
      ((> -1 0) (+ 3 4)))
(cond (#t (+ 1 2))
      ((= -1 0) (+ 2 3))
      ((> -1 0) (+ 3 4)))
(+ 1 2)
3
Strictness and Lazy Evaluation:
Evaluation order can have an effect not only on execution speed, but on program correctness as
well. A program that encounters a dynamic semantic error or an infinite regression in an
unneeded subexpression under applicative order evaluation may terminate successfully under
normal- order evaluation. A function is said to be strict if it is undefined when any of its
arguments is undefined. Such a function can safely evaluate all its arguments, so its result will
not depend on evaluation order. A function is said to be nonstrict if it does not impose this
requirement- that is, if it is sometimes defined even when one of its arguments is not. A language
is said to be strict if it is defined in such a way that functions are always strict. A language is said
to be nonstrict if it permits the definition of nonstrict functions. If a language always evaluates
expressions in applicative order, then every function is guaranteed to be strict, because whenever
an argument is undefined, its evaluation will fail and so will the function to which it is being
passed.
Lazy evaluation gives us the advantage of normal order evaluation while running within a
constant factor of the speed of applicative order evaluation for expressions in which everything is
needed.
(double (* 3 4)) will be compiled as (double (f)), where f is a hidden closure with an internal
side effect:
(define f
  (let ((done #f)                ; memo initially unset
        (memo '())
        (code (lambda () (* 3 4))))
    (lambda ()
      (if done memo              ; if memo is set, return it
          (begin
            (set! done #t)
            (set! memo (code))   ; remember the value
            memo)))))            ; and return it

(double (f))
(+ (f) (f))
(+ 12 (f))    ; first call computes the value
(+ 12 12)     ; second call returns the remembered value
24

Lazy evaluation is particularly useful for infinite data structures. It can also be useful in
programs that need to examine only a prefix of a potentially long list. Lazy evaluation is used for
all arguments in Miranda and Haskell. It is available in Scheme through explicit use of delay and
force. Where normal-order evaluation can be thought of as function evaluation using
call-by-name parameters, lazy evaluation is sometimes said to employ call-by-need. In addition
to Miranda and Haskell, call-by-need can be found in the R scripting language, widely used by
statisticians.
The principal problem with lazy evaluation is its behavior in the presence of side effects. If an
argument contains a reference to a variable that may be modified by an assignment, then the
value of the argument will depend on whether it is evaluated before or after the assignment.
Likewise, if the argument contains an assignment, values elsewhere in the program may depend
on when evaluation occurs. These problems do not arise in Miranda or Haskell because they are
purely functional: there are no side effects.
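The memoized-thunk idea behind call-by-need can be sketched in Python. The make_thunk helper is an illustrative assumption, not a standard API; the counter shows that the suspended expression is evaluated at most once:

```python
calls = 0  # counts how many times the suspended expression is evaluated

def make_thunk(code):
    """Wrap a zero-argument function as a memoized thunk (call-by-need)."""
    done = False
    memo = None
    def force():
        nonlocal done, memo
        if not done:
            memo = code()   # first use: compute and remember the value
            done = True
        return memo         # later uses: return the remembered value
    return force

def expensive():
    global calls
    calls += 1
    return 3 * 4

def double(f):
    return f() + f()        # the argument thunk is forced twice...

f = make_thunk(expensive)
print(double(f))  # 24
print(calls)      # 1  ...but the expression was evaluated only once
```

This also shows why side effects are problematic under lazy evaluation: had expensive() depended on mutable state, its result would depend on exactly when the thunk was first forced.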

59.Higher order functions
A function is said to be a higher-order function if it takes a function as an argument, or returns a
function as a result.
Examples of higher-order functions: call/cc, for-each, compose, and apply.
Map function in Scheme:- Map calls its function argument on corresponding sets of elements
from the lists:
(map * '(2 4 6) '(3 5 7)) => (6 20 42)
Map is purely functional; it returns a list composed of the values returned by its function
argument.
Programmers in Scheme can easily define other higher order functions. Suppose, for example
that we want to be able to fold the elements of a list together, using an associative binary
operator:
(define fold (lambda (f i l)
  (if (null? l) i              ; i is commonly the identity element for f
      (f (car l) (fold f i (cdr l))))))
Now (fold + 0 '(1 2 3 4 5)) gives us the sum of the first five natural numbers, and
(fold * 1 '(1 2 3 4 5)) gives us their product.
One of the most common uses of higher order functions is to build new functions from existing
ones:
(define total (lambda (l) (fold + 0 l)))
(total '(1 2 3 4 5)) => 15
(define total-all (lambda (l) (map total l)))
(total-all '((1 2 3 4 5)
             (2 4 6 8 10)
             (3 6 9 12 15))) => (15 30 45)
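For comparison, a Python transliteration of the fold, total, and total-all definitions (illustrative only; Python's built-in functools.reduce folds from the left, whereas this version, like the Scheme one, recurses to the right):

```python
def fold(f, i, l):
    """Right fold: i is commonly the identity element for f."""
    if not l:
        return i
    return f(l[0], fold(f, i, l[1:]))

print(fold(lambda a, b: a + b, 0, [1, 2, 3, 4, 5]))  # 15, the sum
print(fold(lambda a, b: a * b, 1, [1, 2, 3, 4, 5]))  # 120, the product

# Building new functions from existing ones:
total = lambda l: fold(lambda a, b: a + b, 0, l)
total_all = lambda ls: list(map(total, ls))
print(total_all([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [3, 6, 9, 12, 15]]))
# [15, 30, 45]
```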

Currying:- To curry is to replace a multiargument function with a function that takes a single
argument and returns a function that expects the remaining arguments:
(define curried-plus (lambda (a) (lambda (b) (+ a b))))
((curried-plus 3) 4) => 7
(define plus-3 (curried-plus 3))
(plus-3 4) => 7
Among other things, currying gives us the ability to pass a partially applied function to a
higher-order function:
(map (curried-plus 3) '(1 2 3)) => (4 5 6)

ML and its descendants make it especially easy to define curried functions. Consider the
following function in ML:
fun plus (a, b) : int = a + b;
==> val plus = fn : int * int -> int
We can declare a single-argument function without parenthesizing its formal argument:
fun twice n : int = n + n;
==> val twice = fn : int -> int
twice 2;
==> val it = 4 : int
We can add parentheses in either the declaration or the call if we want, but because there is no
comma inside, no tuple is implied:
fun double (n) : int = n + n;
twice (2);
==>val it = 4 : int
twice 2;
==>val it = 4 : int
double (2);
==>val it = 4 : int
double 2;
==>val it = 4 : int
Ordinary parentheses can be placed around any expression in ML.
Using tuple notation, our fold function might be declared as follows in ML:
fun fold (f, i, l) =
  case l of
    nil => i
  | h :: t => f (h, fold (f, i, t));
==> val fold = fn : ('a * 'b -> 'b) * 'b * 'a list -> 'b

60.Logic Programming
Logic programming means the use of mathematical logic for computer programming.
Or
Programming that uses a form of symbolic logic as a programming language is often called logic
programming and languages based on symbolic logic are called logic programming languages or
declarative languages.
Applications:-
1. RDBMS: RDBMSs store data in the form of tables. Queries on such databases are
stated in SQL. Tables of information can be described by Prolog structures and
relationship between tables can be easily described by Prolog rules. One of the
advantages of using logic programming to implement an RDBMS is that only a
single language is required. The primary disadvantage of using logic
programming for an RDBMS, compared with a conventional RDBMS, is that the
logic programming implementation is slower.
2. Natural Language Processing- Certain kinds of natural language processing can
be done with logic programming. In particular natural language interfaces to
computer software systems, such as intelligent databases and other intelligent
knowledge base system can be conveniently done with logic programming.
3. Expert System:- Expert systems are computer systems designed to emulate human
expertise in some particular domain. Logic programming is suited to deal with the
problems like inconsistencies and incompleteness of the database in an expert
system. Prolog is used to construct expert systems. It can easily fulfill the basic
needs of expert systems using resolution as the basis for query processing.

61.Features of Prolog (Logic Programming Language)
Prolog is a general-purpose logic programming language associated with artificial intelligence
and computational linguistics.
Prolog was one of the first logic programming languages, and remains among the most popular
such languages today, with many free and commercial implementations available. While initially
aimed at natural language processing, the language has since then stretched far into other areas
like theorem proving, expert systems, games, automated answering systems, and sophisticated
control systems. Modern Prolog environments support creating graphical user interfaces, as well
as administrative and networked applications.
In Prolog, program logic is expressed in terms of relations, and a computation is initiated by
running a query over these relations. Relations and queries are constructed using Prolog's single
data type, the term.
Data types:-
Prolog's single data type is the term. Terms are either atoms, numbers, variables, or compound
terms.

An atom is a general-purpose name with no inherent meaning. Examples of atoms
include x, blue, 'Burrito', and 'some atom'.
Numbers can be floats or integers.
Variables are denoted by a string consisting of letters, numbers and underscore
characters, and beginning with an upper-case letter or underscore. Variables closely
resemble variables in logic in that they are placeholders for arbitrary terms.
A compound term is composed of an atom called a "functor" and a number of
"arguments", which are again terms. Compound terms are ordinarily written as a functor
followed by a comma-separated list of argument terms, which is contained in
parentheses. The number of arguments is called the term's arity. An atom can be regarded
as a compound term with arity zero. Examples of compound terms are
truck_year('Mazda', 1986)

Negation:-
The built-in Prolog predicate \+/1 provides negation as failure, which allows for non-monotonic
reasoning. The goal \+ legal(X) in the rule
illegal(X) :- \+ legal(X).
is evaluated as follows: Prolog attempts to prove legal(X). If a proof for that goal can be
found, the original goal (i.e., \+ legal(X)) fails. If no proof can be found, the original goal
succeeds. Therefore, the \+/1 prefix operator is called the "not provable" operator, since the
query ?- \+ Goal. succeeds if Goal is not provable. This kind of negation is sound if its argument
is "ground" (i.e. contains no variables).
Tail recursion
Prolog systems typically implement a well-known optimization method called tail call
optimization (TCO) for deterministic predicates exhibiting tail recursion or, more generally, tail
calls: A clause's stack frame is discarded before performing a call in a tail position. Therefore,
deterministic tail-recursive predicates are executed with constant stack space, like loops in other
languages.
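The effect of tail-call optimization can be seen by translating a tail-recursive definition into the loop that a TCO-capable system effectively produces. Python itself does not perform TCO, so this sketch is purely illustrative:

```python
# Tail-recursive length with an accumulator: the recursive call is the
# last thing the function does, so nothing remains on the stack to do.
def length_rec(l, acc=0):
    if not l:
        return acc
    return length_rec(l[1:], acc + 1)   # tail call

# A TCO-capable system turns that tail call into a jump: a loop that
# reuses the same stack frame, i.e. constant stack space.
def length_loop(l):
    acc = 0
    while l:
        l, acc = l[1:], acc + 1
    return acc

print(length_rec([1, 2, 3]))   # 3
print(length_loop([1, 2, 3]))  # 3
```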
Criticism:-
Although Prolog is widely used in research and education, Prolog and other logic programming
languages have not had a significant impact on the computer industry in general. Most
applications are small by industrial standards, with few exceeding 100,000 lines of code.
Programming in the large is considered to be complicated because not all Prolog compilers
support modules, and there are compatibility problems between the module systems of the major
Prolog compilers.

Extending Prolog rules:-
Using the preceding clause-form conversion for Prolog rules lets us give meaning to new kinds
of rules, rules not legal in Prolog. For instance this "pseudo-Prolog"
(a; b) :- c.
which means that either of a or b is true whenever c is true, becomes in clause form
a; b; not(c).
And this pseudo-Prolog
not(a) :- b.
which means a is false whenever b is true, becomes
not(a); not(b).
Notice that the first clause-form formula has two unnegated expressions, and the second has no
unnegated expressions. In general, any Prolog rule without nots becomes a clause form having
one and only one unnegated expression, what's called a Horn clause.
Clause form for a rule can require more than one "or"ed formula. As a more complicated
example, consider this pseudo-Prolog
(a; (b, c)) :- d, not(e).
which has the logical equivalent
a; (b, c); not(d); e.
To get rid of the "and", we can use the distributive law for "and" over "or". This gives two
separate statements (clauses), each of which must be true:
a; b; not(d); e.
a; c; not(d); e.
And that's the clause form for the original rule.

Rewriting rules in clause form answers some puzzling questions of why rules sometimes seem
"and"ed together and other times "or"ed together. Suppose we have two rules
a :- b.
a :- c.
The logical equivalent form is
(a; not(b)), (a; not(c)).
or:
a; (not(b), not(c)).
using the distributive law of "and" over "or". This can be rewritten as a single rule
a :- (b;c).
using DeMorgan's Law. So an "and" in the one sense--the "and" of the logical truth of separate
rules--is an "or" in another--the "or" of the right sides of rules with the same left side.

Resolution:-
Resolution is an inference technique that takes two clauses as input, and produces a clause as
output. The output clause, the resolvent, represents a true statement consistent with the input
clauses, the result of resolving them. In other words, the resolvent is one conclusion we can
draw. If the resolvent is a fact, then we've proved a fact. If the resolvent is the clause consisting
of no expressions, the null clause, we've proved a contradiction. Resolution is particularly
efficient for proof by contradiction: we assume the opposite of some statement we wish to prove,
and see if we can prove the null clause from it.
Resolution requires pairs of opposites in the two input clauses. That is, one input clause must
contain a predicate expression--call it P--for which not(Q) occurs in the other input clause and
where P can match Q by binding variables as necessary. (Formally, P matches Q if the
expression P=Q can succeed.) Then the resolvent of the two input clauses is the "or" of
everything besides P and not(Q) in the two clauses, eliminating any duplicate expressions. We
say that the P and the not(Q) "cancel". For instance, if the input clauses are
a; b; not(c); d.
e; not(b); a; f.
then the resolvent (output) clause is
a; not(c); d; e; f.
where we eliminated the opposites b and not(b), along with a duplicate occurrence of a.
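The cancellation step above can be sketched in Python. This is a hypothetical helper for the propositional case only (no variables): clauses are represented as sets of literal strings, with not(p) written "~p".

```python
def resolve(c1, c2):
    """Return one resolvent of two clauses, or None if they don't resolve.
    Clauses are sets of literals; not(p) is written "~p"."""
    def negate(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit
    for lit in c1:
        if negate(lit) in c2:
            # cancel the pair of opposites and "or" everything else together;
            # set union eliminates duplicate expressions automatically
            return (c1 - {lit}) | (c2 - {negate(lit)})
    return None   # no complementary pair: the clauses don't resolve

# the example from the text: b and not(b) cancel, the duplicate a is dropped
print(resolve({"a", "b", "~c", "d"}, {"e", "~b", "a", "f"}))
```

Note that resolving two single-literal clauses {"a"} and {"~a"} yields the empty set, which plays the role of the null clause.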

Resolution with variables: When predicates have variable arguments, resolution becomes a little more complicated: we still
look for a pair of opposites, but Prolog-style binding of the variables can be done to make the
canceling expressions "match". As with Prolog, bindings made to variables apply to any other
occurrences of the variables within their original clauses, so if a p(X) in the first input clause
matches a p(4) in the second input clause, any other X in the first clause becomes a 4. Variables
can also be bound to other variables. Important note: it's essential that each input clause have
different variable names before resolving.

Here's an example of resolution with variables. Suppose the two clauses are
a(3); b(Y); not(c(Z,Y)).
not(a(W)); b(dog); c(W,cat).
The a expressions can cancel with W bound to 3, giving:
b(Y); not(c(Z,Y)); b(dog); c(3,cat).
The b(dog) is redundant with b(Y), so we can improve this clause to:
b(Y); not(c(Z,Y)); c(3,cat).
But we could resolve the original two clauses another way. The c expressions could cancel, with
Z being bound to W and with Y being bound to cat, giving:
a(3); b(cat); not(a(W)); b(dog).
This is a completely different resolvent, representing a different conclusion possible from the
two clauses. Notice that we can't eliminate anything here; b(cat) and b(dog) aren't redundant,
nor are a(3) and not(a(W)).
Note that bindings are transitive: if A is bound to 9, and B is bound to A, then B is bound to 9
too. So several reasoning steps may be necessary to determine a variable binding.
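The transitive chasing of bindings can be sketched as follows (a hypothetical helper, not how any real Prolog is implemented):

```python
def lookup(term, bindings):
    # Follow binding chains: if A is bound to 9 and B is bound to A,
    # then B ultimately resolves to 9.
    while term in bindings:
        term = bindings[term]
    return term

bindings = {"B": "A", "A": 9}
print(lookup("B", bindings))   # follows B -> A -> 9
```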

62. Logical limitations of Prolog


Prolog can do many things. But it has four fundamental logical weaknesses:
1. Prolog doesn't allow "or"d (disjunctive) facts or conclusions--that is, statements that one of
several things is true, but you don't know which. For instance, if a light does not come on when
we turn on its switch, we can conclude that either the bulb is burned out or the power is off or the
light is disconnected.
2. Prolog doesn't allow "not" (negative) facts or conclusions--that is, direct statements that
something is false. For instance, if a light does not come on when we turn on its switch, but
another light in the same room comes on when we turn on its switch, we can conclude that it is
false that there is a power failure.
3. Prolog doesn't allow most facts or conclusions having existential quantification--that is,
statements that there exists some value of a variable, though we don't know what, such that a
predicate expression containing it is true.


Unit-3

63. Discuss string representations in different scripting languages

Strings in Perl:-

Strings are scalars. There is no limit to the size of a string; any number of characters, symbols,
or words can make up your strings.
In Perl strings can be formatted to your liking using formatting characters. Some of these
characters also work to format files created in PERL.
Character Description:
\L      Transform all letters to lowercase
\l      Transform the next letter to lowercase
\U      Transform all letters to uppercase
\u      Transform the next letter to uppercase
\n      Begin on a new line
\b      Backspace
\0nn    Creates octal formatted numbers
\xnn    Creates hexadecimal formatted numbers

Formatting characters:
$mystring = "welcome to tizag.com!"; # String to be formatted
$newline = "welcome to \ntizag.com!";
$capital = "\uwelcome to tizag.com!";
$ALLCAPS = "\Uwelcome to tizag.com!";
# PRINT THE NEWLY FORMATTED STRINGS
print $mystring."<br />";
print $newline."<br />";
print $capital."<br />";
print $ALLCAPS."<br />";

Strings in ASP:-To create an ASP String you first declare a variable that you wish to
store the string into. Next, set that variable equal to some characters that are encapsulated
within quotations, this collection of characters is called a String.

ASP Concatenating Strings:
Throughout your ASP programming career you will undoubtedly want to combine multiple
strings into one. For example: you have someone's first and last name stored in separate variables
and want to print them out as one string. In ASP you concatenate strings with the use of the
ampersand (&) placed between the strings which you want to connect.

ASP Code:
<%
Dim fname, lname, name
fname = "Teddy"
lname = " Lee"
name = fname & lname
Response.Write("Hello my name is " & name)
%>

Display:
Hello my name is Teddy Lee

VBScript Strings: Strings are a bunch of alphanumeric characters grouped together into a "string" of
characters.
VBScript String Syntax:
"Hello I am a string!"

VBScript String: Saving into Variables:


Just like numeric values, strings in VBScript are saved to variables using the equal operator.
This example shows how to save the string "Hello there!" into the variable myString and then use
that variable with the document.write function.

VBScript Code:
<script type="text/vbscript">
Dim myString
myString = "Hello there!"
document.write(myString)
</script>

Display:
Hello there!
VBScript String Concatenation:
Often it is advantageous to combine two or more strings into one. This operation of adding a
string to another string is referred to as concatenation. The VBScript script concatenation
operator is an ampersand "&" and occurs in between the two strings to be joined.

VBScript Code:
<script type="text/vbscript">
Dim myString
myString = "Hello there!"
myString = myString & " My name"
myString = myString & " is Frederick"
myString = myString & " Nelson."
document.write(myString)
</script>

Display:
Hello there! My name is Frederick Nelson.

Java Script Strings:

The most common operations performed with strings are concatenation. Concatenation is a
process of combining two strings into one longer string. In JavaScript "+" operator is used to
concatenate two strings. This symbol is also used as a mathematical addition operator in


JavaScript. So if you are using the "+" with numerical values it will add the two values; if you
use the same operator with two strings, it will concatenate (combine into one) two strings.
The following shows a simple example:
<script language="javascript">
var string1, string2, stringConcatenated;
string1 = "Concatenating "; // first string
string2 = "strings"; // second string
stringConcatenated = string1 + string2; // Concatenating strings
document.write (stringConcatenated); // printing the Concatenated string
</script>
In the above example, we use three string variables. Remember a string is surrounded by either
single or double quotation marks. On line 5, stringConcatenated = string1 + string2;, we use the
"+" operator to concatenate two strings. Before the concatenation, we had two strings:
"Concatenating " and "strings." After the concatenation, we end up with a combined longer
string: "Concatenating strings."

64. Innovative Features of Scripting Languages


A) Names and Scope:
Most scripting languages do not require variables to be declared. A few languages, notably Perl
and JavaScript, permit optional declarations, primarily as a sort of compiler checked
documentation. Perl can be run in a mode that requires declarations.
With or without declarations, most scripting languages use dynamic typing. Values are generally
self-descriptive, so the interpreter can perform type checking at run time.
Nesting and scoping conventions vary quite a bit. Scheme, Python, JavaScript, and R provide the
classic combination of nested subroutines and static scope. Tcl allows subroutines to nest, but
uses dynamic scoping. Named subroutines do not nest in PHP or Ruby, and they only sort of nest
in Perl, but Perl and Ruby join Scheme, Python, JavaScript, and R in providing first class
anonymous local subroutines. Nested blocks are statically scoped in Perl. In Ruby they are part
of the named scope in which they appear.


A program to illustrate scope rules in Python:

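The program referred to above is missing from this copy; a minimal sketch illustrating Python's static scoping (the names are illustrative only) might be:

```python
x = "global"

def outer():
    x = "outer"            # a new local x, statically shadowing the global
    def inner():
        nonlocal x         # binds to outer's x, not the global one
        x = "set by inner"
    inner()
    return x

print(outer())             # inner rebound outer's local x
print(x)                   # the global x is untouched
```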
In Perl all variables are global unless explicitly declared. Perl has evolved over the years. At first
there were only global variables .Locals were soon added for the sake of modularity. Any
variable that is not declared is global in Perl by default. Variables declared with the local
operator are dynamically scoped. Variables declared with the my operator are statically scoped.

B) String and Pattern Manipulation:


Regular Expressions:

A regular expression (or regex for short) is a pattern that can be matched against a
string: basically a template that is applied over a string that is to be scanned.
A simple regex where we look for abc in a string:

while (<>) {
    if (/abc/) {
        print $_;
    }
}
Character Classes are very important in regex's. Character classes contain characters in
which each character must match at least one time in the given string. For example:

/[aeiouAEIOU]/
Perl has special additions or pre-defined character classes:

\d    # equivalent to [0-9]
\D    # equivalent to [^0-9]
\w    # equivalent to [a-zA-Z0-9_]
\W    # equivalent to [^a-zA-Z0-9_]
\s    # equivalent to [ \r\t\n\f] (whitespace)
\S    # equivalent to [^ \r\t\n\f]
Multipliers allow you to repeat matches in any given regex:

/abc*/

/a+bc/

/ab?c/
In Perl, you can use parentheses as memory, or to repeat the same regex within a regex:

/a(.*)b\1c/
Alternates allows you to match a variety of regex's. You use the pipe character to
separate patterns:

/abc|jkl|xyz/
Would match "abc", "jkl", or "xyz".

Anchoring allows you to make sure that the specified pattern matches up with specific
parts of the string. For example:

/\bmo/    # Matches anything starting with mo
/^mo/     # same thing
/mo\b/    # Matches anything that ends with mo
/mo$/     # same thing
\B is the opposite of \b; it matches when there is no word boundary.
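These pattern features carry over almost unchanged to Python's re module; a quick sketch, reusing the patterns from the examples above:

```python
import re

assert re.search(r"abc", "xxabcyy")                  # literal match
assert re.search(r"[aeiouAEIOU]", "rhythm") is None  # character class: no vowel
assert re.search(r"\d\d", "room 42")                 # predefined class \d
assert re.fullmatch(r"ab?c", "ac")                   # ? multiplier: optional b
assert re.search(r"abc|jkl|xyz", "--jkl--")          # alternation with |
assert re.search(r"mo$", "dynamo")                   # $ anchor: ends with mo
assert re.search(r"\bmo", "the mouse")               # \b anchor: word starts with mo
print("all patterns behaved as described")
```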

C) Data Types:
A data type refers to the type of data a variable can store. Scripting languages don't generally
require the declaration of types for variables. Most perform extensive run-time checks to make
sure that values are never used in inappropriate ways.
1. Numeric Types:
a) PHP Data Types:
PHP has different data types you can work with. These are: integer numbers, floating point
numbers
PHP is a loosely-typed language, so a variable does not need to be of a specific type and can
freely move between types as demanded by the code it is being used in.


Integers: An integer is a whole number; that is to say, a number with no fractional
component. For those who are familiar with C, a PHP integer is the same as the long data type in
C. It is a number between -2,147,483,648 and +2,147,483,647.
Integers can be written in decimal, octal, or hexadecimal. Decimal numbers are a string of digits
with no leading zeros. Any integer may begin with a plus sign ( + ) or a minus sign ( - ) to
indicate whether it is a positive or negative number. If there is no sign, then positive is assumed.

Valid decimal integers:


1
123
+7
-1007395
An octal number is a base-8 number. Each digit can have a value between zero (0) and seven (7).
Octal numbers are commonly used in computing because a three digit binary number can be
represented as a one digit octal number. Octal numbers are preceded by a leading zero (e.g.,
0123).
Valid octal integers:
01
0123
+07
-01007345
A hexadecimal number is a base-16 number. Each digit can have a value between zero (0) and F.
Since we only have ten numbers in our numbering system (0-9), we use the letters A through F
to make up the difference for hexadecimal values. Hexadecimal values are common in
computing because each digit represents four binary digits, which is four bits. Eight bits, or a
two digit hexadecimal number, is one byte. Hexadecimal numbers are preceded by a leading zero
and X (e.g., 0x123).
Valid hexadecimal integers:
0x1
0xff
0x1a3
+0x7
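Python's integer literals make the same distinctions, though with 0o and 0x prefixes rather than PHP's bare leading zero; a quick check of the arithmetic:

```python
# octal: each digit is worth a power of 8
assert 0o123 == 1*64 + 2*8 + 3 == 83
# hexadecimal: each digit is worth a power of 16, with a-f standing for 10-15
assert 0x1a3 == 1*256 + 10*16 + 3 == 419
assert 0xff == 255          # two hex digits = one byte, as noted above
print(0o123, 0x1a3, 0xff)
```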
Floating Point numbers: Floating-point numbers are also sometimes called real numbers. They
are numbers that have a fractional component. Unlike basic math, all fractions are represented as
decimal numbers. If you are familiar with C, PHP floating-point numbers are equivalent to the
double data type. PHP recognizes two types of floating point numbers. The first is a simple
numeric literal with a decimal point. The second is a floating-point number written in scientific
notation. Scientific notation takes the form of [number]E[exponent], for example, 1.75E-2.
Some examples of valid floating point numbers include:
3.14
0.001
-1.234

0.314E2 // 31.4
1.234E-5 // 0.00001234
-3.45E-3 // -0.00345

b) JavaScript Data Types

Numbers: Numbers are the easiest of the data types to understand. They represent numeric
values. The simplest type of number is an integer.
All of the following are valid integers.
0
-75
10000000
2

Another commonly used type of number is a floating-point number: a
number that is assumed to have a decimal point.
All of the following are valid floating-point numbers.
5.00
1.3333333333333
1.0e25
7.5e-3
c) VBScript Data Types

VBScript has only one data type called a Variant. A Variant is a special kind of data type that
can contain different kinds of information, depending on how it's used. Because Variant is the
only data type in VBScript, it's also the data type returned by all functions in VBScript.
At its simplest, a Variant can contain either numeric or string information. A Variant behaves
as a number when you use it in a numeric context and as a string when you use it in a string
context. That is, if you're working with data that looks like numbers, VBScript assumes that it is
numbers and does the thing that is most appropriate for numbers. Similarly, if you're working
with data that can only be string data, VBScript treats it as string data. Of course, you can always
make numbers behave as strings by enclosing them in quotation marks (" ").
The following table shows the subtypes of data that a Variant can contain.

Subtype       Description
Empty         Variant is uninitialized. Value is 0 for numeric variables or a zero-length string ("") for string variables.
Null          Variant intentionally contains no valid data.
Boolean       Contains either True or False.
Byte          Contains integer in the range 0 to 255.
Integer       Contains integer in the range -32,768 to 32,767.
Currency      Contains currency in the range -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
Long          Contains integer in the range -2,147,483,648 to 2,147,483,647.
Single        Contains a single-precision, floating-point number in the range -3.402823E38 to -1.401298E-45 for negative values; 1.401298E-45 to 3.402823E38 for positive values.
Double        Contains a double-precision, floating-point number in the range -1.79769313486232E308 to -4.94065645841247E-324 for negative values; 4.94065645841247E-324 to 1.79769313486232E308 for positive values.
Date (Time)   Contains a number that represents a date between January 1, 100 and December 31, 9999.
String        Contains a variable-length string that can be up to approximately 2 billion characters in length.

2. Composite Types:
All composite data types can be treated as objects, but we normally categorize them by their
purpose as a data type. For composite data types we will look at objects, including some special
pre-defined objects that JavaScript provides, as well as functions and arrays.

Objects-An object is a collection of named values, called the properties of that object.
Functions associated with an object are referred to as the methods of that object.

Functions

A function is a piece of code, predefined or written by the person creating the JavaScript, that is
executed based on a call to it by name.

Arrays -An Array is an ordered collection of data values.


In some programming languages, arrays have very specific limitations. In JavaScript, an
array is just an object that has an index to refer to its contents.

The array index is included in square brackets immediately after the array name. In JavaScript,
the array index starts with zero, so the first element in an array would be arrayName[0], and the
third would be arrayName[2]. JavaScript does not have multi-dimensional arrays.
Perl, the oldest of the widely used scripting languages, inherits its principal composite types - the
array and the hash - from awk. It also uses prefix characters on variable names as an indication of
type: $foo is a scalar (a number, Boolean, string, or pointer); @foo is an array; %foo is a hash; &foo
is a subroutine; and plain foo is a filehandle or an I/O format.

Ordinary arrays in Perl are indexed using square brackets and integers starting with 0:
@colors = ("red", "green", "blue");
print $colors[1];
Hashes are indexed using curly braces and character string names:
%complements = (red => "cyan", green => "magenta", blue => "yellow");
print $complements{blue};
Python and Ruby provide both conventional arrays and hashes. They use square brackets for
indexing in both cases.
colors = ["red", "green", "blue"]
complements = {"red" => "cyan", "green" => "magenta", "blue" => "yellow"}
print colors[2], complements["blue"]
(This is Ruby syntax; Python uses : in place of =>.)
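The Python spelling of the same example would be:

```python
colors = ["red", "green", "blue"]                  # a conventional array (list)
complements = {"red": "cyan",                      # a hash (dict), : not =>
               "green": "magenta",
               "blue": "yellow"}
print(colors[2], complements["blue"])              # square brackets index both
```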

D) Object Orientation features of scripting languages

JavaScript and Object Oriented Programming (OOP)

JavaScript takes the unusual approach of providing objects - with inheritance and dynamic
method dispatch - without providing classes.
JavaScript is an excellent language to write object oriented web applications. It can support OOP
because it supports inheritance through prototyping as well as properties and methods.
JavaScript gives the ability to make our own objects for our own applications. With our objects
we can code in events that fire when we want them to, and the code is encapsulated. It can be
initialized any amount of times.
Creating objects using new Object()
There are several ways to create objects in JavaScript, and all of them have their place. The
simplest way is to use the new operator, specifically, new Object():
<script language="javascript" type="text/javascript">
<!--
person = new Object()
person.name = "Tim Scarfe"
person.height = "6Ft"
person.run = function() {
this.state = "running"
this.speed = "4ms^-1"
}

//-->
</script>
We define a custom object "person," then add to it its own properties and method afterwards. In
this case, the custom method merely initializes two more properties.

PHP

PHP 4 provided a variety of object oriented features, which were heavily revised in PHP 5. The
newer version of the language provides a reference model of variables, interfaces and mix in
inheritance, abstract methods and classes, final methods and classes, static and constant
members, and access control specifiers.
To define a class in PHP, we write the following code:
class myClass
{
    var $attribute1;
    var $attribute2;
    function method1()
    {
        // Code here
        return $something;
    }
    function method2()
    {
        // Code here
    }
}
To instantiate our class (ie. create an instance), we need to use the new keyword
$myObject = new myClass();
This creates an object. An object is, in effect, a new variable with a user-defined data-type
To access the various attributes and methods (members) of our object, we can use the namespace
separator -> (sometimes called the arrow operator, although it's not really an operator). When

you are accessing a variable in PHP, the part of code after the $ is known as the namespace.
When you are accessing attributes and methods that are members of an object, you need to
extend that namespace to include the name of your object. To do this, you reference attributes
and methods like so:
// Attributes
$myObject->attribute1 = 'Sonic';
$myObject->attribute2 = 'Knuckles';
// Methods
$returned = $myObject->method1();
$myObject->method2();

Python and ruby

In both Python and Ruby each class has a single distinguished constructor, which cannot be
overloaded. In Python it is __init__; in Ruby it is initialize. To create a new object in Python
one says my_object = My_class(args); in Ruby one says my_object = My_class.new(args). In
each case the args are passed to the constructor. To achieve the effect of overloading, with
different numbers or types of arguments, one must arrange for the single constructor to inspect
its arguments explicitly.
New fields can be added to a Python object simply by assigning to them:
my_object.new_field = value.
In Ruby only methods are visible outside a class, and all methods must be explicitly declared. It
is possible, however, to modify an existing class declaration, adding or overriding methods.
Python and Ruby differ in many other ways. The initial parameter to methods is explicit in
Python; by convention it is usually named self. In Ruby self is a keyword, and the parameter it
represents is invisible. Any variable beginning with a single @ sign in Ruby is a field of the
current object. Within a Python method, uses of object members must name the object explicitly.
Ruby methods may be public, protected, or private. Access control in Python is purely a matter
of convention; both methods and fields are universally accessible. Python has multiple
inheritance. Ruby has mix-in inheritance: a class cannot obtain data from more than one
ancestor. Ruby allows an interface to define not only the signature of methods, but also their
implementation.
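A brief Python sketch of the points above (the class and its members are illustrative only):

```python
class Account:
    def __init__(self, owner):        # the single constructor; cannot be overloaded
        self.owner = owner            # 'self' is the explicit first parameter
        self._balance = 0             # leading underscore is only a convention

    def deposit(self, amount):
        self._balance += amount       # members are always named via the object
        return self._balance

a = Account("Ada")
a.note = "opened today"               # new fields can be added by assignment
print(a.deposit(10), a.note)
```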

65. Concurrency
A program is said to be concurrent if it may have more than one active execution context-more
than one thread of control. Concurrency has at least three important motivations:
1. To capture the logical structure of a problem- Many programs, particularly servers and
graphical applications, must keep track of more than one largely independent task at the
same time. Often the simplest and most logical way to structure such a program is to
represent each task with a separate thread of control.

2. To exploit extra processors, for speed- Long a staple of high-end servers and
supercomputers, multiple processors have recently become ubiquitous in desktop and
laptop machines. To use them effectively, programs must generally be written with
concurrency in mind.
3. To cope with separate physical devices- Applications that run across the internet or a
more local group of machines are inherently concurrent.

A concurrent system is parallel if more than one task can be physically active at once;
this requires more than one processor.

Communication and Synchronization: In any concurrent programming model, two of the
most crucial issues to be addressed are communication and synchronization. Communication
refers to any mechanism that allows one thread to obtain information produced by another.
Communication mechanisms for imperative programs are generally based on either shared
memory or message passing. In a shared memory programming model some or all of a
program's variables are accessible to multiple threads. For a pair of threads to communicate, one
of them writes a value to a variable and the other simply reads it.
Synchronization refers to any mechanism that allows the programmer to control the relative
order in which operations occur in different threads. Synchronization is generally implicit in
message passing models, a message must be sent before it can be received. If a thread attempts to
receive a message that has not yet been sent, it will wait for the sender to catch up.
Synchronization is generally not implicit in shared memory models: unless we do something
special, a receiving thread could read the old value of a variable before it has been written by the
sender.
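A small Python sketch of message passing with its implicit synchronization: queue.Queue blocks the receiver until a message has actually been sent, so no explicit synchronization code is needed.

```python
import queue
import threading

channel = queue.Queue()              # the communication mechanism

def producer():
    channel.put("result: 42")        # send

received = []

def consumer():
    received.append(channel.get())   # receive: blocks until the send happens

c = threading.Thread(target=consumer)
c.start()
threading.Thread(target=producer).start()
c.join()
print(received)
```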
Thread Creation Syntax: Almost every concurrent system allows threads to be created
dynamically. Syntactic and semantic details vary considerably from one language or library to
another, but most conform to one of six principal options: co-begin, parallel loops, launch-at-elaboration, fork, implicit receipt, and early reply. The first two options delimit threads with
special control flow constructs. The others use syntax resembling subroutines.
Co-begin: The usual semantics of a compound statement call for sequential execution of the
constituent statements. A co-begin construct calls instead for concurrent execution:
co-begin
stmt_1
stmt_2
.
stmt_n
end

Each statement can itself be a sequential or parallel compound, or a subroutine call.


Co-begin was the principal means of creating threads in Algol-68. It appears in a variety of other
systems as well, including OpenMP:
#pragma omp sections
{
    #pragma omp section
    {
        printf("thread 1 here\n");
    }
    #pragma omp section
    {
        printf("thread 2 here\n");
    }
}

Parallel Loops: Many concurrent systems, including OpenMP, several dialects of FORTRAN,
and the recently announced parallel FX library for .NET, provide a loop whose iterations are to
be executed concurrently. A parallel loop in OpenMP:
#pragma omp parallel for
for (int i = 0; i < 3; i++) {
    printf("thread %d here\n", i);
}

Parallel loop in C#:

Parallel.For(0, 3, i => {
    Console.WriteLine("Thread " + i + " here");
});
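A comparable parallel loop can be sketched in Python with the standard concurrent.futures module; map runs the loop body for each index, potentially concurrently, but delivers results in iteration order:

```python
from concurrent.futures import ThreadPoolExecutor

def body(i):
    return "thread %d here" % i

# run the loop body for i = 0, 1, 2 on a pool of worker threads
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(body, range(3)))
print(results)
```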

Launch-at-Elaboration: In several languages, Ada among them, the code for a thread may be
declared with syntax resembling that of a subroutine with no parameters. When the declaration is
elaborated, a thread is created to execute the code. In Ada we may write
Page 115

Principles of Programming Languages

procedure P is
    task T is
        ...
    end T;
begin -- P
    ...
end P;

Fork/Join: The fork operation is more general: it makes the creation of threads an explicit,
executable operation. The companion join operation, when provided, allows a thread to wait
for the completion of a previously forked thread.
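Python's threading module follows the fork/join style: constructing a Thread and calling start() plays the role of fork, and join() waits for completion. A minimal sketch:

```python
import threading

results = []

def child():
    results.append("child ran")

t = threading.Thread(target=child)   # create the thread ("fork")
t.start()
t.join()                             # wait for it to finish ("join")
results.append("parent continues after join")
print(results)
```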

66. Implementation of Threads
The threads of a concurrent program are usually implemented on top of one or more processes
provided by the operating system. At one extreme, we could use a separate OS process for every
thread; at the other extreme, we could multiplex all of a program's threads on top of a single
process. On a supercomputer with a separate processor for every concurrent activity, or in a
language in which threads are relatively heavyweight abstractions, the "one process per thread"
extreme is often acceptable. In a simple language on a uniprocessor, the "all threads on one
process" extreme may be acceptable. Many language implementations adopt an intermediate
approach, with a potentially very large number of threads running on top of some smaller
number of processes.
The problem with putting every thread on a separate process is that processes are simply too
expensive in many operating systems. Because they are implemented in the kernel, performing
any operation on them requires a system call.
There are two problems with putting all threads on top of a single process: first,it precludes
parallel execution on a multicore or multiprocessor machine; second, if the currently running
thread makes a system call that blocks, then none of the program's other threads can run, because
the single process is suspended by the OS.
In the common two-level organization of concurrency, similar code appears at both levels of the
system: the language run time system implements threads on top of one or more processes in
much the same way that the operating system implements processes on top of one or more
physical processors.


[ Two level implementation of threads]


Uniprocessor Scheduling: At any particular time, a thread is either blocked or runnable. A
runnable thread may actually be running on some process or it may be awaiting its chance to do
so. Context blocks for threads that are runnable but not currently running reside on a queue
called the ready list. Context blocks for threads that are blocked for scheduler-based
synchronization reside in data structures associated with the conditions for which they are
waiting. To yield the processor to another thread, a running thread calls the scheduler:
procedure reschedule
t : thread := dequeue(ready_list)
transfer(t)
Before calling into the scheduler, a thread that wants to run again at some point in the future
must place its own context block in some appropriate data structure. If it is blocking for the sake
of fairness-to give some other thread a chance to run- then it enqueues its context block on the
ready list:
procedure yield
enqueue (ready_list, current_thread)
reschedule


[Data structures of a simple scheduler]


To block for synchronization, a thread adds itself to a queue associated with the awaited
condition:
procedure sleep_on(ref Q : queue of thread)
enqueue(Q,current_thread)
reschedule
When a running thread performs an operation that makes a condition true, it removes one or more
threads from the associated queue and enqueues them on the ready list.
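The ready-list machinery can be sketched with Python generators standing in for threads. This is a toy uniprocessor round-robin scheduler, not how a real run-time system is written; a thread's yield statement corresponds to the 'yield' procedure above, and the scheduler re-enqueues the yielding thread itself.

```python
from collections import deque

ready_list = deque()             # context "blocks" for runnable threads

def reschedule():
    # transfer control to the next runnable thread, if any
    if ready_list:
        t = ready_list.popleft()
        try:
            next(t)              # run the thread until it yields
            ready_list.append(t) # it yielded for fairness: back on the list
        except StopIteration:
            pass                 # the thread finished
        return True
    return False

def worker(name, log):
    for i in range(2):
        log.append("%s:%d" % (name, i))
        yield                    # give up the processor voluntarily

log = []
ready_list.append(worker("A", log))
ready_list.append(worker("B", log))
while reschedule():
    pass
print(log)                       # the two workers' steps interleave
```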

67. Implementing Synchronization
Synchronization serves either to make some operation atomic or to delay that operation until
some necessary precondition holds. Atomicity is most commonly achieved with mutual
exclusion locks. Mutual exclusion ensures that only one thread is executing some critical section
of code at a given point in time. Critical sections typically transform a shared data structure from
one consistent state to another.
Condition synchronization allows a thread to wait for a precondition, often expressed as a
predicate on the value in one or more shared variables. It is tempting to think of mutual exclusion
as a form of condition synchronization, but this sort of condition would require consensus among
all extant threads, something that condition synchronization doesn't generally provide.
Busy-wait Synchronization: Busy-wait condition synchronization is easy if we can cast a
condition in the form of "location X contains value Y": a thread that needs to wait for the
condition can simply read X in a loop, waiting for Y to appear. To wait for a condition involving
more than one location, one needs atomicity to read the locations together, but given that the
implementation is again a simple loop.
Scheduler-Based Synchronization: The problem with busy-wait synchronization is that it
consumes processor cycles, cycles that are therefore unavailable for other computation. Busy-wait
synchronization makes sense only if (1) one has nothing better to do with the current
processor, or (2) the expected wait time is less than the time that would be required to switch
contexts to some other thread and then switch back again.
Semaphores: Semaphores are the oldest of the scheduler-based synchronization mechanisms. A
semaphore is basically a counter with two associated operations, P and V.
A thread that calls P atomically decrements the counter and then waits until it is non-negative. A
thread that calls V atomically increments the counter and wakes up a waiting thread, if any. It is
generally assumed that semaphores are fair, in the sense that threads complete P operations in the
same order they start them.
A semaphore whose counter is initialized to 1 and for which P and V operations always occur in
matched pairs is known as a binary semaphore. It serves as a scheduler-based mutual exclusion
lock: the P operation acquires the lock; V releases it. More generally, a semaphore whose counter
is initialized to K can be used to arbitrate access to K copies of some resource. The value of the
counter at any particular time is always K minus the difference between the number of P
operations and the number of V operations that have completed so far in the program.
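A counting semaphore of this kind can be sketched with Java monitors. This is an illustrative implementation, not one given in the text; here a P that finds the counter at zero simply waits until a V makes it positive, which is observably equivalent to the decrement-then-wait description:

```java
// Illustrative counting semaphore built from a Java monitor.
public class CountingSemaphore {
    private int counter;

    public CountingSemaphore(int k) { counter = k; }  // K copies of the resource

    public synchronized void P() throws InterruptedException {
        while (counter == 0)
            wait();          // block until some thread performs a V
        counter--;           // claim one copy of the resource
    }

    public synchronized void V() {
        counter++;           // release one copy
        notify();            // wake a waiting thread, if any
    }

    public synchronized int value() { return counter; }
}
```

Initialized with k = 1 and with P and V called in matched pairs, this behaves as a binary semaphore, that is, a scheduler-based mutual exclusion lock. (Production code would use java.util.concurrent.Semaphore instead.)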
Synchronization in Ada 95: The principal synchronization mechanism of Ada 83 is based on
message passing; Ada 95 added protected objects for synchronized access to shared data. A
protected object can have three types of methods: functions, procedures, and entries. Functions
can only read the fields of the object; procedures
and entries can read and write them. An implicit reader-writer lock on the protected object
ensures that potentially conflicting operations exclude one another in time: a procedure or entry
obtains exclusive access to the object; a function can operate concurrently with other functions,
but not with a procedure or entry.
Procedures and entries differ from one another in two important ways. First, an entry can have a
Boolean expression guard, for which the calling task will wait before beginning execution.
Second, an entry supports three special forms of call: timed calls, which abort after waiting for a
specified amount of time, conditional calls, which execute alternative code if the call cannot
proceed immediately, and asynchronous calls, which begin executing alternative code
immediately, but abort it if the call is able to proceed before the alternative completes.
Synchronization in Java: In Java every object accessible to more than one thread has an
implicit mutual exclusion lock, acquired and released by means of synchronized statements:
synchronized (my_shared_obj) {
    // critical section
}
All executions of synchronized statements that refer to the same shared object exclude one
another in time. Synchronized statements that refer to different objects may proceed
concurrently. A method of a class may be prefixed with the synchronized keyword, in which
case the body of the method is considered to have been surrounded by an implicit synchronized
(this) statement. Invocations of nonsynchronized methods of a shared object- and direct accesses
to public fields- can proceed concurrently with each other, or with a synchronized statement or
method.
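The equivalence between a synchronized method and an explicit synchronized (this) statement can be sketched as follows (the Counter class is made up for this example):

```java
// A shared counter protected by the object's implicit mutual exclusion lock.
public class Counter {
    private int count = 0;

    // synchronized method: the body behaves as if wrapped in synchronized (this)
    public synchronized void increment() {
        count++;
    }

    // the same effect written with an explicit synchronized statement
    public void incrementExplicit() {
        synchronized (this) {
            count++;    // critical section
        }
    }

    public synchronized int get() { return count; }
}
```

Two threads calling increment and incrementExplicit on the same Counter exclude one another in time, because both forms acquire the same implicit lock.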

68.Virtual Machines
A VM is a software implementation of a machine that executes programs as a physical machine does.
A virtual machine provides a complete programming environment: its application programming
interface (API) includes everything required for correct execution of the programs that run above
it.
Every virtual machine API includes an instruction set architecture in which to express programs.
This may be the same as the instruction set of some existing physical machine, or it may be an
artificial instruction set designed to be easier to implement in software and to generate with a
compiler. Other portions of the VM API may support I/O, scheduling, or other services provided
by a library or by the operating system of a physical machine.
In practice, virtual machines tend to be characterized as either system VMs or process VMs. A
system VM faithfully emulates all the hardware facilities needed to run a standard OS, including
both privileged and unprivileged instructions, memory-mapped I/O, virtual memory, and
interrupt facilities. By contrast, a process VM provides the environment needed by a single user
level process: the unprivileged subset of the instruction set and a library level interface to I/O and
other services.
System VMs are sometimes called virtual machine monitors (VMMs), because they multiplex a
single physical machine among a collection of guest operating systems- that is, they monitor the
execution of multiple virtual machines, each of which runs a separate guest OS. The first widely
available VMM was IBM's CP/CMS. Rather than building an OS capable of supporting multiple
users, IBM used the control program (CP) VMM to create a collection of virtual machines, each of
which ran a lightweight, single-user OS.
Process VMs were originally conceived as a way to increase program portability and to quickly
bootstrap languages on new hardware
The Java Virtual Machine: The JVM is a virtual machine capable of executing Java byte code
(byte code is the platform-independent instruction set interpreted by a JVM). The overall pipeline is:

source code -> javac -> class file -> JVM
The first public release of Java occurred in 1995. At that time code in the JVM was entirely
interpreted. A JIT compiler was added in 1998 with the release of Java 2. Sun's javac compiler
and HotSpot JVM are the most widely used combination, and the most complete.
Architecture Summary- The interface provided by the JVM was designed to be an attractive
target for a Java compiler. It provides direct support for all the built-in and reference types
defined by the Java language. It includes built-in support for many of Java's language features
and standard library packages, including exceptions, threads, garbage collection, reflection,
dynamic loading, and security.
Of course, nothing requires that Java byte code be produced from Java source. Compilers
targeting the JVM exist for many other languages, including Ruby, JavaScript, Python, and
Scheme, as well as C, Ada, Cobol, and others, which are traditionally compiled. There are even
assemblers that allow programmers to write Java byte code directly. The principal requirement, for
both compilers and assemblers, is that they generate correct class files. These have a special format
understood by the JVM, and must satisfy a variety of structural and semantic constraints.
At start-up time, a JVM is typically given the name of a class file containing the static method
main. It loads this class into memory, verifies that it satisfies a variety of required constraints,
allocates any static fields, links it to any preloaded library routines, and invokes any initialization
code provided by the programmer for classes or static fields. Finally, it calls main in a single
thread. Additional threads may be created via calls to the methods of class Thread.
Storage Management: Storage allocation mechanisms in the JVM mirror those of the java
language. There is a global constant pool, a set of registers and a stack for each thread, a method
area to hold executable byte code, and a heap for dynamically allocated objects.
a) Global data- The method area is analogous to the code segment of a traditional
executable file. The constant pool contains both program constants and a variety of
symbol table information needed by the JVM and other tools. Like the code of methods,
the constant pool is read-only to user programs. Each entry begins with a one-byte tag that
indicates the kind of information contained in the rest of the entry. Possibilities include
the various built-in types, character-string names, and class, method, and field references.
class Hello
{
    public static void main(String[] args)
    {
        System.out.println("Hello, World!");
    }
}
b) Per-thread data- A program running on the JVM begins with a single thread.
Additional threads are created by allocating and initializing a new object of the built-in
class Thread, and then calling its start method. Each thread has a small set of base
registers, a stack of method call frames, and an optional traditional stack on which to call
native methods.
Each frame on the method call stack contains an array of local variables, an operand stack
for evaluation of the method's expressions, and a reference into the constant pool that
identifies information needed for dynamic linking of called methods. Space for formal
parameters is included among the local variables. Variables that are not live at the same
time can share a slot in the array; this means that the same slot may be used at different
times for data of different types.
c) Heap- In keeping with the type system of the java language, a datum in the local variable
array or the operand stack is always either a reference or a value of a built in scalar type.
Structured data must always lie in the heap. They are allocated dynamically, using the
new and newarray instructions, and reclaimed automatically via garbage collection.
d) Class files- Physically, a JVM class file is stored as a stream of bytes. Typically these
occupy a real file provided by the operating system, but they could just as easily be a
record in a database. On many systems, multiple class files may be combined into a Java
archive (.jar) file.
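The thread-creation pattern described under per-thread data above can be sketched as follows (a minimal illustrative program):

```java
// Sketch: a JVM program starts in a single thread; additional threads
// are objects of class Thread, started by calling the start method.
public class ThreadDemo {
    static volatile boolean workerRan = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> workerRan = true);
        worker.start();   // the JVM allocates the new thread's registers and call stack
        worker.join();    // wait for the worker to finish
        System.out.println("worker ran: " + workerRan);
    }
}
```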

Verification- Safety was one of the principal concerns in the definition of the Java language
and virtual machine. Many of the things that can go wrong while executing machine code
compiled from a more traditional language cannot go wrong when executing byte code
compiled from Java. Some aspects of safety are obtained by limiting the expressiveness of
the byte-code instruction set or by checking properties at load time. One cannot jump to a
nonexistent address, for example, because method calls specify their targets symbolically by
name, and branch targets are specified as indices within the code attribute of the current
method. Similarly, where hardware allows displacement addressing from the frame pointer to
access memory outside the current stack frame, the JVM checks at load time to make sure
that references to local variables are within the bounds declared.
Other aspects of safety are guaranteed by the JVM during execution. Field access and
method call instructions throw an exception if given a null reference. Similarly, array load
and store instructions throw an exception if the index is not within the bounds of the array.
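These execution-time checks can be observed directly; a minimal sketch:

```java
// Sketch: the JVM's run-time safety checks surface as exceptions.
public class SafetyDemo {
    public static void main(String[] args) {
        int[] a = new int[3];
        try {
            int v = a[5];              // array load with an out-of-bounds index
            System.out.println(v);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("bounds check caught index 5");
        }

        String s = null;
        try {
            s.length();                // method call through a null reference
        } catch (NullPointerException e) {
            System.out.println("null reference caught");
        }
    }
}
```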

69.Late Binding of Machine Code


In the traditional conception, compilation is a one-time activity, sharply distinguished from
program execution. The compiler produces a target program, typically in machine language,
which can subsequently be executed many times for many different inputs.
Just-in-Time and Dynamic Compilation: JIT compilation translates a program from source or
intermediate form into machine language immediately before each separate run of the program.
Just-in-time compilation is a technique to retain the portability of byte code while improving
execution speed. Because a JIT system compiles programs immediately prior to execution, it can add
significant delay to program start-up time. Implementors face a difficult tradeoff: to maximize
benefits with respect to interpretation, the compiler should produce good code; to minimize start-up time, it should produce that code very quickly. In general, JIT compilers tend to focus on the
simpler forms of target code improvement. Specifically, they often limit themselves to so-called
local improvements, which operate within individual control flow constructs. Improvements at
the global level are usually too expensive to consider.
Fortunately, the cost of JIT compilation is typically lessened by the existence of an earlier source-to-byte-code compiler that does much of the heavy lifting. Scanning is unnecessary in a JIT
compiler, since byte code is not textual. Parsing is trivial, since class files have a simple, self-descriptive structure. Many of the properties that a source-to-byte-code compiler must infer at
significant expense are embedded directly in the structure of the byte code; others can be verified
with simple data flow analysis. Certain forms of machine-independent code improvement may
also be performed by the source-to-byte-code compiler.
Dynamic Compilation: In some cases, compilation must be delayed, either because the source
or byte code was not created or discovered until run time, or because we wish to perform
optimizations that depend on information gathered during execution. In these cases we say the
language implementation employs dynamic compilation. Common Lisp systems have used
dynamic compilation for many years: the language is typically compiled, but a program can
extend itself at run time. Optimization based on run-time statistics is a more recent innovation.
Most programs spend most of their time in a relatively small fraction of the code. Aggressive
code improvement on this fraction can yield disproportionately large improvements in program
performance. A dynamic compiler can use statistics gathered by run-time profiling to identify
hot paths through the code, which it then optimizes in the background. By rearranging the code
to make hot paths contiguous in memory, it may also improve the performance of the instruction
cache. Additional run-time statistics may suggest opportunities to unroll loops, assign frequently
used expressions to registers, and schedule instructions to minimize pipeline stalls. In some
situations, a dynamic compiler may even be able to perform optimizations that would be unsafe
if implemented statically.

70.Reflection
Reflection appears in several languages including Prolog and all the major scripting languages.
In a dynamically typed language such as Lisp, reflection is essential: it allows a library or
application function to type check its own arguments. In a statically typed language, reflection
supports a variety of programming idioms that were not traditionally feasible.
Reflection is useful in programs that manipulate other programs. Most program development
environments, for example, have mechanisms to organize and pretty-print the classes,
methods, and variables of a program. In a language with reflection, these tools have no need to
examine source code: if they load the already-compiled program into their own address space,
they can use the reflection API to query the symbol table information created by the compiler.
Interpreters, debuggers, and profilers can work in a similar fashion. In a distributed system, a
program can use reflection to create a general-purpose serialization mechanism, capable of
transforming an almost arbitrary structure into a linear stream of bytes that can be sent over a
network and reassembled at the other end.
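A tiny version of that serialization idea can be sketched with Java's reflection API. The Point class here is made up for the example; a real serializer would also handle references, arrays, and inheritance:

```java
import java.lang.reflect.Field;

// Sketch: walk an arbitrary object's fields via reflection,
// the first step of a general-purpose serializer.
public class ReflectDemo {
    static class Point {
        int x = 3;
        int y = 4;
    }

    public static void main(String[] args) throws IllegalAccessException {
        Object obj = new Point();
        for (Field f : obj.getClass().getDeclaredFields()) {
            f.setAccessible(true);   // peek past information hiding
            System.out.println(f.getName() + " = " + f.get(obj));
        }
    }
}
```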
There are dangers, of course, associated with the undisciplined use of reflection. Because it
allows an application to peek inside the implementation of a class, reflection violates the normal
rules of abstraction and information hiding. It may be disabled by some security policies.
The most common pitfall of reflection, at least for object-oriented languages, is the temptation to
write case (switch) statements driven by type information:
procedure rotate(s : shape)
    case typeof(s) of
        square:   rotate_square(s)
        triangle: rotate_triangle(s)
        circle:   rotate_circle(s)
        ...
While this kind of code is common in Lisp, in an object-oriented language it is much better
written with subtype polymorphism:
s.rotate()    // virtual method call
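Sketched in Java (the shape classes are hypothetical), each subclass overrides rotate, and the run-time system dispatches to the right implementation without any case statement on type information:

```java
// Subtype polymorphism: dispatch replaces the type-driven case statement.
abstract class Shape {
    abstract String rotate();
}

class Square extends Shape {
    String rotate() { return "rotated square"; }
}

class Triangle extends Shape {
    String rotate() { return "rotated triangle"; }
}

public class RotateDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Square(), new Triangle() };
        for (Shape s : shapes)
            System.out.println(s.rotate());   // virtual method call
    }
}
```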

Java 5 Reflection: Java's root class, Object, supports a getClass method that returns an instance
of java.lang.Class. Objects of this class in turn support a large number of reflection operations.
A call to getName returns the fully qualified name of the class, as it is embedded in the package
hierarchy. For array types, naming conventions are taken from the JVM:
int[] A = new int[10];
System.out.println(A.getClass().getName());   // prints [I
String[] C = new String[10];
System.out.println(C.getClass().getName());   // prints [Ljava.lang.String;
Foo[][] D = new Foo[10][10];
System.out.println(D.getClass().getName());   // prints [[LFoo;
Here Foo is assumed to be a user-defined class in the default package. A left square bracket
indicates an array type; it is followed by the array's element type. The built-in types are
represented in this context by single-letter names. User-defined types are indicated by an L,
followed by the fully qualified class name and terminated by a semicolon.
Other Languages: C#'s reflection API is similar to that of Java: System.Type is analogous to
java.lang.Class; System.Reflection is analogous to java.lang.reflect. The pseudofunction typeof
plays the role of Java's pseudofield .class. The use of reflection instead of erasure for generics
means that we can retrieve precise information on the type parameters used to instantiate a given
object.
All of the major scripting languages provide extensive reflection mechanisms. The precise set of
capabilities varies some from language to language, and the syntax varies quite a bit, but all
allow a program to explore its own structure and types. From the programmer's point of view,
the principal difference between reflection in Java and C# on the one hand, and in scripting
languages on the other, is that the scripting languages, like Lisp, are dynamically typed.

71.Features of FORTRAN
FORTRAN was introduced in 1957 at IBM by a team led by John Backus.
"The IBM Mathematical Formula Translation System or briefly, FORTRAN, will comprise a
large set of programs to enable the IBM 704 to accept a concise formulation of a problem in
terms of a mathematical notation and to produce automatically a high-speed 704 program for the
solution of the problem."
FORTRAN did not eliminate programming; it was a major step towards the elimination of
assembly language coding.

The major achievements of FORTRAN are:
. efficient compilation;
. separate compilation (programs can be presented to the compiler as separate subroutines, but
the compiler does not check for consistency between components);
The principal limitations of FORTRAN are:
Flat, uniform structure- There is no concept of nesting in FORTRAN. A program consists of a
sequence of subroutines and a main program. Variables are either global or local to subroutines.
In other words, FORTRAN programs are rather similar to assembly language programs: the main
difference is that a typical line of FORTRAN describes evaluating an expression and storing its
value in memory, whereas a typical line of assembly language specifies a machine instruction.
Limited control structures- The control structures of FORTRAN are IF, DO, and GOTO. Since
there are no compound statements, labels provide the only indication that a sequence of
statements forms a group.
Unsafe memory allocation- FORTRAN borrows the concept of COMMON storage from
assembly language program. This enables different parts of a program to share regions of
memory, but the compiler does not check for consistent usage of these regions. One program
component might use a region of memory to store an array of integers, and another might assume
that the same region contains reals. To conserve precious memory, FORTRAN also provides the
EQUIVALENCE statement, which allows variables with different names and types to share a
region of memory.
No recursion- FORTRAN allocates all data, including the parameters and local variables of
subroutines, statically. Recursion is forbidden because only one instance of a subroutine can be
active at one time.
The parameter statement- Some constants appear many times in a program. It is then often
desirable to define them only once, at the beginning of the program. This also makes programs
more readable. For example, the circle area program should rather have been written like this:
      program circle
      real r, area, pi
      parameter (pi = 3.14159)
      write (*,*) 'Give radius r:'
      read (*,*) r
      area = pi*r*r
      write (*,*) 'Area = ', area
      stop
      end
The syntax of the parameter statement is

parameter (name = constant, ... , name = constant)


The rules for the parameter statement are:

The "variable" defined in the parameter statement is not a variable but rather a constant
whose value can never change
A "variable" can appear in at most one parameter statement
The parameter statement(s) must come before the first executable statement

Some good reasons to use the parameter statement are:

it helps reduce the number of typos
it is easy to change a constant that appears many times in a program
72.Features of Algol 60
During the late fifties, most of the development of PLs was coming from industry. IBM
dominated, with COBOL, FORTRAN, and FLPL (FORTRAN List Processing Language), all
designed for the IBM 704. Algol was designed by an international committee, partly to provide a
PL that was independent of any particular company and its computers.
The goal was a universal programming language. In one sense, Algol was a failure: few
complete, high-quality compilers were written and the language was not widely used.
An Algol Block:
begin
    integer x;
    begin
        function f(x) begin ... end;
        integer x;
        real y;
        x := 2;
        y := 3.14159;
    end;
    x := 1;
end
The major innovations of Algol are discussed below:
Block Structure- Algol programs are recursively structured. A program is a block. A block
consists of declarations and statements. There are various kinds of statement; in particular, one
kind of statement is a block. A variable or function name declared in a block can be accessed
only within the block: thus Algol introduced nested scopes. The recursive structure of programs
means that large programs can be constructed from small programs.

The run-time entity corresponding to a block is called an activation record (AR). The AR is
created on entry to the block and destroyed after the statements of the block have been executed.
The syntax of Algol ensures that blocks are fully nested; this in turn means that ARs can be
allocated on a stack. Block structure and stacked ARs have been incorporated into almost every
language since Algol.
Dynamic Arrays- The designers of Algol realized that it was relatively simple to allow the size
of an array to be determined at run-time. The compiler statically allocates space for a pointer and
an integer (collectively called a dope vector) on the stack. At run-time, when the size of the
array is known, the appropriate amount of space is allocated on the stack and the components of
the dope vector are initialized. The following code works fine in Algol 60.
procedure average (n); integer n;
begin
real array a[1:n];
....
end;
Call By Name- The default method of passing parameters in Algol was "call by name", and it
was described by a rather complicated "copy rule". The essence of the copy rule is that the
program behaves as if the text of the formal parameter in the function is replaced by the text of
the actual parameter. The complications arise because it may be necessary to rename some of the
variables during the textual substitution. The usual implementation strategy was to translate the
actual parameter into a procedure with no arguments.
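That thunk strategy can be sketched in Java, with Supplier objects standing in for the parameterless procedures. The sum example (a version of Jensen's device) is hypothetical, and a one-element array models Algol's mutable variable i:

```java
import java.util.function.Supplier;

// Call by name via "thunks": the actual parameter a[i[0]] is re-evaluated
// on every use of the formal parameter expr, as the copy rule requires.
public class CallByName {
    static int sum(int[] i, Supplier<Integer> expr, int n) {
        int total = 0;
        for (i[0] = 1; i[0] <= n; i[0]++)
            total += expr.get();   // each call re-evaluates the thunk
        return total;
    }

    public static void main(String[] args) {
        int[] i = new int[1];        // mutable cell standing in for Algol's i
        int[] a = { 0, 10, 20, 30 }; // data in slots 1..3
        // Sums a[1] + a[2] + a[3], because the thunk sees each new value of i.
        System.out.println(sum(i, () -> a[i[0]], 3));   // prints 60
    }
}
```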
Own variables- A variable in an Algol procedure can be declared own. The effect is that the variable
has local scope (it can be accessed only by the statements within the procedure) but global
extent (its lifetime is the execution of the entire program).
Algol 60 and most of its successors, like FORTRAN, have value semantics: a variable name
stands for a memory address that is determined when the block containing the variable declaration
is entered at run time.
Algol 60 was simple and powerful, but not quite powerful enough. The dominant trend after
Algol was towards languages of increasing complexity, such as PL/I and Algol 68.

73.Features of COBOL
COBOL (Sammet 1978) introduced structured data and implicit type conversion. When COBOL
was introduced, programming was more or less synonymous with numerical computation.
COBOL introduced data processing, where data meant large numbers of characters. The data
division of a COBOL program contained descriptions of the data to be processed.
Another important innovation of COBOL was a new approach to data types. The problem of type
conversion had not arisen previously because only a small number of types were provided by the
PL. COBOL introduced many new types, in the sense that data could have various degrees of
precision, and different representations as text. The choice made by the designers of COBOL
was radical: type conversion should be automatic.
The assignment statement in COBOL has several forms, including
MOVE X TO Y.
If X and Y have different types, the COBOL compiler will attempt to find a conversion from one
type to the other. In most PLs of the time, a single statement translated into a small number of
machine instructions. In COBOL, a single statement could generate a large amount of machine
code.
Automatic conversion in COBOL- The Data Division of a COBOL program might contain
these declarations:
77 SALARY PICTURE 99999, USAGE IS COMPUTATIONAL.
77 SALREP PICTURE $$$,$$9.99.
The first indicates that SALARY is to be stored in a form suitable for computation (probably, but
not necessarily, binary form) and the second provides a format for reporting salaries as amounts
in dollars. (Only one dollar symbol will be printed, immediately before the first significant digit).
The Procedure Division would probably contain a statement like
MOVE SALARY TO SALREP.
which implicitly requires the conversion from binary to character form, with appropriate
formatting.

74.Features of PL/I
During the early 60s, the dominant languages were Algol, COBOL, FORTRAN. The continuing
desire for a universal language that would be applicable to a wide variety of problem domains
led IBM to propose a new programming language (originally called NPL but changed, after
objections from the UK's National Physical Laboratory, to PL/I) that would combine the best
features of these three languages. Insiders at the time referred to the new language as
"CobAlgoltran".
The design principles of PL/I (Radin 1978) included:
. the language should contain the features necessary for all kinds of programming;
. a programmer could learn a subset of the language, suitable for a particular application, without
having to learn the entire language.
An important lesson of PL/I is that these design goals are doomed to failure. A programmer who
has learned a subset of PL/I is likely, like all programmers, to make a mistake. With luck, the
compiler will detect the error and provide a diagnostic message that is incomprehensible to the
programmer because it refers to a part of the language outside the learned subset. More probably,
the compiler will not detect the error and the program will behave in a way that is inexplicable to
the programmer, again because it is outside the learned subset.
PL/I extends the automatic type conversion facilities of COBOL to an extreme degree. For
example, the expression (Gelernter and Jagannathan 1990)
('57' || 8) + 17
is evaluated as follows:
1. Convert the integer 8 to the string '8'.
2. Concatenate the strings '57' and '8', obtaining '578'.
3. Convert the string '578' to the integer 578.
4. Add 17 to 578, obtaining 595.
5. Convert the integer 595 to the string '595'.
The compiler's policy, on encountering an assignment x = E, might be paraphrased as: "Do
everything possible to compile this statement; as far as possible, avoid issuing any diagnostic
message that would tell the programmer what is happening."
PL/I did introduce some important new features into PLs. They were not all well-designed, but
their existence encouraged others to produce better designs.
. Every variable has a storage class: static, automatic, based, or controlled. Some of
these were later incorporated into C.
. An object associated with a based variable x requires explicit allocation and is placed on the
heap rather than the stack. Since we can execute the statement allocate x as often as necessary,
based variables provide a form of template.
. PL/I provides a wide range of programmer-defined types. Types, however, could not be named.
. PL/I provided a simple, and not very safe, form of exception handling. Statements of the
following form are allowed anywhere in the program:
ON condition
BEGIN;
....
END;
If the condition (which might be OVERFLOW, PRINTER OUT OF PAPER, etc.) becomes TRUE,
control is transferred to whichever ON statement for that condition was most recently executed.
After the statements between BEGIN and END (the handler) have been executed, control returns
to the statement that raised the exception or, if the handler contains a GOTO statement, to the
target of that statement.

75.Features of Algol 68
Whereas Algol 60 is a simple and expressive language, its successor Algol 68 is much more
complex. The main design principle of Algol 68 was orthogonality: the language was to be
defined using a number of basic concepts that could be combined in arbitrary ways. Although it
is true that lack of orthogonality can be a nuisance in PLs, it does not necessarily follow that
orthogonality is always a good thing.
The important features introduced by Algol 68 include the following.
. The language was described in a formal notation that specified the complete syntax and
semantics of the language (van Wijngaarden et al. 1975). The fact that the Report was very hard
to understand may have contributed to the slow acceptance of the language.
. Operator overloading: programmers can provide new definitions for standard operators such as
+. Even the priority of these operators can be altered.
. Algol 68 has a very uniform notation for declarations and other entities. For example, Algol 68
uses the same syntax (mode name = expression) for types, constants, variables, and functions.
This implies that, for all these entities, there must be forms of expression that yield appropriate
values.
. In a collateral clause of the form (x, y, z), the expressions x, y, and z can be evaluated in any
order, or concurrently. In a function call f(x, y, z), the argument list is a collateral clause.
Collateral clauses provide a good, and early, example of the idea that a PL specification should
intentionally leave some implementation details undefined. In this example, the Algol 68 report
does not specify the order of evaluation of the expressions in a collateral clause. This gives the
implementor freedom to use any order of evaluation and hence, perhaps, to optimize.
. The operator ref stands for reference and means, roughly, "use the address rather than the
value". This single keyword introduces call by reference, pointers, dynamic data structures, and
other features to the language. It appears in C in the form of the operators * and &.
. A large vocabulary of PL terms, some of which have become part of the culture (cast, coercion,
narrowing, . . . .) and some of which have not (mode, weak context, voiding,. . . .). Like Algol
60, Algol 68 was not widely used, although it was popular for a while in various parts of Europe.
The ideas that Algol 68 introduced, however, have been widely imitated.
Algol 68 has a rule that requires, for an assignment x := E, the lifetime of the variable x must be
less than or equal to the lifetime of the object obtained by evaluating E.

76.Features of Pascal
Pascal was designed by Wirth (1996) as a reaction to the complexity of Algol 68, PL/I, and other
languages that were becoming popular in the late 60s. Wirth made extensive use of the ideas of
Dijkstra and Hoare (later published as (Dahl, Dijkstra, and Hoare 1972)), especially Hoare's
ideas of data structuring. The important contributions of Pascal included the following.
. Pascal demonstrated that a PL could be simple yet powerful.
. The type system of Pascal was based on primitives (integer, real, bool, . . . .) and mechanisms
for building structured types (array, record, file, set, . . . .). Thus data types in Pascal form a
recursive hierarchy just as blocks do in Algol 60.
. Pascal provides no implicit type conversions other than subrange to integer and integer
to real. All other type conversions are explicit (even when no action is required) and the compiler
checks type correctness.
. Pascal was designed to match Wirth's (1971) ideas of program development by stepwise
refinement. Pascal is a kind of "fill in the blanks" language in which all programs have a similar
structure, determined by the relatively strict syntax. Programmers are expected to start with a
complete but skeletal program and flesh it out in a series of refinement steps, each of which
makes certain decisions and adds new details. The monolithic structure that this idea imposes on
programs is a drawback of Pascal because it prevents independent compilation of components.
Pascal was a failure because it was too simple. Because of the perceived missing features,
supersets were developed and, inevitably, these became incompatible. The first version of
Standard Pascal was almost useless as a practical programming language and the Revised
Standard described a usable language but appeared only after most people had lost interest in
Pascal.
Like Algol 60, Pascal missed important opportunities. The record type was a useful innovation
(although very similar to the Algol 68 struct) but allowed data only. Allowing functions in a
record declaration would have paved the way to modular and even object oriented programming.
Nevertheless, Pascal had a strong influence on many later languages. Its most important
innovations were probably the combination of simplicity, data type declarations, and static type
checking.


77.Features of Modula-2
Modula-2 inherits Pascal's strengths and, to some extent, removes Pascal's weaknesses. The
important contribution of Modula-2 was, of course, the introduction of modules. (Wirth's first
design, Modula, was never completed. Modula-2 was the product of a sabbatical year in
California, where Wirth worked with the designers of Mesa, another early modular language.)
A module in Modula-2 has an interface and an implementation. The interface provides
information about the use of the module to both the programmer and the compiler. The
implementation contains the secret information about the module. This design has the
unfortunate consequence that some information that should be secret must be put into the
interface. For example, the compiler must know the size of the object in order to declare an
instance of it. This implies that the size must be deducible from the interface which implies, in
turn, that the interface must contain the representation of the object. (The same problem appears
again in C++.)
Modula-2 provides a limited escape from this dilemma: a programmer can define an opaque
type with a hidden representation. In this case, the interface contains only a pointer to the
instance and the representation can be placed in the implementation module.
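The interface/implementation split and the opaque-type idea can be sketched in Python. This is a loose analogue, not Modula-2: Python cannot enforce hiding, so a leading underscore stands in for the secret representation, and all names below (new_stack, push, pop) are hypothetical.

```python
# A loose Python sketch of Modula-2's interface/implementation split.
# The underscore marks the "secret" representation; clients use only
# the interface functions and treat the handle as opaque.

class _Stack:
    """Hidden representation: clients are not meant to name this class."""
    def __init__(self):
        self.items = []

def new_stack():
    """Interface: hands back an opaque handle, like Modula-2's pointer."""
    return _Stack()

def push(s, x):
    s.items.append(x)

def pop(s):
    return s.items.pop()

s = new_stack()
push(s, 1)
push(s, 2)
print(pop(s))  # 2
```

Because clients manipulate the handle only through the interface functions, the representation (here, a list) can change without affecting client code, which is exactly the benefit the opaque type buys in Modula-2.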

78.Features of C
C is a very pragmatic PL. Ritchie (Ritchie 1996) designed it for a particular task, systems
programming, for which it has been widely used. The enormous success of C is partly
accidental. UNIX, after Bell released it to universities, became popular, with good reason. Since
UNIX depended heavily on C, the spread of UNIX inevitably led to the spread of C.
C is based on a small number of primitive concepts. For example, arrays are defined in terms of
pointers and pointer arithmetic. This is both the strength and weakness of C. The number of
concepts is small, but C does not provide real support for arrays, strings, or boolean operations.
C is a low-level language by comparison with the other PLs discussed in this section. It is
designed to be easy to compile and to produce efficient object code. The compiler is assumed to
be rather unsophisticated (a reasonable assumption for a compiler running on a PDP-11 in the
late sixties) and in need of hints such as register. C is notable for its concise syntax. Some
syntactic features are inherited from Algol 68 (for example, += and other assignment operators)
and others are unique to C and C++ (for example, postfix and prefix ++ and --).

79.Features of Ada
Ada (Whitaker 1996) represents the last major effort in procedural language design. It is a large
and complex language that combines then-known programming features with little attempt at
consolidation. It was the first widely-used language to provide full support for concurrency, with
interactions checked by the compiler, but this aspect of the language proved hard to implement.
Ada provides templates for procedures, record types, generic packages, and task types. The
corresponding objects are: blocks and records (representable in the language); and packages and
tasks (not representable in the language). It is not clear why four distinct mechanisms are
required (Gelernter and Jagannathan 1990). The syntactic differences suggest that the designers
did not look for similarities between these constructs. A procedure definition looks like this:

procedure procname ( parameters ) is
    body
A record type looks like this:
type recordtype ( parameters ) is
body
The parameters of a record type are optional. If present, they have a different form than the
parameters of procedures.
A generic package looks like this:
generic ( parameters ) package packagename is
package description
The parameters can be types or values. For example, the template
generic
max: integer;
type element is private;
package Stack is
....
might be instantiated by a declaration such as
package intStack is new Stack(20, integer);
Finally, a task template looks like this (no parameters are allowed):
task type templatename is
task description
Of course, programmers hardly notice syntactic differences of this kind: they learn the correct
incantation and recite it without thinking. But it is disturbing that the language designers
apparently did not consider possible relationships between these four kinds of declaration.
Changing the syntax would be a minor improvement, but uncovering deep semantic similarities
might have a significant impact on the language as a whole, just as the identity declaration of
Algol 68 suggested new and interesting possibilities.

80.Features of SML
SML (Milner, Tofte, and Harper 1990; Milner and Tofte 1991) was designed as a
metalanguage (ML) for reasoning about programs as part of the Edinburgh Logic for
Computable Functions (LCF) project. The language survived after the rest of the project was
abandoned and became Standard ML, or SML. In the following example, the programmer
defines the factorial function and SML responds with its type. SML assigns the result to the
variable it, which can be used in the next interaction if desired. SML is run interactively, and
prompts with "-".
- fun fac n = if n = 0 then 1 else n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int


SML also allows function declaration by cases, as in the following alternative declaration of the
factorial function:
- fun fac 0 = 1
= | fac n = n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int
Since SML recognizes that the first line of this declaration is incomplete, it changes the prompt
to = on the second line. The vertical bar | indicates that we are declaring another case of the
declaration.
Each case of a declaration by cases includes a pattern. In the declaration of fac, there are
two patterns. The first, 0, is a constant pattern, and matches only itself. The second, n, is a
variable pattern, and matches any value of the appropriate type.
EX: Finding factors
- fun hasfactor f n = n mod f = 0;
val hasfactor = fn : int -> int -> bool
- hasfactor 3 9;
val it = true : bool
Note that the definition fun sq x = x * x; would fail because SML cannot decide whether the
type of x is int or real.
- fun sq x:real = x * x;
val sq = fn : real -> real
- sq 17.0;
val it = 289.0 : real

81.Features of Haskell
Haskell is a functional programming language. Although it has many of the features that we
associate with other languages, such as variables, expressions, and data structures, its foundation
is the mathematical concept of a function.
Haskell computes with integers and floating-point numbers. Integers have unlimited size and
conversions are performed as required.
Prelude> 1234567890987654321 / 1234567
1.0e+012 :: Double
Prelude> 1234567890987654321 `div` 1234567
1000000721700 :: Integer
Strings are built into Haskell. A string literal is enclosed in double quotes, as in C++.
The basic data structure of almost all functional programming languages, including Haskell, is
the list. In Haskell, lists are enclosed in square brackets with items separated by commas. The
concatenation operator for lists is ++. This is the same as the concatenation operator for strings,
since Haskell considers a string to be simply a list of characters.
Prelude> [1,2,3] ++ [4,5,6]
[1,2,3,4,5,6] :: [Integer]
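Python offers a loose analogue of these two points: + concatenates lists, and a string can be exploded into a list of characters, which is exactly how Haskell defines its String type.

```python
# Python analogue: + concatenates lists, and a string can be
# viewed as a list of characters, which is how Haskell defines String.

print([1, 2, 3] + [4, 5, 6])   # [1, 2, 3, 4, 5, 6]

chars = list("abc") + list("def")
print(chars)                   # ['a', 'b', 'c', 'd', 'e', 'f']
print("".join(chars))          # abcdef
```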


The Object Oriented Paradigm


Object Oriented Programming (OOP) is the currently dominant programming paradigm. For
many people today, programming means object oriented programming.

82.Features of Smalltalk
The first version of Smalltalk was implemented by Dan Ingalls in 1972, using BASIC (!) as the
implementation language (Kay 1996, page 533). Smalltalk was inspired by Simula and LISP; it
was based on six principles (Kay 1996, page 534).
1. Everything is an object.
2. Objects communicate by sending and receiving messages (in terms of objects).
3. Objects have their own memory (in terms of objects).
4. Every object is an instance of a class (which must be an object).
5. The class holds the shared behaviour for its instances (in the form of objects in a program
list).
6. To [evaluate] a program list, control is passed to the first object and the remainder is treated
as its message.
Smalltalk was also strongly influenced by Simula. However, it differs from Simula in several
ways:
. Simula distinguishes primitive types, such as integer and real, from class types. In Smalltalk,
everything is an object.
. In particular, classes are objects in Smalltalk. To create a new instance of a class, you send a
message to it. Since a class object must belong to a class, Smalltalk requires metaclasses.
. Smalltalk effectively eliminates passive data. Since objects are active in the sense that they
have methods, and everything is an object, there are no data primitives.
. Smalltalk is a complete environment, not just a compiler. You can edit, compile, execute, and
debug Smalltalk programs without ever leaving the Smalltalk environment.
The block is an interesting innovation of Smalltalk. A block is a sequence of statements that
can be passed as a parameter to various control structures and behaves rather like an object.
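A rough Python analogue of a Smalltalk block is a parameterless function or lambda passed to a control structure. The name times_repeat below is hypothetical, modelled on Smalltalk's timesRepeat: message.

```python
# A rough Python analogue of a Smalltalk block: a parameterless
# function passed to a control structure, which decides when
# (and how often) to run it.

def times_repeat(n, block):
    for _ in range(n):
        block()

greetings = []
times_repeat(3, lambda: greetings.append("hi"))
print(greetings)   # ['hi', 'hi', 'hi']
```

As in Smalltalk, the block closes over the surrounding variables (here, greetings), which is what lets it behave "rather like an object".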
The first practical version of Smalltalk was developed in 1976 at Xerox Palo Alto Research
Center
(PARC). The important features of Smalltalk are:
. everything is an object;
. an object has private data and public functions;
. objects collaborate by exchanging messages;
. every object is a member of a class;
. there is an inheritance hierarchy (actually a tree) with the class Object as its root;
. all classes inherit directly or indirectly from the class Object;
. blocks;
. coroutines;
. garbage collection.


83.Features of C++
C++ was developed at Bell Labs by Bjarne Stroustrup (1994). The task assigned to Stroustrup
was to develop a new systems PL that would replace C. C++ is the result of two major design
decisions: first, Stroustrup decided that the new language would be adopted only if it was
compatible with C; second, Stroustrup's experience of completing a Ph.D. at Cambridge
University using Simula convinced him that object orientation was the correct approach but that
efficiency was essential for acceptance.
. It is almost a superset of C (that is, there are only a few C constructs that are not accepted by
a C++ compiler);
. is a hybrid language (i.e., a language that supports both imperative and OO programming),
not a pure OO language;
. emphasizes the stack rather than the heap, although both stack and heap allocation are provided;
. provides multiple inheritance, genericity (in the form of templates), and exception handling;
. does not provide garbage collection.

84.Features of Java
Java (Arnold and Gosling 1998) is an OOPL introduced by Sun Microsystems. Its syntax bears
some relationship to that of C++, but Java is simpler in many ways than C++. Key features of
Java include the following.
. Java is compiled to byte codes that are interpreted. Since any computer that has a Java byte
code interpreter can execute Java programs, Java is highly portable.
. The portability of Java is exploited in network programming: Java byte codes can be
transmitted across a network and executed by any processor with an interpreter.
. Java offers security. The byte codes are checked by the interpreter and have limited
functionality. Consequently, Java byte codes do not have the potential to penetrate system
security in the way that a binary executable (or even a MS-Word macro) can.
. Java has a class hierarchy with class Object at the root and provides single inheritance of
classes.
. In addition to classes, Java provides interfaces with multiple inheritance.
. Java has an exception handling mechanism.
. Java provides concurrency in the form of threads.
. Primitive values, such as int, are not objects in Java. However, Java provides wrapper
classes, such as Integer, for each primitive type.
. A variable name in Java is a reference to an object.
. Java provides garbage collection.
Interfaces: A Java class, as usual in OOP, is a compile-time entity whose run-time instances are
objects. An interface declaration is similar to a class declaration, but introduces only abstract
methods and constants. Thus an interface has no instances. A class may implement an interface
by providing definitions for each of the methods declared in the interface. A Java class can

inherit from at most one parent class but it may inherit from several interfaces provided, of
course, that it implements all of them. Consequently, the class hierarchy is a rooted tree but the
interface hierarchy is a directed acyclic graph. Interfaces provide a way of describing and
factoring the behaviour of classes.
EX: Interfaces
interface Comparable
{
    boolean equalTo (Comparable other);
    // Return true if this and other objects are equal.
}

interface Ordered extends Comparable
{
    boolean lessThan (Ordered other);
    // Return true if this object precedes other object in the ordering.
}
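The same interface hierarchy can be approximated in Python with abstract base classes. This is only a sketch: unlike Java, the check happens at instantiation time rather than at compile time, and the concrete class Point is a hypothetical example, not from the source.

```python
# Python sketch of an interface hierarchy using abstract base classes.
# A class cannot be instantiated until it implements every inherited
# abstract method, loosely mirroring Java's "implements" check.

from abc import ABC, abstractmethod

class Comparable(ABC):
    @abstractmethod
    def equal_to(self, other):
        """Return True if this and the other object are equal."""

class Ordered(Comparable):
    @abstractmethod
    def less_than(self, other):
        """Return True if this object precedes the other in the ordering."""

class Point(Ordered):
    # A concrete class must implement every inherited abstract method.
    def __init__(self, x):
        self.x = x
    def equal_to(self, other):
        return self.x == other.x
    def less_than(self, other):
        return self.x < other.x

print(Point(1).less_than(Point(2)))  # True
```

As with Java interfaces, Ordered factors out behaviour without committing to a representation; any number of unrelated classes could implement it.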
The portability of Java has been a significant factor in the rapid spread of its popularity,
particularly for Web programming.
Exception Handling: The principal components of exception handling in Java are as follows:
- There is a class hierarchy rooted at class Exception. Instances of a subclass of class Exception
are used to convey information about what has gone wrong. We will call such instances
exceptions.
-The throw statement is used to signal the fact that an exception has occurred. The argument
of throw is an exception.
EX:Declaring a Java exception class
public class DivideByZero extends Exception
{
DivideByZero()
{
super("Division by zero");
}
}
Concurrency: Java provides concurrency in the form of threads. Threads are defined by a class
in the standard library. Java threads have been criticized on (at least) two grounds.
Throwing exceptions:
public void bigCalculation ()
throws DivideByZero, NegativeSquareRoot
{
....
if (. . . .) throw new DivideByZero();
....
if (. . . .) throw new NegativeSquareRoot(x);


....
}
Using the exceptions:
try
{
    bigCalculation();
}
catch (DivideByZero e)
{
    System.out.println("Oops! divided by zero.");
}
catch (NegativeSquareRoot n)
{
    System.out.println("Argument for square root was " + n);
}

THINGS TO REMEMBER:
Late binding. A binding that is performed at a later time than would normally be expected. For
example, procedural languages bind functions to names during compilation, but OOPLs bind
function names during execution.
Overloading. Using the same name to denote more than one entity (the entities are usually
functions).
Polymorphism. Using one name to denote several distinct objects. (Literally, "many shapes".)
Recursion. A process or object that is defined partly in terms of itself without circularity.
Reference semantics. Variable names denote objects. Used by all functional and logic
languages, and most OOPLs.
Scope. The region of text in which a variable is accessible. It is a static property, as opposed to
extent.
Semantics. A system that assigns meaning to programs.
Stack. A region for storing variables in which the most recently allocated unit is the first one to
be deallocated (last-in, first-out, LIFO).
Static typing. Type checking performed during compilation.
Strong typing. Detect and report all type errors, as opposed to weak typing.

Thunk. A function with no parameters used to implement the call by name mechanism of Algol
60 and a few other languages.
Value semantics. Variable names denote locations. Most imperative languages use value
semantics.
Weak typing. Detect and report some type errors, but allow some possible type errors to remain
in the program.
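The thunk entry above can be illustrated in Python: wrapping an expression in a zero-parameter function delays its evaluation until each point of use, which is essentially how Algol 60 implemented call by name. All names below are hypothetical, chosen for the sketch.

```python
# Python sketch of a thunk: a zero-parameter function that is
# re-evaluated at every use, as call by name requires.

counter = {"i": 0}

def bump():
    counter["i"] += 1
    return counter["i"]

def twice_by_name(thunk):
    # Each use of the "argument" calls the thunk afresh,
    # so side effects happen at every use.
    return thunk() + thunk()

print(twice_by_name(lambda: bump()))  # 1 + 2 == 3
```

With call by value the argument would be evaluated once (giving bump() + bump() == 2 * 1); the thunk makes the two uses see different values, which is the behaviour Jensen's device exploited in Algol 60.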
