
A SHORTISH OVERVIEW OF THE ANSI C PROGRAMMING LANGUAGE

IDENTIFIERS
Identifiers are names of variables, functions, defined types, structures and unions, enumeration
constants, statement labels and preprocessor macros.
Identifiers are composed of alphanumeric characters and _. They must start with a letter or _, and
are case sensitive. ANSI C requires at least 31 characters to be significant. By convention,
constants are in CAPITALS. External identifiers may be restricted to 6 significant characters only,
and may be case insensitive.

DECLARATIONS
All identifiers must be declared before they are used.
Scope: region of a program over which that declaration is active
functions have scope of the entire (multi-file) program. i.e. they are extern by default.
identifiers declared within a function or block have block scope, or local scope
statement labels have function scope
all other identifiers have file scope
Visibility: the declaration of an identifier is visible at some point in a program if the use of that
identifier at that point causes it to be associated with that declaration. An identifier is usually
visible throughout its scope, however it may be hidden if a second identifier of the same name
but a more restricted scope is declared.
Extent (or Lifetime): variables and functions have existence at run-time: they have storage
allocated to them. The extent, or lifetime, of an object is the period of time for which
storage is allocated. It has:
static extent if storage is allocated when the program begins execution, and storage
remains allocated until program terminates. All functions have static extent, as do
variables declared outside any function. Variables declared inside blocks (functions)
may have static extent (if they are declared static ).
local (or automatic) extent if storage is allocated on entry to a block or function, and
destroyed on exit from that block or function. If variable has an initialiser, it is reinitialised each time the block or function is entered. Formal parameters have local
extent, and variables declared at the beginning of blocks may have local extent. auto
variables have local extent.
dynamic extent if storage is explicitly allocated and de-allocated by programmer. Library
routines (e.g. malloc() and free()) are used to do this dynamic memory management.

Initial Value: variable declaration allocates storage for the variable, but does not initialise it
good practice to initialise the variable when it is declared, so it is always initialised to
something definite.
also good practice to re-assign or change a value just before the variable is used!
a static variable is initialised only once (when the program is loaded), and retains its
value even if the program is executing outside the scope of the static variable.
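
The difference between static and local extent can be sketched with two small functions. This is only an illustration: the names next_static and next_auto are hypothetical and do not come from the original notes.

```c
/* Hypothetical helper: the counter has static extent, so it is
   initialised once and keeps its value between calls. */
int next_static(void)
{
    static int n = 0;

    n++;
    return n;
}

/* Hypothetical helper: the counter has local (automatic) extent, so it
   is re-initialised to 0 every time the function is entered. */
int next_auto(void)
{
    int n = 0;

    n++;
    return n;
}
```

Successive calls to next_static() return 1, 2, 3, ..., while every call to next_auto() returns 1.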

CONSTANTS
Integer

12 (decimal) 014 (octal - starts with zero) 0x0C (hex) are the same integer.
no binary integer type - easiest to use hex numbers instead.

Character

char const e.g.: 'R' has the value of the ASCII code for R (82 0x52 0122).
char consts can be represented as octal or hex escape sequences:
'\x52' or '\122' are both equivalent to 'R'.
some non-printing ASCII constants are pre-defined.
e.g.: '\n' is the ASCII new line character.

String

string const e.g.: "a string". Automatically terminated by an ASCII null character '\0'.

Floating point

must contain decimal point and/or E. e.g.: 1.23 or 123E-4

Suffix of u or U denotes unsigned, l or L denotes long. e.g. 12L or 12.345L
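
A minimal sketch of how the different spellings of a constant compare. The function names are hypothetical, and the character test assumes an ASCII execution character set.

```c
/* Returns 1 if decimal, octal and hex spellings of twelve agree. */
int same_integer(void)
{
    return (12 == 014) && (12 == 0x0C);
}

/* Returns 1 if the hex and octal escape sequences match 'R'.
   Assumption: the execution character set is ASCII. */
int same_char(void)
{
    return ('R' == '\x52') && ('R' == '\122') && ('R' == 82);
}

/* Storage used by a string constant, including the terminating '\0'. */
unsigned long string_storage(void)
{
    return (unsigned long) sizeof("a string");
}
```

Here string_storage() counts the eight visible characters plus the terminating null character.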

NAMED CONSTANTS
It is extremely poor practice to use "magic numbers" in code. Use an editor to search for, and
then destroy, any "magic numbers" that appear in your code. Replace them with named constants
that are defined once, and collect all of the #defines in the same place (usually a header file).
#define ARRAY_LENGTH    2500
#define BLOCK_SIZE      0x100
#define TRACK_SIZE      (16*BLOCK_SIZE)
#define ERROR_MESSAGE   "** Error %d: %s. \n"

VARIABLE DECLARATIONS
All variables must be declared before they are used. Format of declaration is:
<type qualifier> <storage class> <type> <name1>, ..., <namen>;

The default is signed auto int if the variable is declared in a function. A variable declared
outside all functions (including main()) defaults to signed extern int.

VARIABLE TYPES
char

char is usually 8-bit
typically used to store one (7-bit or 8-bit) ASCII character
char may be equivalent to either unsigned char or signed char, or a mixture of
both (pseudo-unsigned), depending on implementation. Beware!

signed char

is usually an 8-bit signed integer: -128 to +127 (-2^7 to +2^7 - 1)

unsigned char

is usually an unsigned 8-bit number: 0 to 255 (0 to 2^8 - 1)

short

also called short int or signed short or signed short int
signed integer, size is architecture dependent.
on 16-bit machine is -32,768 to +32,767 (-2^15 to +2^15 - 1)

unsigned short

also called unsigned short int
unsigned integer, size is architecture dependent.
on 16-bit machine is 0 to 65,535 (0 to 2^16 - 1)

int

also called signed int or signed
default variable type is int
signed integer, size is architecture dependent.
on 16-bit machine is -32,768 to +32,767 (-2^15 to +2^15 - 1)

unsigned int

also called unsigned
unsigned integer, size is architecture dependent.
on 16-bit machine is 0 to 65,535 (0 to 2^16 - 1)

long

also called long int or signed long or signed long int
signed integer, size is architecture dependent.
on 16-bit machine is usually 32-bit -2,147,483,648 to +2,147,483,647

unsigned long

also called unsigned long int
unsigned integer, size is architecture dependent.
on 16-bit machine is 32-bit 0 to 4,294,967,295 (0 to 2^32 - 1)

ANSI C specification requires only that a char is at least 8 bits, a short or an int are at
least 16 bits, and a long is at least 32 bits.
The ranges of permissible values of the various integer types in a given implementation are
defined in limits.h.
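
The minimum sizes quoted above can be checked against the constants in limits.h. The following is a sketch only; the function names are hypothetical, and each returns 1 when the implementation meets the stated ANSI C minimum.

```c
#include <limits.h>

/* A char must be at least 8 bits. */
int char_ok(void)
{
    return CHAR_BIT >= 8;
}

/* short and int must be at least 16 bits. */
int short_ok(void)
{
    return SHRT_MAX >= 32767 && USHRT_MAX >= 65535U;
}

int int_ok(void)
{
    return INT_MAX >= 32767 && UINT_MAX >= 65535U;
}

/* long must be at least 32 bits. */
int long_ok(void)
{
    return LONG_MAX >= 2147483647L && ULONG_MAX >= 4294967295UL;
}
```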
float

floating point number, often 4-byte: magnitude up to about 3.4E+38 (7 decimal digits)

double

floating point number, often 8-byte: magnitude up to about 1.7E+308 (15 decimal digits)

long double

is an ANSI C type, but some compilers give it the same representation as double.
floating point number, often 80 bits: about 3.4E-4932 to 1.1E+4932 (19 decimal digits)

The form of floating point types is not specified, nor are they even required to be different!
The ranges of permissible values of floating point types in a given implementation are
defined in float.h .
void

is a type used to explicitly specify an empty set, or no value.

POINTER TYPES
For any type T, a type pointer to T may be formed. A variable of type pointer to T holds the
address in memory of an object of type T.
A pointer may point to a variable or a function. Some example declarations:
char * ptr;             /* ptr is a pointer to an object of type char */
int (* fp)(char);       /* fp is pointer to function with char argument that returns int */
double (* ap)[];        /* ap is a pointer to an array of double */

Note that some function declarations look a bit like pointer declarations...
int * tool(char);       /* tool is function with char argument returning a pointer to int */
int * sally[];          /* sally is an array of pointers to int */

A generic pointer type is defined. This generic pointer can be cast to a pointer to an object of
any type. In ANSI C, the generic pointer type is void * (read this as pointer to void).
A null pointer explicitly points to no object or function. The standard header file stddef.h
contains the definition of the null pointer, NULL. Since the definition of NULL is implementation
dependent, one should test for equality to NULL, rather than to 0 or 0L or (void *) 0.
It is good practice to ensure that all pointers have the value NULL when they are not pointing to
a valid object or function.
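
A short sketch of a pointer to function combined with a NULL test. The names twice and apply are hypothetical and are used only for illustration.

```c
#include <stddef.h>

/* Hypothetical function with a char argument that returns int. */
int twice(char c)
{
    return 2 * c;
}

/* Calls the function pointed to by fp, or returns -1 if fp is NULL. */
int apply(int (*fp)(char), char c)
{
    if (fp == NULL)
    {
        return -1;
    }
    return (*fp)(c);
}
```

For example, apply(twice, 3) calls twice through the pointer, while apply(NULL, 3) takes the error path.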


ARRAY TYPES
<type> <name>[m][n] is an m-by-n array. For example,
int harry[2][3], sally[4];      /* harry is 2x3, sally is 1x4 */
arrays may have any number of dimensions. e.g. int hyper[8][2][10][6][13];
elements are numbered from zero.
address of harry is the same as &harry[0][0], and also the same as harry
subscripting is done with pointer arithmetic. Thus, sally[i] is the same as *(sally + i)
array initialisation:
int  harry[2][3] = { {3, 2, 1}, {4, 27, -11} };
int  squares[]   = {1, 4, 9, 16, 25};
char message[]   = {"Braces may be omitted"};
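
The equivalence between subscripting and pointer arithmetic might be sketched as follows. The function names sum_2x3 and element are hypothetical.

```c
/* Sums a 2x3 array using ordinary subscripts. */
int sum_2x3(int m[2][3])
{
    int i, j;
    int total = 0;

    for (i = 0; i < 2; i++)
    {
        for (j = 0; j < 3; j++)
        {
            total += m[i][j];
        }
    }
    return total;
}

/* Returns element i of a one-dimensional array via pointer arithmetic;
   *(a + i) is the same as a[i]. */
int element(int *a, int i)
{
    return *(a + i);
}
```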

STORAGE CLASS SPECIFIERS


auto

variables are local to the block (includes function) in which they are declared.
memory is allocated when variable is used, released when function terminates
i.e. passed through the stack. Efficient use of memory, but function "forgets"
between invocations. i.e. variable has local (automatic) extent.

static

variable retains its value between invocations, as it is allocated a fixed address in
RAM. The variable has static extent.
if declared within a function, scope is that function. i.e. "private" to the function.
if static variable declared outside all functions, scope is file in which it is
declared. i.e. global inside file, but not visible outside file - a module variable.
scope of static function is file in which declared: the name of the function is not
exported to the linker. i.e. a static function is "private" to the file.

extern

extern functions and variables have external linkage: their names are
exported to the linker, so that they are visible outside the present file.
extern objects have static extent
variables declared outside all functions are extern by default.
local variables with the same name take precedence within their scope.
extern function is defined outside of the present file. Scope is global.

register

use a CPU register to store the value, if possible, to give fast access to a variable.

TYPE QUALIFIERS
const

object is nonmodifiable. Some compilers may place the object in ROM.

volatile

object can be modified by something outside the scope of the program, such as
an external hardware event. Instructs the compiler not to optimise a volatile
variable across sequence points.

DEFINED AND COMPOSITE TYPES


typedef

used to define new data types in terms of those already defined. Does not
allocate storage.
extremely useful for building portability into code!! For example,
typedef unsigned char * prompt_t;
defines a new data type, prompt_t to be a pointer to unsigned char. Then,
prompt_t user_prompt;
declares a new variable user_prompt of type prompt_t.

struct

a structure is a derived data type consisting of one or more named members of the
same or different data types. For example,
struct material_s               /* definition */
{
    float density;
    float modulus;
    float yield_strength;
};
defines a new structure. material_s is the structure tag.
struct material_s steel;        /* declaration */
declares steel to be an instance of a material_s structure.
steel.modulus = 207.0E+09;      /* use */
selects the modulus component of steel and assigns a value to it.
component selection can also be done through a pointer to a structure:
struct material_s * ptr_steel;
ptr_steel = &steel;
ptr_steel->modulus = 207.0E+09;

union

a union is a derived data type capable of containing, at different times, any one of
several different data types. It is like a structure big enough to contain any of the
members - only one member can be stored at any time. For example,
union mixed_u                   /* definition */
{
    char c;
    int  i;
    long l;
};
union mixed_u x;                /* declaration */
x.c = 27;                       /* use this */
x.l = 934273421;                /* OR this !! */
components of unions are selected in the same way as components of structures.

enum

enumerated types have values that can range only over a set of named (int)
constants called enumerators. For example,
enum state_e {false, true};     /* definition */
enum state_e sleeping;          /* declaration */
sleeping = false;               /* use */
Enumerated types take sequential integer values starting from zero. In the
example above, false==0 and true==1. Alternatively, can initialise values:
enum colour_e {red = 2, green = 4, blue = 8};
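
Because only one member of a union is valid at any time, a common pattern is to pair the union with an enum that records which member is in use. The following is a sketch of that pattern, not something from the original notes; the names kind_e, value_s and value_as_long are hypothetical.

```c
enum kind_e {KIND_INT, KIND_LONG};

struct value_s
{
    enum kind_e kind;           /* records which union member is valid */
    union
    {
        int  i;
        long l;
    } u;
};

/* Reads back whichever member was stored, widened to long. */
long value_as_long(const struct value_s *v)
{
    if (v->kind == KIND_INT)
    {
        return (long) v->u.i;
    }
    return v->u.l;
}
```

The caller sets kind whenever it writes a member, so the reader never has to guess which member holds the value.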

BITFIELDS

Structures used in machine-dependent programs that must force a data structure that
corresponds exactly to fixed hardware features.
Bit fields may be of type int, signed int or unsigned int.
"Padding" may be required to create holes.
Ordering of bit packing will depend on whether the target computer is a "big endian" or a
"little endian". Bitfields are therefore likely to be non-portable.
#define SET     1
#define CLEAR   0
typedef struct
{
    unsigned int bit0: 1;       /* low byte */
    unsigned int bit1: 1;
    unsigned int bit2: 1;
    unsigned int bit3: 1;
    unsigned int bit4: 1;
    unsigned int bit5: 1;
    unsigned int bit6: 1;
    unsigned int bit7: 1;
    unsigned int odd_byte: 8;   /* pad out to 16 bits */
} port_even_b;

port_even_b * ad_command = ((port_even_b *) 0x02);
ad_command->bit4 = SET;

OPERATORS
arithmetic      + - * /
                Integer division truncates. e.g.: 99/100 == 0; 5/3 == 1;

assignment      =
                Can use multiple assignment. e.g.: a = b = c = 0;
                Short form of assignment: e.g.: a += b; is the same as a = a + b;

increment       ++ increment
                -- decrement
                Pre-increment is ++b;   increment b, then use b
                Post-increment is b++;  use b, then increment it

bitwise         ~ NOT
                & AND
                | OR
                ^ XOR
Bitwise operators are useful for setting and testing bits. Some examples:
#define EMPTY   0x04
#define CTS     0x20

flags |= (EMPTY | CTS);         /* sets EMPTY, CTS bits */
flags &= ~(EMPTY | CTS);        /* resets them */
flags ^= (EMPTY);               /* changes state of EMPTY */
if (!(flags & (EMPTY | CTS)))   /* true if neither bit is set */

shift           i << j          Shift i left by j bits
                i >> j          Shift i right by j bits

If i is unsigned, or signed and positive, zeros are shifted in to replace bits shifted out
(logical shift). If i is signed and negative, some compilers do arithmetic shift
(preserve the sign bit), some do logical shift.
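
The bit operations can be wrapped in small functions to make the intent explicit. This is a sketch: the mask values repeat the EMPTY and CTS definitions used in these notes, and the function names are hypothetical. Unsigned operands are used throughout so that shifts and complements behave predictably.

```c
#define EMPTY   0x04
#define CTS     0x20

unsigned set_bits(unsigned flags, unsigned mask)
{
    return flags | mask;
}

unsigned clear_bits(unsigned flags, unsigned mask)
{
    return flags & ~mask;
}

unsigned toggle_bits(unsigned flags, unsigned mask)
{
    return flags ^ mask;
}

/* True only if every bit in mask is set in flags. */
int all_set(unsigned flags, unsigned mask)
{
    return (flags & mask) == mask;
}

/* True only if no bit in mask is set in flags. */
int none_set(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}

/* Builds a single-bit mask by shifting 1 left by bit positions. */
unsigned bit_mask(unsigned bit)
{
    return 1U << bit;
}
```

Note that testing (flags & mask) != 0 only shows that at least one bit is set; comparing against the mask itself is needed to test for all of them.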


relational      > greater than
                < less than
                >= greater than or equal to
                <= less than or equal to
                == equal to
                != not equal to

logical         (...) && !(...)         (...) and NOT(...)
connective      (...) || (...)          (...) or (...)
                an expression which is TRUE will evaluate to 1
                an expression which is NON-ZERO is interpreted as TRUE
                useful to do
                #define FALSE   0
                #define TRUE    1

sizeof          sizeof(object) returns the size of the argument measured in memory storage units
                (usually bytes). The type returned by sizeof() is size_t, defined in stddef.h.

MIXED-TYPE EXPRESSIONS
legal but potentially dangerous: compiler automatically performs type conversion so all
variables in expression are of same type before expression is evaluated. Can cause unexpected
results if you don't think like a compiler!!
preferred types are starred, below. In a mixed-type expression, non-preferred types are first
promoted to a preferred type, then all are promoted to the highest type.
double
float
long
unsigned int
int
char
automatic conversion across assignment: value on right side is converted to type on left side. For example:
an_int = a_char;        /* is OK */
a_char = an_int;        /* will (probably) truncate high byte of an_int */
an_int = a_float;       /* is suspect, if a_float > 32,767 */

a "cast" is used to explicitly change a variable's type. The classical example is casting a
generic pointer during dynamic memory allocation. The library function malloc() returns
void * .
struct material_s * ptr_zinc = (struct material_s *) malloc(sizeof(struct material_s));
This statement only allocates memory and initialises the pointer: the memory is not
initialised.
Another example is the use of a cast in defining the type stored at an absolute address:
#define SBUF    (* (unsigned char *)(0x07))

if you have to use lots of casts, there may be something wrong with choice of variable types.
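
Two of the conversion pitfalls described in this section can be sketched as follows. The function names are hypothetical.

```c
/* Both operands are int, so the division truncates BEFORE the result
   is converted to double for the return value. */
double ratio_truncated(int a, int b)
{
    return a / b;
}

/* Casting one operand to double forces floating point division. */
double ratio_exact(int a, int b)
{
    return (double) a / b;
}

/* Converting a floating point value to int discards the fraction. */
int truncate_to_int(double x)
{
    return (int) x;
}
```

With these definitions ratio_truncated(1, 2) yields 0.0 whereas ratio_exact(1, 2) yields 0.5, which is the kind of surprise the text warns about.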

STATEMENTS
simple statement e.g.:
x = a + b;              /* Note semicolon! */
compound statement or sequential structure is created by using braces to group simple
statements into a block. e.g.:
{
    x = a + b;          /* This is a comment. */
    b = sin(x);         /* Comments can NOT be nested */
}
the statements within the braces { ... } constitute a block.

FLOW CONTROL CONSTRUCTS

In the following constructs, <statement> can be a single statement or construct, or a block of
statements or constructs. <expression> is anything which evaluates to either TRUE or
FALSE.
These constructs allow for the direct implementation of structured code.

if else construct
if (<expression>)
<statement1>;
else
<statement2>;
the else is optional

multiple if-else constructs may be used:


if (<expression1>)
<statement1>;
else if (<expression2>)
<statement2>;
else if (<expression3>)
<statement3>;
else
<statement4>;

Conditional Expression
r = <expression> ? <expression1> : <expression2>;
is equivalent to
if (<expression>)       /* if TRUE or != 0 */
r = <expression1>;
else
r = <expression2>;

switch construct
switch (<integer expression>)
{
case <integer constant1>:
<statements>;
case <integer constant2>:
<statements>;
case <integer constant3>:
<statements>;
default:
<statements>;
}
<integer constant> should be set up in a #define preprocessor directive, or as an
enumerated type. For example,
#define RESET   3
/* or ... */
enum COMMAND_E {INIT, GO, STOP, RESET, EXIT};

each group of <statements> will usually have to end with break; The break statement
causes execution of the smallest enclosing while, do, for or switch to terminate.
the default: block is optional, but is good practice.
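
A sketch of a switch driven by an enumerated type, showing break, deliberate fall-through between adjacent case labels, and a default block. The enumerator names follow the example above; the type name command_e and the function is_halting are hypothetical.

```c
enum command_e {INIT, GO, STOP, RESET, EXIT};

/* Returns 1 for commands that stop the machine, 0 for those that do
   not, and -1 for any value that is not a known command. */
int is_halting(enum command_e cmd)
{
    int result;

    switch (cmd)
    {
        case STOP:              /* no break: falls through to EXIT */
        case RESET:
        case EXIT:
            result = 1;
            break;
        case INIT:
        case GO:
            result = 0;
            break;
        default:
            result = -1;
            break;
    }
    return result;
}
```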

while construct
while(<expression>)     /* Can have zero passes */
{
<statements>;
}

an infinite loop is often written
while (TRUE)
{
<statements>;
}

do while construct
do                      /* Minimum of one pass */
{
<statements>;
} while (<expression>);

for construct
for (<loop initiation>; <loop test>; <loop action>)
{
<statements>;
}

behaves exactly the same as


<loop initiation>;
while (<loop test>)
{
<statements>;
<loop action>;
}

an infinite loop is also commonly written


for (;;)
{
/* statements ... */
}

break, continue and goto

execution of a break statement causes the transfer of control to the first statement
following the innermost enclosing while, do or for loop, or switch statement.

execution of a continue statement causes the transfer of control to the beginning of the
innermost enclosing while, do, or for loop statement. Execution of the affected loop
statement may continue following re-evaluation of the loop continuation condition test.
a continue statement has no interaction with an enclosing switch statement.

a goto statement may be used to transfer execution to any statement within a function:
/* some statements */
goto find_the_label;
/* lots of statements ... */
find_the_label:
/* more statements ... */

Use goto with extreme care, and only when really necessary. It can lead to unreadable
"spaghetti" code. In particular, never use a goto to branch into the body of an if , a
switch, a for or a block from outside the block.

it is far better style to use break, continue and return statements in place of goto.
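
As an illustration of break and continue inside a loop, here is a minimal sketch; the function name and its sentinel convention (a zero ends the data) are assumptions made for the example.

```c
/* Sums the positive entries of a[0..n-1], stopping at the first zero.
   Negative entries are skipped with continue; a zero ends the loop
   with break. */
int sum_positive_until_zero(const int *a, int n)
{
    int i;
    int total = 0;

    for (i = 0; i < n; i++)
    {
        if (a[i] == 0)
        {
            break;              /* leave the for loop entirely */
        }
        if (a[i] < 0)
        {
            continue;           /* go on to the next value of i */
        }
        total += a[i];
    }
    return total;
}
```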

FUNCTIONS
Function is block of statements invoked by a single call.
Each function should be declared before it is used. This is not required by ANSI C, but is
strongly recommended, as it allows the compiler to perform type checking. The "prototype"
form of a declaration is
<type qualifier> <storage class> <type> <name> (<argument list>);

where <argument list> is of the form


<<type qualifier> <storage class> <type> <name1>,
<type qualifier> <storage class> <type> <name2>, ...,
<type qualifier> <storage class> <type> <nameN>>

Some examples of the prototype form of function declarations:
char func(unsigned);            /* func is a function with one unsigned argument */
                                /* that returns a char */
long * junk(double, float);     /* junk is a function with 2 arguments - a double */
                                /* and a float that returns a pointer to long */
void monk(float *, long);       /* monk is a function with 2 arguments - a pointer */
                                /* to float and a long that returns nothing */
double * punk(void);            /* punk is a function with no arguments that returns */
                                /* a pointer to double */

Note that some pointer declarations look a bit like function declarations ...
int (* fp)(void);       /* fp is pointer to function with no arguments that returns int */
void (* fp)(void);      /* fp is pointer to function with no arguments that returns nothing */

Each function must be defined (see below) before it is invoked.


Format of function invocation (call) is
<function name> (<arguments>);
Parenthesis must be present even if the function has no arguments.
Arguments are passed by value. That is, copies of the arguments are made, and the copies are
transferred to the function. See "call by reference", following.
When a function appears in an expression, its value is the value returned by the function. Note
that the order of evaluation of multiple function calls within an expression is not guaranteed!
The default return type of a function is extern signed int.
A function with the return type void must not return a value: its return statement, if any,
has no expression.

Function definition
Consists of declarations and statements that make up the function.
Format of a function definition is:
<type qualifier> <storage class> <type> <name> (<argument list>)
{
<local variable declarations>;
<statements>;
return (<expression>);  /* omit if function returns void */
}

where, for the prototype form of a definition, <argument list> is of the form
<<type qualifier> <storage class> <type> <name1>,
<type qualifier> <storage class> <type> <name2>, ...,
<type qualifier> <storage class> <type> <nameN>>
Function definitions may not be nested.
A function that can have a second copy of itself invoked before the first copy has terminated is
called reentrant. A recursive function must also be reentrant. A function that uses only
auto variables will be reentrant. Any functions invoked by an interrupt must be reentrant.
Note that not all compilers produce reentrant code!
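
A recursive function is a simple case of reentrancy: each invocation gets its own copy of the argument and uses no static data. A minimal sketch, with a hypothetical function name:

```c
/* Computes n! recursively. Only the argument (local extent) is used,
   so a second invocation cannot disturb the first. The result
   overflows an unsigned long for large n. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
    {
        return 1UL;
    }
    return n * factorial(n - 1);
}
```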

POINTERS [Simple Usage. See also miscellaneous examples given previously]


A pointer is a variable which stores the address of another variable. If the pointer variable
ptr contains the address of another variable var, then ptr is said to point to var.

Pointer declaration
Pointers are declared as
<type> * <name>;        /* the pointer type is <type> * */
for example:
unsigned char * ptr;    /* ptr points to (is the address of) an */
                        /* unsigned char. *ptr is the value of */
                        /* the unsigned char pointed to by ptr */

Pointer initialisation
ptr = &var;             /* ptr assigned the address of var */
ptr = "string const";   /* compiler allocates storage for string */
                        /* ptr assigned address of first character */

Pointer indirection
unsigned char * ptr;    /* declare ptr a pointer to unsigned char . */
                        /* *ptr is the unsigned char variable, */
                        /* which is not yet initialised. */

unsigned char c, d;
unsigned char * ptr;
c = 55;
ptr = &c;               /* ptr is the address of c */
d = *ptr;               /* d = value of variable pointed to by ptr */
++*ptr;                 /* Same as ++c */

Common usage of pointers is when a function must change values of arguments ("call by
reference"). For example:
void swap(int *, int *);        /* prototype function declaration */
int first = 23, second = 44;    /* declare and initialise variables */
swap (&first, &second);         /* invocation - will swap values */

void swap(int* a, int* b)       /* prototype style function definition */
{
    int temp;
    temp = *a;
    *a = *b;
    *b = temp;
}

Pointer (i.e. address) arithmetic


Valid operations with pointers are
add integer to pointer
subtract integer from pointer
increment/decrement pointer
compare two pointers using relational operators
subtract two pointers which point to elements of the same array
Pointer operations are automatically scaled to account for the differing sizes of variables of
different type.
The type of the result of subtracting one pointer from another is architecture dependent. It is
of type ptrdiff_t, defined in stddef.h.

Pointer arithmetic with arrays


int a[10];              /* Array of 10 ints; a is a pointer to a[0] */
So, *(a+2) is the same as a[2]
When an array is passed to a function, only the pointer to the array is actually passed. Thus,
arrays are always passed by address, and the function can alter the values of array elements.
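
The following sketch shows an array being altered through the pointer that the function receives, and the subtraction of two pointers into the same array. The function names double_all and distance are hypothetical.

```c
#include <stddef.h>

/* Doubles every element in place. The caller's array is modified,
   because only a pointer to its first element is passed. */
void double_all(int *a, size_t n)
{
    int *p;

    for (p = a; p < a + n; p++)
    {
        *p *= 2;
    }
}

/* Number of elements from one pointer to another. Both must point
   into the same array. The result is scaled by the element size. */
ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;
}
```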

DYNAMIC MEMORY MANAGEMENT


the storage allocation facility provides examples of standard library functions which allow the
program to allocate a region of memory from the "heap" (unallocated RAM), and deallocate it
when no longer required.
memory allocation functions return a generic pointer (void *). The user may then use a cast
to convert the pointer to another pointer type.
malloc(size) allocates storage for one object of size bytes; size is usually computed with
sizeof. The memory is not initialised. If the memory cannot be allocated, a NULL pointer is
returned. For example, a function to allocate (memory for) a new object of type object_t :
object_t * new_object_t(void)
{
    object_t * objectptr;

    objectptr = (object_t *) malloc(sizeof(object_t));
    if (objectptr == NULL)
    {
        printf("new_object: out of memory!\n");
    }
    return (objectptr);
}
free() deallocates memory previously allocated by malloc(), calloc() or realloc().
once memory has been deallocated, it is an error to use this deallocated memory.
realloc() can be used to increase or decrease the size of a region of previously allocated
memory. The contents are preserved up to the smaller of the old and new sizes, even if
realloc() returns a different pointer. If realloc() is called to reallocate a region to size
zero, the action is the same as free().
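
One common way to call realloc() safely is to keep the old pointer until the call is known to have succeeded, since a failed realloc() leaves the original region allocated. A sketch under that assumption follows; the function name grow_ints is hypothetical.

```c
#include <stdlib.h>

/* Resizes *buf to hold new_count ints (new_count should be > 0).
   Returns 1 on success. On failure returns 0 and leaves *buf
   pointing at the original, still valid, region. */
int grow_ints(int **buf, size_t new_count)
{
    int *tmp = (int *) realloc(*buf, new_count * sizeof(int));

    if (tmp == NULL)
    {
        return 0;
    }
    *buf = tmp;
    return 1;
}
```

Assigning the result of realloc() directly to the only copy of the pointer would lose the original region if the call failed.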

PREPROCESSOR DIRECTIVES
The preprocessor operates on the source file before it is parsed by the compiler. Preprocessor
directives tell the preprocessor what to do. Common directives include:
#define
defines an identifier to be equal to a token sequence. For example:
#define PROMPT      '>'                             /* defined constants */
#define PIO_C_A     0x0021
#define IO_PORT3    (* (unsigned char *) (0x1FFE))  /* contents of absolute address */
#define stop_m()    (outp(PIO_C_A, HALT))           /* macro definition */
#define square(x)   ((x)*(x))                       /* brackets for correct macro expansion! */

you should have NO numerals ("magic numbers") in source code: use a text editor to search
for them. Use #defined constants and collect them together in a header file.
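
The need for brackets in a macro definition can be seen by comparing a bracketed and an unbracketed version. This is a sketch; SQUARE_BAD, SQUARE and the two wrapper functions are hypothetical names.

```c
#define SQUARE_BAD(x)   x * x           /* no brackets: expands wrongly */
#define SQUARE(x)       ((x) * (x))     /* fully bracketed */

/* Expands to 1 + 2 * 1 + 2, because macro arguments are substituted
   as text, not as values. */
int bad_result(void)
{
    return SQUARE_BAD(1 + 2);
}

/* Expands to ((1 + 2) * (1 + 2)). */
int good_result(void)
{
    return SQUARE(1 + 2);
}
```

Even the bracketed form evaluates its argument twice, so arguments with side effects such as i++ should be avoided.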

#ifdef
tests whether a macro is defined (whatever its value, even zero). Useful for debugging instrumentation:
#ifdef DEBUG
/* debugging statements ... */
#endif

#include
causes the entire contents of a file to be included in place of the include statement. For
example:
# include <filename>    /* system file */
# include "filename"    /* user's file */

to ensure that a header file is included only once, do
#if !defined(HEADER_NAME)
#define HEADER_NAME
...                     /* body of header file */
#endif                  /* !defined(HEADER_NAME) */

That's all for a shortish introduction. Two excellent general references on the C language are
Kernighan, B.W. and Ritchie, D.M. The C Programming Language. 2ed, Prentice Hall,
Englewood Cliffs NJ, c. 1990. Make sure to get the 2nd (ANSI) edition! Dennis Ritchie
wrote the C language.
Harbison, S.P. and Steele, G.L. C: A Reference Manual. Prentice Hall, Englewood Cliffs NJ,
1991.
A nice book on program construction is
McConnell, S.J. Code Complete. Microsoft Press, 1993.

Thanks to Chris Bujor for reviewing this document, and for his many helpful suggestions. Any
errors or omissions remaining are, of course, my responsibility.
DCR
15 July, 1996.

File: C.DOC

Created: DCR, 23/4/94

Saved: 15/07/1996 12:12 PM

Printed: 29/04/2002 2:16 PM
