
Unit IV

Fundamental File Processing Operations
Chapter Outline

1. Physical Files and Logical Files
2. Opening Files
3. Closing Files
4. Reading and Writing
5. Seeking
6. Special Characters in Files
7. The Unix Directory Structure
8. Physical Devices and Logical Files
9. File-Related Header Files
10. Unix File System Commands
2.1 Physical Files and Logical Files

© File in Unix
◆a particular collection of bytes or sequence of bytes
© Physical file
◆a file on a disk or tape

© Logical file
◆a file used inside the program

Cobol: select inp_file assign to "myfile.dat".

Turbo Pascal: assign(inp_file, 'myfile.dat');

In both statements the logical file is inp_file and the physical file is myfile.dat.


Dr K. Srinivas 3
Physical vs. Logical File
◆ Files serve as the connecting link between a program and the device used for I/O.
◆ Each file on the system has an associated file description, which describes the file's characteristics and how its data is organized into records and fields.
◆ Physical file: a collection of bytes stored on a disk or tape.
◆ Logical file: a "channel" (like a telephone line) that hides the details of the file's location and physical format from the program.
◆ When a program wants to use a particular file, "data", the operating system must find the physical file called "data" and make the hookup by assigning a logical file to it.
◆ This logical file has a logical name, which is what is used inside the program.
Dr K.Srinivas
Adv Data Structure
5
2.2 Opening Files
Two options:
(1) Open an existing file: the program is positioned at the beginning of the file, ready to start reading or writing.
(2) Create a new file: the file is ready for use after creation.
In C, we can open an existing file or create a new one through the UNIX system function open( ). This function takes two required arguments and an optional third argument:
fd = open(filename, flags [, pmode]);
Dr K.Srinivas
Big Data Analytics 7
2.2 Opening Files

The following function call opens an existing file for reading and writing, or creates a new one if necessary. If the file exists, it is opened without change:
fd = open(filename, O_RDWR | O_CREAT, 0751);
The following call creates a new file for reading and writing. If there is already a file with the name specified in filename, its contents are truncated:
fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);
Finally, here is a call that will create a new file only if there is not already a file with the name specified in filename. If a file with this name exists, it is not opened and the function returns a negative value to indicate an error:
fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);
2.3 Closing Files

◆ Makes the logical file name available for another physical file (it's like hanging up the telephone after a call).
◆ Ensures that everything has been written to the file, since data is first written to a buffer and only later to the file itself.
◆ Files are usually closed automatically by the operating system when the program terminates (unless the program is abnormally interrupted).
2.4 Reading and Writing (1/4)
Read and Write Functions
Before reading or writing, we must have already opened the file.
Low-level read and write:
◆ Read(Source_file, Destination_addr, Size)
    Source_file = the location the program reads from, i.e., its logical file name
    Destination_addr = the first address of the memory block where we want to store the data
    Size = how much information is being brought in from the file (byte count)
◆ Write(Destination_file, Source_addr, Size)
    Destination_file = the logical file name of the file the data is written to
    Source_addr = the first address of the memory block that holds the data to be written
    Size = the number of bytes to be written
2.4 Files with C Streams and C++ Stream Classes (2/4)

stream: a file or some other source or consumer of data

(1) C streams or C input/output
◆ use the standard C functions in stdio.h
◆ stdio.h contains the definitions of the types and the operations on C streams
◆ stdin & stdout: standard input and output streams
◆ file = fopen(filename, type);
◆ fread, fgetc, fgets, fwrite, fputc, fputs, fscanf, fprintf
(2) C++ stream classes
◆ use the stream classes of iostream.h and fstream.h (<iostream> and <fstream> in standard C++)
◆ cin, cout: predefined stream objects for the standard input and standard output files
◆ fstream: class for access to files; it has two constructors and the methods open, read, and write
◆ >> (extraction) and << (insertion): overloaded for input and output
2.4 Programs in C++ to Display the Contents of a File (3/4)
This first simple file processing program, which we call LIST, opens a file for input and reads it character by character, sending each character to the screen after it is read from the file.

LIST includes the following steps:

1. Display a prompt for the name of the input file.
2. Read the user's response from the keyboard into a variable called filename.
3. Open the file for input.
4. While there are still characters to be read from the input file:
   a. Read a character from the file.
   b. Write the character to the terminal screen.
5. Close the input file.

2.4 Detecting End-of-file(4/4)
◆ C
The read( ) call returns the number of bytes read. If read( ) returns a value of zero, the program has reached the end of the file. So, rather than using an eof( ) function, we construct the while loop to run as long as the read( ) call finds something to read.
The fread( ) call returns the number of elements read; a count smaller than the number requested means that the end of the file was reached or an error occurred.

◆ C++
Use the stream method fail( ) to check for end-of-file: it returns true once a read has failed, for example because no data was left.
2.5 Seeking
◆ Normally we read a file sequentially, one byte after another, until we reach the end of the file.
◆ Every time a byte is read, the operating system moves the read/write pointer ahead, and we are ready to read the next byte.
◆ Sometimes we want to read or write without taking the time to go through every byte sequentially.
◆ Perhaps we know that the next piece of information we need is 10,000 bytes away, and so we want to jump there to begin reading.
◆ Or perhaps we need to jump to the end of the file so we can add new information there.
◆ To satisfy these needs we must be able to control the movement of the read/write pointer.
2.5 Seeking (1/3)

The action of moving directly to a certain position in a file is often called seeking.
◆ A seek requires at least two pieces of information, expressed here as arguments to the generic pseudocode function Seek( ):
Seek(Source_file, Offset)
Source_file: the logical file name
Offset: the number of positions to move from the start of the file
◆ (ex) Seek(data, 373)
moves directly from the origin to position 373
2.5 Seeking with C Streams (2/3)
One of the features of UNIX that has been incorporated into C streams is the ability to view a file as a potentially very large array of bytes that just happens to be kept on secondary storage.

The C stream seek function, fseek, lets a program move to any position in this array.

The fseek function has the following form:

status = fseek(file, byte_offset, origin)
◆ file: the stream (FILE *) returned by fopen.
◆ byte_offset: the number of bytes to move from some origin in the file.
◆ origin:
  ◆ 0 (SEEK_SET): seek from the beginning of the file;
  ◆ 1 (SEEK_CUR): seek from the current position;
  ◆ 2 (SEEK_END): seek from the end of the file.
◆ status: fseek returns 0 on success and a nonzero value on failure; call ftell(file) afterward to obtain the new position.
status = fseek(file, 373, 0);
2.5 Seeking with C++ Stream Classes (3/3)
◆ Almost exactly the same as in C streams
◆ Two syntactic differences
(1) An object of class fstream has two file pointers: a get pointer for input and a put pointer for output
=> seekg for the get pointer and seekp for the put pointer
(2) Seek operations are methods of the stream classes
=> file.seekg(byte_offset, origin)
   file.seekp(byte_offset, origin)
   where origin = ios::beg, ios::cur, or ios::end

file.seekg(373, ios::beg); file.seekp(373, ios::beg);

2.6 Special Characters in Files
◆ Sometimes the operating system attempts to make a "regular" user's life easier by automatically adding or deleting characters for them.
◆ These modifications, however, make life harder for programmers building sophisticated file structures.
◆ Control-Z is added at the end of all files (MS-DOS) to signal end-of-file.
◆ <Carriage-Return> + <Line-Feed> are added to the end of each line (again, MS-DOS).
◆ <Carriage-Return> is removed and replaced by a character count on each line of text (VMS).
2.7 The Unix Directory Structure
◆ Any computer system holds many files (hundreds or thousands), and these files need to be organized by some method. In Unix, this organization is called the file system.
◆ The Unix file system is a tree-structured organization of directories, with the root of the tree represented by the character "/".
◆ Each directory can contain regular files or other directories.
◆ The file name stored in a Unix directory corresponds to its physical name.
2.7 The Unix Directory Structure
◆ Any file can be uniquely identified by giving its absolute pathname, e.g., /usr6/mydir/addr.
◆ The directory you are in is called your current directory.
◆ You can refer to a file by its path relative to the current directory.
◆ "." stands for the current directory and ".." stands for the parent directory.
2.7 The UNIX Directory Structure
© UNIX file system
◆ a tree-structured organization with two kinds of files: regular files (programs and data) and directories
◆ devices such as tape or disk drives are also files (in the dev directory)
◆ "/" is used
  ◆ to indicate the root directory
  ◆ to separate directory names from the file name
◆ absolute pathname and relative pathname for file identification
  ◆ current directory: .
  ◆ parent directory: ..
2.7 The UNIX Directory Structure

Figure: a sample Unix directory tree
/ (root)
    bin: adb, cc, yacc
    usr: bin, lib (libdf.a, libc.a, libm.a)
    usr6: mydir (addr, DF)
    dev: console, kbd, TAPE
2.8 Physical Devices and Logical Files

Physical Devices as Files
◆ A file in UNIX is a sequence of bytes, so very few operations are needed to deal with it.
◆ Magnetic disks and devices like the keyboard and the console are also files (/dev/kbd, /dev/console).
◆ An open file is represented logically by an integer (the file descriptor).
2.8 The Console, the Keyboard, and Standard Error
◆ defined in stdio.h
◆ stdin (standard input): keyboard
◆ stdout (standard output): console
◆ stderr (standard error): console
◆ Read and write
read ... gets <--- stdin
write ... printf ---> stdout
stdout, stdin, stderr
◆ stdout --> console
    fwrite(&ch, 1, 1, stdout);
◆ stdin --> keyboard
    fread(&ch, 1, 1, stdin);
◆ stderr --> standard error (again, the console)
    When a program such as the compiler detects an error, it writes the error message to this file.
2.8 I/O Redirection and Pipes

◆ For switching between standard I/O (stdin and stdout) and regular file I/O
◆ I/O redirection: specify alternate files for input or output at execution time
    < file (redirect stdin to "file")
    > file (redirect stdout to "file")
    (ex) list > myfile
◆ Pipe: program1 | program2 takes any stdout output from program1 and uses it in place of any stdin input to program2
    (ex) list | sort

2.9 File-Related Header Files
© Header files (/usr/include)
◆ define special names and values
◆ C streams: stdio.h
◆ C++ streams: iostream.h and fstream.h
◆ Unix operations: fcntl.h and file.h
◆ EOF, stdin, stdout, stderr: stdio.h
◆ O_RDONLY, O_WRONLY, O_RDWR: fcntl.h on POSIX systems (file.h on some older Unix systems)
2.10 Unix File System Commands
cat filenames        --> Print the contents of the named text files.
tail filename        --> Print the last 10 lines of the text file.
cp file1 file2       --> Copy file1 to file2.
mv file1 file2       --> Move (rename) file1 to file2.
rm filenames         --> Remove (delete) the named files.
chmod mode filename  --> Change the protection mode on the named file.
ls                   --> List the contents of the directory.
mkdir name           --> Create a directory with the given name.
rmdir name           --> Remove the named directory.
A.1 File I/O in Pascal (1/2)
© file I/O is included in the language definition
© provides high-level access to reading and writing
© in C, a file is a sequence of bytes, but in Pascal, a file is a sequence of "records"
A.2 File I/O in Pascal (2/2)
© File I/O functions
◆ assign(input_file, 'myfile.dat'); // associate a logical file with a physical file
◆ reset(input_file); // open an existing file
◆ rewrite(input_file); // create a new file
◆ append(input_file); // open to add data to an existing file
◆ read(input_file, var); // read from the file into a variable
◆ readln(input_file, var); // read from the file into a variable, then skip to the next line
◆ write(input_file, var); // write a variable to the file
◆ writeln(input_file, var); // write a variable to the file, followed by an end-of-line
◆ close(input_file); // close the file
A.3 File I/O in C
© Low-level I/O
© UNIX system calls
◆ fd1 = open(filename1, rwmode);
◆ fd2 = open(filename2, rwmode);
◆ read(fd1, buf, n);
◆ write(fd2, buf, n);
◆ lseek(fd1, offset, origin);
◆ close(fd1);
◆ close(fd2);
A.4 <stdio.h>
© fp = fopen(s, mode) /* open file s; mode "r", "w", "a" for read, write, append (returns NULL for error) */
© c = getc(fp) /* get character; getchar() is getc(stdin) */
© putc(c, fp) /* put character; putchar(c) is putc(c, stdout) */
© ungetc(c, fp) /* put character back on input file fp; at most 1 char can be pushed back at one time */
© scanf(fmt, a1, ....) /* read characters from stdin into a1, ... according to fmt. Each ai must be a pointer. Returns EOF or number of fields converted */
© fscanf(fp, .....) /* read from file fp */
© printf(fmt, a1, ....) /* format a1, ... according to fmt, print on stdout */
© fprintf(fp, ....) /* print .... on file fp */
© fgets(s, n, fp) /* read at most n-1 characters into s from fp. Returns NULL at end of file */
© fputs(s, fp) /* print string s on file fp */
© fflush(fp) /* flush any buffered output on file fp */
© fclose(fp) /* close file fp */

A.5 File I/O in C++

© #include <fstream.h>
© File Stream: fstream, ifstream, ofstream
(ex) ifstream f1("input.fil");
     ofstream f2("output.fil", ios::out|ios::nocreate);
     fstream f3("inout.fil", ios::in|ios::out);
     f1.get(ch);
     f1.eof();
     f2.put(ch);
     f2.bad();
     f1.seekg(); f2.seekp();
     f3.close();
© Note: ios::nocreate belongs to the pre-standard iostream library and is not part of standard C++.
A.6 <iostream.h> (1/3)

Class Hierarchy

class ios
class istream: virtual public ios
class ostream: virtual public ios
class iostream: public istream, public ostream
A.7 <iostream.h> (2/3)
class ostream: virtual public ios
{ public:
    ostream& put(char);
    ostream& write(char*, int);
    ostream& seekp(int);
    ostream& operator<<(char);
    ostream& operator<<(int);
    ostream& operator<<(char*);
    ostream& operator<<(long);
    ostream& operator<<(short);
    ostream& operator<<(float);
    ..............
};

class istream: virtual public ios
{ public:
    istream& get(char*, int, char = '\n');
    istream& get(char);
    istream& read(char*, int);
    istream& gets(char**, char = '\n');
    istream& seekg(int);
    istream& operator>>(char&);
    istream& operator>>(int&);
    istream& operator>>(char*);
    istream& operator>>(long&);
    ...............
};
A.8 <iostream.h> (3/3)
class iostream: public istream, public ostream
{
public:
    iostream( ) { }
};
A.9 Copy Program in C++
#include <cstdlib>    // exit
#include <fstream>    // ifstream, ofstream
#include <iostream>   // cerr
using namespace std;

// Print an error message and terminate the program.
void error(const char *s, const char *s2 = "")
{
    cerr << s << ' ' << s2 << '\n';
    exit(1);
}

int main(int argc, char *argv[])
{
    if (argc != 3) error("wrong number of arguments");

    ifstream src(argv[1]);    // input file stream
    if (!src) error("cannot open input file", argv[1]);

    ofstream dest(argv[2]);   // output file stream
    if (!dest) error("cannot open output file", argv[2]);

    // Copy one character at a time until get( ) fails.
    char ch;
    while (src.get(ch)) dest.put(ch);

    // A clean finish means the input hit end-of-file and the output is intact.
    if (!src.eof() || dest.bad())
        error("something strange happened !");

    return 0;
}
