
Files

Chapter 7
Opening a File
• Before we can read the contents of the file, we must tell Python
which file we are going to work with and what we will be doing
with the file

• This is done with the open() function

• open() returns a “file handle” - a variable used to perform operations on the file

• Similar to “File -> Open” in a Word Processor


Using open()

• handle = open(filename, mode)

• returns a handle used to manipulate the file

• filename is a string

• mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file
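A minimal sketch of both modes, assuming a scratch file in a temporary directory (the name demo.txt and the variable names are made up for this illustration):

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# 'w' opens the file for writing and creates it if it does not exist.
handle = open(path, 'w')
handle.write('first line\n')
handle.close()

# 'r' opens the file for reading; it is also the default when mode is omitted.
handle = open(path, 'r')
first = handle.readline()
handle.close()

print(repr(first))
```

The handle, not the filename, is what we call write(), readline() and close() on.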
When Files are Missing

>>> fhand = open('stuff.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'
The newline Character
• We use a special character called the “newline” to indicate when a line ends

• We represent it as \n in strings

• Newline is still one character - not two

• A text file has a newline at the end of each line

>>> stuff = 'X\nY'
>>> print(stuff)
X
Y
>>> len(stuff)
3
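A short sketch of the same idea outside the interactive prompt. The strings are invented for illustration; repr() is used only to make the newline visible.

```python
stuff = 'Hello\nWorld!'

# print() acts on the newline; repr() shows it as the escape \n instead.
print(stuff)
print(repr(stuff))

# The newline counts as exactly one character: 5 + 1 + 6 = 12.
length = len(stuff)
print(length)

# The content of a text file is one long string with a newline ending each line.
text = 'line one\nline two\n'
newline_count = text.count('\n')
print(newline_count)
```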
File Handle as a Sequence
• A file handle open for read can be treated as a sequence of strings where each line in the file is a string in the sequence

• We can use the for statement to iterate through a sequence

• Remember - a sequence is an ordered set

# Printing the whole file
xfile = open('mbox.txt')
for cheese in xfile:
    print(cheese)
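To see what each item of the sequence looks like, here is a sketch that collects the lines into a list. It assumes a small hypothetical file (sample.txt) standing in for mbox.txt.

```python
import os
import tempfile

# Hypothetical three-line file standing in for mbox.txt.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
fout = open(path, 'w')
fout.write('alpha\nbeta\ngamma\n')
fout.close()

lines = []
xfile = open(path)
for cheese in xfile:
    # Each item is one line of the file, in order, newline included.
    lines.append(cheese)
xfile.close()

print(lines)
```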
Reading from a file

• Python can read large files efficiently

• We use a for loop to read one line at a time (the end of each line is detected by the newline character)

• Little memory is used since a line is read, counted, then discarded

# Counting the number of lines
fhand = open('mbox-short.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)
Small file

• You can also read a whole file into one string using the read() method

• In this example the string inp holds the entire file (newlines and all) as a single string

# Counting the number of characters
fhand = open('mbox-short.txt')
inp = fhand.read()
print(len(inp))

# Or we could print some of the characters
print(inp[:10])  # first 10
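A self-contained sketch of read(), assuming a tiny hypothetical two-line file in place of mbox-short.txt (the addresses are placeholders):

```python
import os
import tempfile

# Hypothetical small file standing in for mbox-short.txt.
path = os.path.join(tempfile.mkdtemp(), 'tiny.txt')
fout = open(path, 'w')
fout.write('From: a@example.com\nSubject: hi\n')
fout.close()

fhand = open(path)
inp = fhand.read()  # the whole file as one string
fhand.close()

total_chars = len(inp)       # newlines are counted too
first_ten = inp[:10]         # slicing works as on any string
line_count = inp.count('\n')

print(total_chars, repr(first_ten), line_count)
```

Since read() loads everything into memory at once, it suits small files; the for loop is the safer choice for large ones.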
Searching Through a File

We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:'):
        print(line)
OOPS!
What are all these blank lines doing here?

• Each line from the file has a newline at the end

• The print function adds a newline to each line

From: stephen.marquard@uct.ac.za\n
\n
From: louis@media.berkeley.edu\n
\n
From: zqian@umich.edu\n
\n
From: rjlowe@iupui.edu\n
\n
...
Searching Through a File (fixed)
• We can strip the whitespace from the right-hand side of the string using the rstrip() string method

• The newline is considered “white space” and is stripped

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:'):
        print(line)

From: stephen.marquard@uct.ac.za
From: louis@media.berkeley.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
...
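A quick sketch of what rstrip() removes, using one line from the output above. The end='' variant is an alternative that the slides do not use.

```python
line = 'From: zqian@umich.edu\n'

# rstrip() with no argument removes all trailing whitespace.
stripped = line.rstrip()

# Trailing spaces and tabs go too, not only the newline.
padded = 'From: zqian@umich.edu \t\n'.rstrip()

# Passing '\n' removes only trailing newlines and keeps the spaces.
only_newline = 'value  \n'.rstrip('\n')

print(stripped)

# Alternative: keep the line's own newline and tell print() not to add one.
print(line, end='')
```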
Skipping with continue

We can conveniently skip a line by using the continue statement

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:'):
        continue
    print(line)
Using in to Select Lines
We can look for a string anywhere in a line as our selection criteria

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if '@uct.ac.za' not in line:
        continue
    print(line)

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
X-Authentication-Warning: set sender to stephen.marquard@uct.ac.za using -f
From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008
X-Authentication-Warning: set sender to david.horwitz@uct.ac.za using -f
...
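The difference between in and startswith() can be sketched on a few lines held in a list. The sample lines are copied from the slides' output; the list itself is an assumption so that the example runs without mbox-short.txt.

```python
# Sample lines taken from the output shown on the slides.
lines = [
    'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n',
    'From: louis@media.berkeley.edu\n',
    'X-Authentication-Warning: set sender to david.horwitz@uct.ac.za using -f\n',
]

matches = []
for line in lines:
    line = line.rstrip()
    # 'in' finds the text anywhere in the line, not only at the start.
    if '@uct.ac.za' not in line:
        continue
    matches.append(line)

# Only one of the matching lines also starts with 'From '.
from_matches = [m for m in matches if m.startswith('From ')]

print(len(matches), len(from_matches))
```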
Bad File Names

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()

count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: try this
File cannot be opened: try this
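The same logic can be wrapped in a function so it runs without typing a name. This is a sketch under assumptions: count_subject_lines and the file tiny-mbox.txt are hypothetical, and it catches OSError (the parent class of FileNotFoundError) rather than every exception.

```python
import os
import tempfile

def count_subject_lines(fname):
    """Return how many lines start with 'Subject:', or None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        # Covers missing files, directories, and permission problems.
        print('File cannot be opened:', fname)
        return None
    count = 0
    for line in fhand:
        if line.startswith('Subject:'):
            count = count + 1
    fhand.close()
    return count

# Hypothetical test data: one good file and one name that does not exist.
folder = tempfile.mkdtemp()
good = os.path.join(folder, 'tiny-mbox.txt')
fout = open(good, 'w')
fout.write('From: a@example.com\nSubject: one\n\nSubject: two\n')
fout.close()

good_count = count_subject_lines(good)
bad_count = count_subject_lines(os.path.join(folder, 'try this'))
print(good_count, bad_count)
```

Returning None instead of calling exit() lets the caller decide what to do about a bad name.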
Writing to a file
• If the file already exists, opening it in 'w' mode clears out the old data. If it does not exist, a new file is created.
>>> fout = open('output.txt', 'w')
>>> line1 = "Hello there,\n"
>>> fout.write(line1)
• The print function automatically appends a newline; the write method does not.
• Calling write again appends the data to the end of the file
• It is best to close the file once done: fout.close()
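A sketch of these points in one script, assuming a scratch output.txt in a temporary directory. It uses the with statement, which is not covered in these slides; with closes the file automatically when the block ends. The 'a' (append) mode is also an addition.

```python
import os
import tempfile

# Hypothetical scratch location for output.txt.
path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# Two write() calls on the same handle: the second continues after the first.
with open(path, 'w') as fout:
    fout.write('Hello there,\n')
    fout.write('second line\n')

# Re-opening in 'w' mode discards the two lines written above.
with open(path, 'w') as fout:
    fout.write('fresh start\n')

# 'a' (append) mode keeps the existing content and adds to the end.
with open(path, 'a') as fout:
    fout.write('appended\n')

with open(path) as fin:
    content = fin.read()

print(content, end='')
```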
Summary
• Secondary storage
• Opening a file - file handle
• File structure - newline character
• Reading a file line by line with a for loop
• Searching for lines
• Reading file names
• Dealing with bad files
Exercises
• Write a program that prints the content of a file line by line, all in upper
case. Then indicate the number of lines and words in that file.

• Count the number of occurrences of a specific word in a file (e.g. ‘Received’ in the file mbox-short.txt)

• Create a new file that contains any combination of: whitespaces, newlines,
and tabs. Write a program that prints the content of the file (by printing ‘\t’
instead of a blank space representing a tab, and ‘\n’ for a newline), then
output the number of letters in the file, and also the number of ‘tabs’.
