
CS 102
File Structures & File Organizations

Chapter 04
Sequential File Organization

Sequential File Organization

• sometimes abbreviated as SFO
• the oldest type of file organization developed
• initially associated with magnetic tapes, the first secondary storage devices available
• now also found on direct-access devices like disks and flash memory
• examples: many text files and some binary files


Sequential File Organization

• Records are stored and accessed consecutively, in sequence from beginning to end.
• The physical order is usually the same as the logical order.
• Records are usually in ascending or descending order by the key field.
• On average, half the records must be accessed to locate a record of interest.
• The entire file must usually be copied in order to update it.
• Records may be fixed or variable length.

Batch Processing

What is it?
• Collecting a group of transactions, then applying the transactions to master files at one time.
• Contrast this with real-time processing, where each transaction is applied immediately to master files.

When is it used?
• For processes that are performed at regular (say, daily) intervals, such as cheque processing, interest accruals, reporting, etc.
• For operations that are not very time-sensitive.
• To save total processing time: by improving file access time, or by reducing network delays.
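The claim that, on average, half the records must be accessed to locate a record in a sequential file can be checked with a small sketch. The record layout (a list of key/balance pairs) and the function name sequential_search are assumptions made for illustration only.

```python
def sequential_search(records, target_key):
    """Scan key-ordered records from the start; return (record, reads)."""
    reads = 0
    for key, data in records:
        reads += 1
        if key == target_key:
            return (key, data), reads
        if key > target_key:
            # Ascending key order: the target cannot appear later.
            break
    return None, reads


# Hypothetical master file of (key, balance) pairs in ascending key order.
master = [(k, 100.0 * k) for k in range(1, 11)]

# Search once for every key that exists and average the number of reads.
total_reads = sum(sequential_search(master, k)[1] for k, _ in master)
print(total_reads / len(master))  # 5.5
```

For n records the average is (n + 1) / 2 reads, which is roughly half the file.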

Editing Transactions

• Transaction files are entered and edited in a front-end editing program.
  - The files are created by encoding transaction details into files with the desired layout.

Validating Transactions

• Transaction files are validated so that the data make sense.
• The files are edited and validated before the maintenance run to identify invalid data.
  Examples:
  - Are all required field values available?
  - Are the values of fields within the expected range of values?
  - Are the values of fields of the correct type (numeric, alphabetic)?
  - Are the values of a combination of fields compatible?
  - Are there duplicate records?
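The validation questions above might be turned into code along the following lines. This is only a sketch: the dictionary layout, the field names, and the specific rules (a non-negative balance, no balance on a delete) are assumptions chosen for illustration, not rules stated in the slides.

```python
VALID_CODES = {"A", "C", "D"}


def validate(transactions):
    """Split transactions into (valid, errors) before the maintenance run."""
    valid, errors, seen = [], [], set()
    for t in transactions:
        key, seq = t.get("key"), t.get("seq")
        code, balance = t.get("code"), t.get("balance")
        if key is None or seq is None or code is None:
            errors.append((t, "missing required field"))      # required values
        elif code not in VALID_CODES:
            errors.append((t, "unknown update code"))         # expected range
        elif code in ("A", "C") and not isinstance(balance, (int, float)):
            errors.append((t, "balance missing or not numeric"))  # correct type
        elif code in ("A", "C") and balance < 0:
            errors.append((t, "balance out of range"))        # assumed rule
        elif code == "D" and balance is not None:
            errors.append((t, "delete must not carry a balance"))  # compatibility
        elif (key, seq) in seen:
            errors.append((t, "duplicate record"))            # duplicates
        else:
            seen.add((key, seq))
            valid.append(t)
    return valid, errors
```

Rejected transactions would go to an error listing, and only the valid ones would be passed on to the maintenance run.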


Master File Update

• Updates to the master file are accumulated, or batched, in a transaction file.
• Each transaction contains the key value of the master file record to be updated.
• The transaction file is sorted in the same key order as the master file.
• The transactions are applied, using a merging algorithm, to the master file in a maintenance run.
• Table files may be consulted as part of the editing, validation and maintenance runs.

The Batch Process

[Diagram of the maintenance run. Labels: Transaction File, Maintenance Run, New Master File, Audit File, Error File]

Transaction Codes

Each transaction has an update code or transaction code indicating the type of update.

• usually:
  - A for add,
  - C for change, modify or update, and
  - D for delete

Addition

• Addition transactions contain values for all fields of the master file.
  - Key values of records to be added must not exist in the master file before the update.
  - For non-keyed files, new records are appended at the end of the existing file.
  - For keyed files, new records must be inserted in their proper positions to maintain key order.
  - Master files may initially be created by applying a sequence of add transactions.


Change

• Change transactions contain the field values that are to be changed on the master file.
  - All records to be modified must exist in the master file before the update.
  - The master file record is read into main memory and updated before it is copied into the new master file.

Deletion

• Delete transactions only need to specify the transaction code and key value.
  - All records to be deleted must exist in the master file before the update.
  - These records must not be included in the new master file.
  - Deleting these records requires gaps to be …


Multiple Transactions Per Key

• There may be zero, one or more transactions per key value in the transaction file.
  - An addition, several changes and a deletion (or other combinations) may have been transacted since the master file was last updated.
  - Transactions for the same key value must be sorted by date/time or sequence number.

Audit and Error Files

Files or listings produced by maintenance runs:

• Audit File:
  - also called control listing or audit trail
  - contains maintenance run details (date, time, location, etc.) and a statistical summary of transaction updates (old and new master file record counts, transaction count, successful and unsuccessful add, change and delete counts, etc.)
• Error File:
  - invalid transactions and processing errors
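The ordering requirement for multiple transactions per key (sorted by key, then by sequence number within a key) can be sketched as follows. The tuple layout (key, sequence number, update code, balance) and the function name sort_for_update are assumptions; the sample values are taken from the transaction file used in the simulation slides of this chapter.

```python
from itertools import groupby
from operator import itemgetter


def sort_for_update(transactions):
    """Order transactions by key, then by sequence number within each key."""
    return sorted(transactions, key=itemgetter(0, 1))


# (key, sequence number, update code, balance), deliberately out of order.
batch = [
    ("2009000030", 2, "C", 8000.00),
    ("2009000070", 2, "D", None),
    ("2009000030", 1, "A", 25000.00),
    ("2009000070", 1, "C", 12000.00),
]

ordered = sort_for_update(batch)
for key, group in groupby(ordered, key=itemgetter(0)):
    print(key, [code for _, _, code, _ in group])
# 2009000030 ['A', 'C']
# 2009000070 ['C', 'D']
```

Applying the same transactions in the unsorted order would attempt the change to 2009000030 before its add, which is why the secondary sort matters.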

Time Between Runs

• Longer time between maintenance runs
  - results in more outdated master files
  - larger number of accumulated transactions
  - next maintenance run takes longer
• Shorter time between maintenance runs
  - results in more current master files
  - more costly

The Sentinel Value

What is it?
• Typically used for end-of-file conditions
• A value that is higher or lower than any key value that can occur on a file
• In COBOL, this is represented by the constants …
• Mathematically, you can look at this as +infinity (+∞) or -infinity (-∞)
• In other languages, the programmer specifies actual …
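A minimal sketch of the sentinel idea in Python, assuming numeric keys in ascending order: math.inf plays the role of +∞, so an exhausted input simply reports a key that is higher than every real key. The helper names next_key and merge_keys are illustrative, not part of the slides.

```python
import math

SENTINEL = math.inf  # compares greater than any real numeric key


def next_key(iterator):
    """Return the next key, or the sentinel once the input is exhausted."""
    return next(iterator, SENTINEL)


def merge_keys(a, b):
    """Merge two ascending key sequences without explicit end-of-file tests."""
    it_a, it_b = iter(a), iter(b)
    key_a, key_b = next_key(it_a), next_key(it_b)
    merged = []
    while min(key_a, key_b) != SENTINEL:
        if key_a <= key_b:
            merged.append(key_a)
            key_a = next_key(it_a)
        else:
            merged.append(key_b)
            key_b = next_key(it_b)
    return merged


print(merge_keys([20, 40, 60], [10, 30, 90]))  # [10, 20, 30, 40, 60, 90]
```

Because the sentinel never wins a comparison against a real key, the loop needs only one termination test: both inputs have reached the sentinel.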

The Balance Line Algorithm

Algorithm Sequential_Update_Main
    open master_file, transaction_file for input
    open new_master_file for output
    Get_Next_Trans
    Get_Next_Master
    current_key ← min(trans_key, master_key)

    while current_key <> sentinel
        if master_key = current_key
            hold_master ← master_record
            write_hold_master ← yes
            Get_Next_Master
        else
            write_hold_master ← no
        while trans_key = current_key
            Process_Trans
        if write_hold_master = yes
            write hold_master to new_master_file
        current_key ← min(trans_key, master_key)

    close master_file, transaction_file, new_master_file

Algorithm Get_Next_Trans
    read transaction_record from transaction_file
        at end trans_key ← sentinel
    // trans_key is a field of transaction_record

Algorithm Get_Next_Master
    read master_record from master_file
        at end master_key ← sentinel
    // master_key is a field of master_record

Algorithm Process_Trans
    if write_hold_master = yes
        case update_code
            'A' : print "Duplicate add error"
            'C' : change hold_master with transaction_record
            'D' : write_hold_master ← no
    else
        case update_code
            'A' : build hold_master from transaction_record
                  write_hold_master ← yes
            'C', 'D' : print "No matching master record error"
    Get_Next_Trans
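The pseudocode above can be sketched in Python as follows. This is an illustrative translation under stated assumptions, not a definitive implementation: files are replaced by in-memory lists, keys are integers, math.inf serves as the sentinel, and the function name sequential_update and the tuple layouts are hypothetical. The sample records are the master and transaction files from the Algorithm Simulation slides of this chapter.

```python
import math

SENTINEL = math.inf  # higher than any real key


def sequential_update(master, transactions):
    """Apply key-sorted transactions (key, seq, code, balance) to a
    key-sorted master list of (key, balance). Returns (new_master, errors)."""
    m_iter, t_iter = iter(master), iter(transactions)
    end_m, end_t = (SENTINEL, None), (SENTINEL, None, None, None)
    m_rec, t_rec = next(m_iter, end_m), next(t_iter, end_t)
    new_master, errors = [], []

    current = min(t_rec[0], m_rec[0])
    while current != SENTINEL:
        if m_rec[0] == current:
            hold, write_hold = list(m_rec), True
            m_rec = next(m_iter, end_m)          # Get_Next_Master
        else:
            hold, write_hold = None, False
        while t_rec[0] == current:               # Process_Trans
            key, _seq, code, balance = t_rec
            if write_hold:
                if code == "A":
                    errors.append((t_rec, "already in master"))
                elif code == "C":
                    hold[1] = balance
                else:                            # "D"
                    write_hold = False
            elif code == "A":
                hold, write_hold = [key, balance], True
            else:                                # "C" or "D" without a master
                errors.append((t_rec, "not in master"))
            t_rec = next(t_iter, end_t)          # Get_Next_Trans
        if write_hold:
            new_master.append(tuple(hold))
        current = min(t_rec[0], m_rec[0])
    return new_master, errors


master = [(2009000020, 5000.00), (2009000040, 20000.00), (2009000060, 10000.00),
          (2009000070, 15000.00), (2009000090, 0.00), (2009000100, 9000.00)]
transactions = [
    (2009000010, 1, "A", 25000.00), (2009000020, 1, "C", 0.00),
    (2009000030, 1, "A", 25000.00), (2009000030, 2, "C", 8000.00),
    (2009000050, 1, "D", None), (2009000060, 1, "D", None),
    (2009000070, 1, "C", 12000.00), (2009000070, 2, "D", None),
    (2009000080, 1, "A", 7000.00), (2009000090, 1, "A", 1000.00),
]
new_master, errors = sequential_update(master, transactions)
print([key for key, _ in new_master])
# [2009000010, 2009000020, 2009000030, 2009000040, 2009000080, 2009000090, 2009000100]
print([(t[0], reason) for t, reason in errors])
# [(2009000050, 'not in master'), (2009000090, 'already in master')]
```

Both inputs are read exactly once and in key order, which is what makes the approach suitable for purely sequential devices.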

Case Analysis

Case 1 : master_key < trans_key
Case 2 : master_key > trans_key
  • Case 2a : one transaction for a non-existent master_key (must be an ADD)
  • Case 2b : multiple transactions for a non-existent master_key (first must be ADD)
Case 3 : master_key = trans_key
  • Case 3a : one transaction for an existing master_key (must be CHANGE or DELETE)
  • Case 3b : multiple transactions for an existing master_key (first must be CHANGE or DELETE)

Algorithm Simulation

Master File

Student_Number (Key)   Tuition_Balance
2009000020                     5000.00
2009000040                    20000.00
2009000060                    10000.00
2009000070                    15000.00
2009000090                        0.00
2009000100                     9000.00

Transaction File

Student_Number (Key)   Seq_Number   Update_Code   Tuition_Balance
2009000010             1            A                    25000.00
2009000020             1            C                        0.00
2009000030             1            A                    25000.00
2009000030             2            C                     8000.00
2009000050             1            D
2009000060             1            D
2009000070             1            C                    12000.00
2009000070             2            D
2009000080             1            A                     7000.00
2009000090             1            A                     1000.00

Output Master File

Master File

Student_Number (Key)   Tuition_Balance
2009000010                    25000.00
2009000020                        0.00
2009000030                     8000.00
2009000040                    20000.00
2009000080                     7000.00
2009000090                        0.00
2009000100                     9000.00

Output Error File

Error File

Student_Number   Seq_Number   Update_Code   Tuition_Balance   Error_Reason
2009000050       1            D                               not in master
2009000090       1            A                     1000.00   already in master


SFO as Input-Output Files

Sequential files may be opened so that reads and writes of records can be interspersed.
• Requires a direct-access device such as a disk.
• All records to be added are appended at the end of the file. A sort of the file is then required after file maintenance.
• Changed records may simply be rewritten in place (e.g., in COBOL, using the REWRITE statement).
• Deleted records are just marked as deleted. Periodically, the marked records must be physically deleted to minimize processing and storage overhead.
• In-place update does not create a new master file, so a backup of the master file before maintenance is required.
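In-place maintenance of a sequential file (append additions, rewrite changes, mark deletions) might look like the sketch below. Everything about the layout is an assumption made for illustration: fixed-length records with a 10-byte key, an 8-byte balance and a 1-byte status flag, with "*" marking a deleted record. An io.BytesIO object stands in for a disk file opened in "r+b" mode.

```python
import io
import struct

# Assumed fixed-length layout: 10-byte key, 8-byte float balance, 1-byte status.
RECORD = struct.Struct("<10sd1s")
ACTIVE, DELETED = b" ", b"*"


def find(f, key):
    """Scan sequentially for an active record; return its byte offset or None."""
    f.seek(0)
    while True:
        offset = f.tell()
        raw = f.read(RECORD.size)
        if len(raw) < RECORD.size:
            return None
        k, _balance, status = RECORD.unpack(raw)
        if k == key and status == ACTIVE:
            return offset


def add(f, key, balance):
    f.seek(0, io.SEEK_END)                       # additions are appended
    f.write(RECORD.pack(key, balance, ACTIVE))


def change(f, key, balance):
    offset = find(f, key)
    if offset is None:
        return False
    f.seek(offset)                               # rewrite the record in place
    f.write(RECORD.pack(key, balance, ACTIVE))
    return True


def delete(f, key):
    offset = find(f, key)
    if offset is None:
        return False
    f.seek(offset)
    _k, balance, _s = RECORD.unpack(f.read(RECORD.size))
    f.seek(offset)
    f.write(RECORD.pack(key, balance, DELETED))  # logical delete only
    return True


def active_records(f):
    f.seek(0)
    data = f.read()
    recs = [RECORD.unpack_from(data, i) for i in range(0, len(data), RECORD.size)]
    return [(k.decode(), b) for k, b, s in recs if s == ACTIVE]


f = io.BytesIO()  # stands in for a disk file opened with mode "r+b"
add(f, b"2009000020", 5000.0)
add(f, b"2009000040", 20000.0)
change(f, b"2009000020", 0.0)
delete(f, b"2009000040")
add(f, b"2009000010", 25000.0)
print(active_records(f))  # [('2009000020', 0.0), ('2009000010', 25000.0)]
```

The output shows both costs named above: the appended record is out of key order until the file is sorted, and the deleted record still occupies space until it is physically removed.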