Guideline for the

Digitisation of Paper
Records

© Queensland State Archives 2006


Version 2 April 2006
Contents
1: Introduction
1.1 Authority
1.2 Scope
1.3 Exclusions
1.4 Acknowledgments
2: Management of Original Records and Scanned Images
2.1 Responsibilities for recordkeeping
2.2 Management of original paper records
2.3 Management of imaged records
3: Issues to Consider Before Commencing Digitisation
3.1 Why digitise?
3.2 Which records will be digitised?
3.3 How will digitisation integrate to the existing workflow?
3.4 How will the digitised records be managed?
3.5 Authorised disposal of paper originals
3.6 How will the digitised records be used?
3.7 Who is responsible for the digitisation project?
3.8 Will digitisation be outsourced?
4: Components of a Digitisation Program
4.1 Computer Hardware
4.2 Computer Software
4.3 Procedures and Standards
4.4 Staff
5: Authorisation for Early Disposal
5.1 Introduction
5.2 Overview of process
5.3 Determining whether records are eligible
5.4 Ensuring appropriate systems and procedures
5.5 Obtaining authorisation
5.6 Monitoring and review
6: Technical considerations
6.1 Resolution
Recommended Resolutions
6.2 Bit Depth
Recommended Bit Depths
6.3 Compression and File Size
Recommended Compression
6.4 File Formats
Recommended File Formats
6.5 Quality Control
Recommended Quality Checks
6.6 Master Files and Derivatives
Recommended Derivatives
7: Metadata
7.1 Metadata Types
7.2 Capturing Metadata
7.3 File Naming Conventions
Recommended Metadata
8: Storage and Media Options
8.1 On-line, Near-line, and Off-line Storage
8.2 Media Types
8.3 Media Lifecycle
Recommended Storage Options
Appendix 1: Glossary of Terms and Acronyms
Appendix 2: Scanner Types
Appendix 3: Table of Technical Recommendations
Appendix 4: Related Standards
Appendix 5: Reference List

1: Introduction
Digitisation is the process of converting any physical or analogue item into an electronic
representation1. In the context of this guideline, digitisation refers to the creation of digital
images from paper documents by such means as scanning or digital photography.
Queensland State Archives has produced this guideline to provide information to public
authorities about digitisation, to recommend suitable digitisation parameters and to raise
awareness of the recordkeeping factors associated with digitisation.
Many Queensland public authorities have implemented or are considering implementing
systems to digitise their paper records. In most cases, these projects are undertaken with
the goals of:
• achieving faster retrieval of information;
• improving access to information;
• enabling greater sharing of information; and
• reducing the storage space required for paper records.
There are many aspects of digitisation. While the acquisition of scanners and associated
computer hardware may be the initial action that comes to mind when digitisation is
discussed, successful digitisation requires several components, including procedures,
standards, computer software, and appropriately skilled staff.
1.1 Authority
This guideline is issued under Section 25 of the Public Records Act 2002 (the Act). The
guideline is a resource for Queensland public authorities to help them achieve best
practice recordkeeping and information management. This publication is intended to serve
as a guide to public authorities undertaking or considering undertaking digitisation.
Information and recommendations provided in this guideline are considered to be accurate
at the time of publication. Queensland State Archives reserves the right to withdraw,
amend or replace this guideline at any time as technology and the needs of public
authorities change.
1.2 Scope
Paper records exist in a variety of formats including maps, plans, photographs and other
documents of various colours, paper types and sizes.
This guideline provides digitisation recommendations that are broad enough to apply to the
majority of paper records held by most public authorities. In some cases, particular
characteristics of different types of paper records may call for different technical
parameters and approaches from those included here. Public authorities should combine
the guidance provided in this document and advice from digitisation-related computer
hardware and software vendors with their own testing to determine the optimum
parameters for their organisation.
This guideline also provides information on how public authorities can apply for
authorisation for the early destruction of certain temporary records that have been
digitised, following the Digitisation Disposal Policy: Policy on the authorisation of the early
disposal of original paper records after digitisation.

1
Tanner, S. From Vision to Implementation – strategic and management issues for digital collections. 2000. The Electronic Library –
strategic, policy and management issues seminar. Accessed March 2005 at http://heds.herts.ac.uk/resources/papers/Lboro2000.pdf

Key technical terms have been explained and illustrated with examples and a
comprehensive glossary has been included in Appendix 1: Glossary of Terms and
Acronyms. Definitions of records management terms can be found in Queensland State
Archives’ Glossary of Archival and Recordkeeping Terms2.
1.3 Exclusions
The conversion of other analogue records, such as video or audio recordings, into a digital
form is outside the scope of this document. Likewise, the management of information that
originates in a digital form, such as word processing documents, e-mails, and other born-
digital items is not included.
This guideline does not provide advice on high-quality digitisation of historical documents
for preservation purposes.3
While advice will be provided on some generic features that should be possessed by
computer hardware and software used in the digitisation process, this guideline will not
provide recommendations for particular models of computer equipment or software titles.
Queensland State Archives is unable to provide advice on systems and network
architecture issues relating to digitisation. Public authorities should refer to their existing
computer systems administration and implementation procedures for technical and
systems issues.
1.4 Acknowledgments
Queensland State Archives would like to acknowledge the public authorities which
participated in the development of the Guideline for the Digitisation of Paper Records.

2
Glossary of Archival and Recordkeeping Terms. 2004. Queensland State Archives. Accessed March 2005 at
http://www.archives.qld.gov.au/downloads/GlossaryOfArchivalRKTerms.pdf
3
For information on preservation digitisation of archival or historical documents, please contact Queensland State Archives.

2: Management of Original Records and Scanned Images


While there are clear benefits that the digitisation of paper records can bring, it is important
that public authorities are aware of the related recordkeeping issues. There are a number
of challenges in ensuring that digitised paper records remain accessible and useable.
Meeting these challenges should be a key focus when considering digitisation technical
requirements.
Digitised files can be considered accessible if they can be easily identified, retrieved, used
and maintained. For files to be considered useable and accessible within an organisation,
descriptions of the files and procedures for accessing them must be established
and published.
Any digitisation program should be carefully planned to meet appropriate standards and
avoid the need to repeat work. Consideration must also be given to the categorisation and
storage of the original paper documents that are digitised. Once digitised, the paper
records still need to be kept for their respective retention periods unless disposal
authorisation is given by the State Archivist. The conditions for early destruction of paper
originals are outlined in the Digitisation Disposal Policy and section 5 of these guidelines
provides information for public authorities on seeking authorisation in accordance with the
policy.
This guideline examines important digitisation issues regarding accessibility and usability
of digitised paper records including file formats, image qualities, the way the image files
are stored and the process that is adopted to accomplish the digitisation. Issues
associated with usability, such as the information that is recorded both to describe the
record and to allow it to be accessed, are also considered.
2.1 Responsibilities for recordkeeping
Under Section 7 of the Public Records Act 2002 (the Act), a public authority must make
and keep full and accurate records of its activities, and have regard to any relevant policy,
standards and guidelines made by the State Archivist about the making and keeping of
public records.
Queensland Government legislation and standards relevant to digitisation include:
• Public Records Act 2002
• Evidence Act 1977
• Information Standard 31: Retention and Disposal of Public Records
• Information Standard 40: Recordkeeping
• Information Standard 41: Managing technology dependent records
These standards and legislation should be considered in conjunction with the guidance
provided in this document. Additional legislation, standards and policies which apply to
individual public authorities or industries may also need to be consulted. Reference
should also be made to the Australian Standard on Records Management, AS ISO 15489.
2.2 Management of original paper records
Original paper records must be kept for the retention periods set out in an approved
retention and disposal schedule unless authorisation has been obtained from the State
Archivist for their earlier disposal. Under Section 13 of the Act, no public record can be
disposed of without the permission of the State Archivist.
In considering whether to seek authorisation for early disposal, public authorities should be
aware of any legislative or regulatory requirements to maintain records in a particular
format. Public authorities should also assess the need to maintain records in their original
form for legal purposes and should seek legal advice if unsure of any requirement. By
scanning original records to a digital format, and retaining only the digital version, public
authorities may be disadvantaged if called upon to authenticate certain records. The
Digitisation Disposal Policy sets out what records are eligible for authorisation for early
disposal and under what conditions. For more information on the authorisation process,
see section 5 of these guidelines.
Day batching
Some public authorities have adopted the practice of day batching, which involves filing
the paper originals of imaged records in batches based on date received or scanned.
Batching places a heavy reliance on the system used to manage the digitised records and
introduces a number of issues including:
• the risk of losing vital contextual information about the business the records document
and their relationship with other records,
• the inability to effectively implement a disposal program, since records batched
together may have different retention periods, and
• the refusal of Queensland State Archives to accept for transfer into its custody
temporary records contained in a batch also holding permanent public records.
Batching is usually associated with the digitisation of new records. It should be noted that
any records which are removed from structured files for any purpose, including digitisation,
must be returned to the file from which they were removed. Further information can be
found in the Public Records Alert, Day batching of records4.
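The disposal problem described above can be illustrated with a short sketch. It is illustrative only and is not taken from the Public Records Alert: the record identifiers, batch dates and retention periods are hypothetical, and `find_mixed_batches` is an assumed helper name. It flags any day batch holding records with different retention periods, including batches that mix temporary and permanent records.

```python
from collections import defaultdict

# Hypothetical sample data: (record id, batch date, retention period in years).
# A retention period of None marks a permanent record.
records = [
    ("R1", "2005-03-01", 2),
    ("R2", "2005-03-01", 7),
    ("R3", "2005-03-01", None),
    ("R4", "2005-03-02", 7),
    ("R5", "2005-03-02", 7),
]


def find_mixed_batches(records):
    """Return {batch date: set of retention periods} for batches that mix periods."""
    periods_by_batch = defaultdict(set)
    for _record_id, batch_date, retention_years in records:
        periods_by_batch[batch_date].add(retention_years)
    # A batch is a problem when it holds more than one distinct retention period.
    return {
        batch_date: periods
        for batch_date, periods in periods_by_batch.items()
        if len(periods) > 1
    }


mixed = find_mixed_batches(records)
```

In this sample, the batch dated 2005-03-01 is flagged because it holds two-year, seven-year and permanent records together, so it cannot be disposed of as a unit. The batch dated 2005-03-02 holds a single retention period and is not flagged.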
2.3 Management of imaged records
Imaged records require careful management. There is a high risk of technical
obsolescence of hardware and software needed to retrieve information from electronic
storage media. A public authority needs to ensure that its recordkeeping system can
maintain authentic, accurate, complete and accessible imaged records for as long as they
need to be retained. A management plan dealing with the procedures for migration of data
is required to cater for systems being replaced and equipment becoming obsolete.
Some general principles apply to the retention of original paper records and their digital
copies:
• The paper original should be kept for the full period specified in an approved retention
and disposal schedule, unless an early disposal authorisation is granted (see section 5:
Authorisation for early disposal).
• If the image becomes part of a file with other records, for example, in an eDRMS
environment, it should be kept in accordance with the retention and disposal period
given to the parent file. Records should never be removed or ‘culled’ from files.
• An image made purely for access or reference purposes can be destroyed when
reference ceases in accordance with the General Retention and Disposal Schedule
(GRDS) for Administrative Records, class 6.1.2 for duplicate copies of records.
There are some exceptions to these general principles. For example, if key business
decisions, approvals or comments are closely associated with the image copy of a record,
such as in a workflow system, the image and the associated information should be kept for
the full retention period.

4
Public Records Alert No 1/05: Day batching of records. 2005. Queensland State Archives. Accessed March 2005 at
http://www.archives.qld.gov.au/publications/PublicRecordsAlert/PRA105.pdf

The paper original may only be disposed of before the digitised image with explicit
permission from the State Archivist through an approved retention and disposal schedule.
It should be noted that retention and disposal schedules set minimum periods for retention.
Public authorities occasionally have a need to keep records for longer than the approved
retention period. In this situation, if authorisation has not been given for the early
destruction of the original records, the principle of not disposing of the paper original
before the digitised image still applies.
Information Standard 40: Recordkeeping and the Australian Standard AS ISO 15489:
Information and Documentation: Records Management should be consulted for general
advice on the principles and practices for the management of scanned images as a public
record.
Authenticity
Authentic records are those that can be proven and trusted to be what they purport to be and
to have been created, used, transmitted or held by an agency or person to whom these
actions have been attributed5. Public authorities will need to be able to verify the
authenticity and accuracy of the images of business transactions captured by scanning.
The original records must remain readily accessible long enough to allow for verification
that procedures related to the capture of records have been followed.
Measures should be in place to protect the authenticity of the scanned records throughout
their lifecycle. Information about the scanning processes should be maintained, including
documentation about the business processes and the maintenance of systems, to
demonstrate that public records were created and captured in the normal course of
business with reliable systems and equipment. Documentation about the records that are
scanned should be maintained to describe the structure and content of the records and the
business context in which they are created and captured.
Copyright, Intellectual Property and Privacy
Most public authorities will digitise records for ease of information sharing within the
organisation. However, once digitised, information is in a form that makes it easier to
distribute to a wider audience. Any public authority intending to make information
available to a broader audience, for example, publishing images to a website, should be
aware of any copyright, intellectual property and privacy implications.

5
Glossary of Archival and Recordkeeping Terms. 2004. Queensland State Archives. Accessed March 2005 at
http://www.archives.qld.gov.au/downloads/GlossaryOfArchivalRKTerms.pdf

3: Issues to Consider Before Commencing Digitisation


Careful consideration should be given to all aspects of digitisation before committing
to the project through actions such as acquiring equipment or recruiting staff.
There are several questions that should be addressed to assess the value and
effectiveness of digitisation. These are raised and discussed over the following pages.
3.1 Why digitise?
Digitisation of records is typically undertaken with the aim of achieving faster retrieval of
information, easier transmission of information, greater sharing of information and the
reduction of the storage space required for paper records. Driving factors behind
digitisation projects may include the need:
• to provide easier and improved access to the records;
• to improve the internal transfer or dissemination of the records;
• to integrate digitised records with an electronic documents and records management
system (eDRMS), other systems, applications or websites; and
• to reduce management and access costs.
Benefits from the process, such as improvements to productivity resulting from the better
access to records and the information they contain, should be clearly defined and will need
to be quantified and communicated to management and relevant staff.
Costs of digitisation should be compared with the cost of “doing nothing”, and the issues of
continuing to use only paper records, including lack of accessibility, poor integration into
modern business systems and inconvenience, should be clearly understood.
Available resources, including personnel, technology and money, should be assessed.
The organisation’s technical infrastructure must be able to cope with the ongoing
operation of the system, and funds need to be allocated to the regular maintenance and
update of the systems implemented.
3.2 Which records will be digitised?
The type of equipment required, the number of staff, and the financial resources are
dependent upon the volume of records that will be digitised. By monitoring the amount of
paper records that are generated by and sent to the public authority, and investigating the
need to digitise existing paper records, an estimate can be made of the number of
pages per day that the organisation can expect to digitise.
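The estimate described above can be sketched as simple arithmetic. The example below is one possible approach rather than a prescribed method: the function name, its parameters and all figures are assumptions made for illustration. It averages the pages received and generated over a monitoring period, then adds a daily share of any existing backlog that is to be digitised over a chosen number of working days.

```python
def estimate_pages_per_day(pages_received, pages_generated, days_monitored,
                           backlog_pages=0, backlog_working_days=1):
    """Estimate the daily scanning volume from monitored counts (hypothetical model)."""
    if days_monitored <= 0 or backlog_working_days <= 0:
        raise ValueError("day counts must be positive")
    # Average daily volume of new paper records over the monitoring period.
    new_per_day = (pages_received + pages_generated) / days_monitored
    # Existing records spread evenly over the time allowed to clear the backlog.
    backlog_per_day = backlog_pages / backlog_working_days
    return new_per_day + backlog_per_day


# Hypothetical figures: 3,000 pages received and 1,000 generated over 20 working
# days, plus a 50,000 page backlog to be cleared over 250 working days.
daily_volume = estimate_pages_per_day(3000, 1000, 20,
                                      backlog_pages=50000,
                                      backlog_working_days=250)
```

With these figures the new records contribute 200 pages per day and the backlog a further 200, giving an estimate of 400 pages per day. Actual volumes should be confirmed by monitoring within the public authority.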
The physical characteristics of the typical paper records should also be assessed to assist
in determining the required specifications for digitisation equipment. For example, large
format maps and plans, double sided correspondence, and colour documents may not be
able to be fully and accurately captured using equipment designed for scanning black and
white documents. The proportion of different physical record types that could be digitised
should be analysed and consideration given to the acquisition of specialist equipment.
For a public authority to make the investment in digitisation, there must be a drive or
business need to access paper records in a digital format. Unless the size of the paper
records collection is very small and static, it will not usually be feasible or warranted to
scan all existing records. Consideration should be given to the purpose of the digitisation
program. Generally, digitisation is undertaken to improve access to records and to
integrate paper records into an otherwise electronic business process.

Decisions on whether or not to digitise a particular record will need to be made on a day to
day basis once digitisation is available to the organisation. A number of questions should
be asked when selecting records for possible digitisation:
• Is there a benefit in digitising this record?
• Is the original suitable for digitising?
• Is the equipment able to fully capture the content of the record?
• Is the original part of a series that also needs to be digitised?
• Are there any special characteristics of the record, e.g. colour, double sided, faded?
Some records are physically less suitable for scanning than others. For example, large
format records, bound volumes, photographs, plans and maps, records with reflective
surfaces or fragile material require specialised scanning equipment and techniques.
Other records, such as those handwritten in coloured ink, on coloured paper, or double
sided paper may be accommodated by the available scanning equipment, but need to be
separated and scanned in a different batch with modified settings.
There may be a business decision made not to scan some records. Deciding to digitise
existing paper records in addition to new records can be a large undertaking, and careful
analysis should be conducted to gauge the benefit of doing this. Ideally, existing paper
records that are frequently accessed should be digitised, maximising the benefits of
digitisation. Some records may have such short retention periods that the expense of
digitising them is not warranted.
It is important that staff are made aware of what has been digitised and what hasn’t so that
they will benefit from the convenience of faster access to digitised records without
searching in vain for digitised copies of records that have not been scanned.
3.3 How will digitisation integrate to the existing workflow?
It is essential to the success of a digitisation program that the existing records
management procedures are investigated prior to the introduction of new techniques.
Decisions need to be made on a number of aspects including when the records will be
scanned, how end users are presented with the records, and how the original paper
records will be managed after scanning.
A good understanding of existing practices will not only present the opportunity to integrate
digitisation at the most appropriate stage, but will also provide a point of reference to
measure the performance of digitisation. The introduction of digitisation may also provide
the impetus to streamline business processes around digitised records.
3.4 How will the digitised records be managed?
As described later in section 4: Components of a Digitisation Program, a system to
manage the digitised records is arguably the most important component of the digitisation
system. It is essential that a system is in place that enables access by appropriate
authorised personnel, allows digitised records to be easily found, includes measures to
preserve the authenticity of the records, and provides information about the record and its
context. This descriptive information, known as metadata, is discussed further in section
7: Metadata.
Time related factors encountered in records management, such as record retention
periods and destruction dates, also apply to digitised records. Additionally, the
obsolescence of technology and the deterioration of storage media are time related factors
that are introduced by digitisation and should be addressed in management plans.

3.5 Authorised disposal of paper originals


Agencies should also consider whether they have authorisation, or intend seeking
authorisation, for the early disposal of original paper records after digitisation. If so,
digitisation procedures and processes should be developed in accordance with the
Digitisation Disposal Policy. For more information, see section 5: Authorisation for early
disposal.
In addition, as authorisation is only possible for certain classes of records, workflows will
need to be reviewed to ensure that the appropriate disposal classes for records are
allocated at the point of digitisation, and that strategies are in place to identify those
records eligible for early disposal.
3.6 How will the digitised records be used?
Digital objects, such as scanned records, are easier to distribute than their analogue
equivalents. This has the potential to provide larger audiences with increased access to
digitised records than by physical access to paper records. If digitised records are to be
made publicly available, there may be a need to investigate any security, copyright, or
intellectual property issues that this raises.
Consideration should also be given to the mode of use of the digitised records. For
example, there are different image quality requirements for on screen viewing of images
than for printed images. Images which are to be accessed on a low bandwidth connection
may include less information and be created at a lower quality than those viewed directly
from a CD-ROM or local area network.
3.7 Who is responsible for the digitisation project?
Like any large project, there should be clear ownership of the digitisation program by an
individual or work unit within an organisation. If considerable resources are allocated to
establishing an ongoing digitisation program, it makes sense for the program to run for a
considerable period. For this to occur, sufficient funding should be secured not only for the
implementation of digitisation, but also for the ongoing maintenance, routine costs, and
system upgrades and replacement.
3.8 Will digitisation be outsourced?
Public authorities should carefully consider the pros and cons of either outsourcing
digitisation projects or conducting them in-house. The types of records being digitised, the
digitisation requirements of the public authority and the geographic location may all have
an effect on the suitability of outsourcing and the availability of external parties to carry out
the work.
There are several benefits of carrying out digitisation within the organisation including:
• Control over the entire imaging process, how the digital copies of the records are
arranged and stored, and the handling and storage of the original paper records,
• Security of the records is controlled by restricting the access to known staff,
• Experience in project management and digital imaging and exposure to technology
and techniques gained which may be transferable to other projects, and
• Flexibility to alter project requirements and digitisation parameters as the project
develops, rather than having them locked in a contract.
However, there are drawbacks of this approach that need to be considered including:
• Large investment of financial, IT and human resources, both initially and throughout
the project lifecycle,
• Time needed to implement a digitisation process and associated technical
infrastructure, with initial production levels and efficiency typically limited,
• Staffing expertise not always available, and any training investment will be wasted if
staff leave, and
• Responsibility for network downtime, equipment failure and obsolescence, training of
staff, and adherence to standards and best practices.
In cases where digitisation processes are outsourced, external parties should be made
aware of relevant standards. These should include this guideline, IS40 and IS41, and any
standards or practices that may apply to individual public authorities. Outsourcing
arrangements made as part of a digitisation project should also comply with any existing
contracting or purchasing policies that may be in place.
The benefits of engaging an external party for digitisation include:
• Costs are more predictable as the digitisation of the paper record is what is paid for,
usually as a cost per page, not equipment or staffing,
• High production levels and fast completion as equipment and staff are tested and
already in place,
• Expertise and experience of the specialist can be drawn upon,
• Risk is lower, as the vendor accepts the costs of technology obsolescence, failure,
downtime, staff changes, etc, and
• Economies of scale can be realised as specialist scanning bureaux will usually be
carrying out digitisation on behalf of a number of clients. Unless a public authority has
a huge volume of records to digitise over a long period, outsourcing will usually be
more cost effective.
These benefits are offset by some negative aspects of outsourcing including:
• Limited control over how the digitisation is carried out,
• Complex contractual process: determining the specifications for the digitised
records, research, negotiation, and communication with the vendor will all take
resources, so outsourcing cannot be seen as a totally “hands off” approach,
• Knowledge gap between the vendor and the client may cause delays and confusion.
The vendor will have experience and skills in scanning technology and related
practices, but will not know the business of the client,
• Risk of the vendor going out of business or altering their practices,
• Quality control duplication as it is required by both parties,
• Transportation and handling of the original records introduces avenues of possible
loss or damage to the records, and
• Security and privacy issues of the vendor’s staff having access to records which may
be private or confidential6.
The outsourcing of digitisation should not be thought of as an easy solution that simply
requires financial resources. Establishing the requirements and specifications for
digitisation, progressing through to a tender program or purchase decision, and then
liaising with the vendor and monitoring their work will require time and staff allocation.

6
Adapted from Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed
March 2005 at http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf

4: Components of a Digitisation Program


Given the wide variation in size of public authorities, there is not a universal specification
for a paper records digitisation solution. However, the components described below,
implemented at various scales, should form part of any digitisation program.
4.1 Computer Hardware
Scanners or other digital imaging devices
The physical process of converting a paper document into a digital image requires a
scanner or other digital imaging device to capture the image of the document (collectively
referred to as “scanners” for the remainder of this document), a computer to control the
scanner and to provide the processing required for the conversion, as well as computer file
storage of some form to keep the resulting image. Typically, a single computer is required
for each scanner operator.
Some common scanner types are briefly described in appendix 2: Scanner types.
Computers
Computers are an integral part of the digitisation process, required for input, management,
storage, and distribution. The scanner will usually need to be connected to a computer,
and often this computer will be used for setting the parameters of the scanning and
carrying out initial quality control. Computers will also be required for the entry of
metadata for scanned documents, often referred to as profiling. Computers used for these
interactive tasks which require the operator to continuously visually check the screen
should be equipped with a quality graphics adapter and a large monitor that allows the full
page to be viewed.
Computers will also be required to carry out less interactive tasks including file storage,
indexing, copying and backup. These roles may best be carried out using file servers or
other computers that do not necessarily require the intervention of an operator.
When choosing computers to use for digitisation, emphasis should be placed on selecting
equipment that is capable of efficiently handling the demands of digitisation, while still
being compatible with the public authority’s existing computer equipment, support
arrangements and standards. An organisation’s ICT specialists should be consulted to
check this prior to the purchase of computer equipment.
In a small digitisation implementation, all of the tasks described here may be performed by
a single computer, but as the implementation grows, or for large scale digitisation, the
tasks can be separated out to various computers on the network. Scanning and other
image related tasks are typically demanding of computer memory and processing, but
most current model PCs would be suitable for digitising. To avoid interruptions to the
workflow of the digitisation process, it is recommended that even in small scale
implementations, a computer be dedicated solely to the digitisation process.
Computer storage, backup devices, and storage media
Scanned records, their descriptions and other associated information need to be stored in
a manner which promotes access, security, and longevity. For small digitisation programs,
storage on the scanning computer’s local hard drive and back-up to a writeable CD or DVD
drive may fulfil this purpose. For larger implementations, file servers may be used for the
storage of digitised records, with tape drives used for back-up.
So that the integrity of the digitised records can be guaranteed, a mechanism should be in
place to prevent the unauthorised deletion or modification of stored information. The
write-once nature of some optical media is an inexpensive means of doing this. However, as the
amount of digitised material grows, the slow access times and handling required to use
removable media may not be suitable. In these situations, software security solutions,
such as those included in many eDRMS, should be implemented to provide a similar
assurance of integrity for files stored on line using hard drives and file servers.
Section 8: Storage and media options, covers these and other related issues in detail.
4.2 Computer Software
A system for describing and managing the digitised records
A system to manage the digitised records is arguably the most crucial component of a
digitisation program. The successful implementation of this software should ideally be
completed prior to the commencement of scanning of paper records and the acquisition of
such a system should be a high priority task. There is little use in commencing the
scanning of paper records if there is no established method of recording what has been
scanned or storing descriptions of those scanned records.
For the management of digital images as records, systems that are designed to fulfil
records management needs should ideally be used. The effort and resources required to
adapt a system not designed for the management of digital records should be analysed
and a business decision made as to whether the introduction of a new system with the
required capabilities would be a more effective use of resources.
If new systems, such as eDRMS, are to be introduced into an organisation to manage
digitised paper records, the system should comply with the guidance provided in this
document in such areas as metadata and file formats, and integrate into the organisation’s
existing ICT infrastructure. Resources will need to be allocated to training of staff and
technical support of any systems that are introduced.
Existing systems employed by public authorities to manage paper records may also have
the capability to link to digital images of the records they describe. If this is not a
feature of the available records management software, provided that the software is able
to provide descriptions of individual items within a file, a file path to the digital images and
other information could be recorded as comments.
As an entry level solution, or as a temporary measure pending the introduction of a
specialised system, a record of what has been scanned, where the digital copies can be
found, and relevant details of the scanning process can be stored in a spreadsheet or
small database.
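As a rough illustration of such an entry level register, the sketch below writes and reads a simple CSV file of the kind a spreadsheet could open. The column names, the record identifier and the file path are hypothetical examples chosen for illustration; they are not fields prescribed by this guideline.

```python
import csv
import io

# Illustrative column names only (an assumption, not a mandated metadata set).
FIELDS = ["record_id", "title", "image_path", "scan_date", "resolution_dpi", "operator"]


def write_register(rows, stream):
    """Write register rows (dicts keyed by FIELDS) as CSV to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def find_image(register_text, record_id):
    """Return the image path recorded for record_id, or None if it is not listed."""
    for row in csv.DictReader(io.StringIO(register_text)):
        if row["record_id"] == record_id:
            return row["image_path"]
    return None


# Hypothetical entry showing what has been scanned and where the copy is kept.
buffer = io.StringIO()
write_register(
    [
        {
            "record_id": "F2006/0001-003",
            "title": "Licence application",
            "image_path": "scans/F2006-0001/003.tif",
            "scan_date": "2006-04-03",
            "resolution_dpi": 300,
            "operator": "jsmith",
        }
    ],
    buffer,
)
register_text = buffer.getvalue()
```

A register like this records only what was scanned and where it is stored. It provides none of the access control or audit features of a recordkeeping system, which is why it suits only an interim role.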
There are also many free or low cost image cataloguing applications designed principally
for management of digital photographs that could be used for managing digital copies of
paper records. The features of such systems should be closely examined prior to
implementation, even if only as a temporary or interim measure. A large effort may be
required to adapt non-recordkeeping systems to document scanned records, and much of
this will need to be repeated when a more appropriate longer term solution is employed.
Imaging software for capture and manipulation
Scanners are bundled with software (known as a driver) that is required for the controlling
computer and the scanner to communicate. Additional software is typically included with
the scanner which allows scanning, calibration and some post scanning image processing
operations to be performed. This software will usually be thoroughly tested by the scanner
manufacturer to work optimally with their hardware, with features appropriate to the type of
scanner purchased. For example, software bundled with high speed sheet fed scanners
would likely include features which would allow a choice to be made between single sided
and duplex scanning, and recognition of barcodes, while the software that comes with slide
scanners would typically include features for magnifying the originals and reversing the
colours of negatives.
If the bundled software does not fulfil a public authority’s scanning needs it may need to be
supplemented or replaced by software which is purchased separately. Additional image
processing software may also be used for such tasks as the conversion of file formats,
deriving of related files, and modification or enhancement of images. Some degree of
interoperability usually exists between most image processing and scanning software so
that images can be acquired directly into the image processing software. To enable full
text searching of scanned paper documents, optical character recognition (OCR) software
would also be required. This is described and discussed further in section 6.6: Master files
and derivatives.
Security and access control
Just as the access to many paper records is monitored and typically restricted to
authorised staff, access to digitised copies should also be controlled in a secure digital
environment. As it would be a relatively straightforward exercise for a skilled operator
using free or inexpensive image processing software to change the appearance of
digitised records, security measures should be in place to prevent this type of
unauthorised tampering.
As described above and detailed in section 8: Storage and media options, some computer
storage scenarios, such as the use of write-once optical media, inherently prevent
modification of the image files once they have been stored. However, if other types of
computer storage are used, additional security provided by software with such features as
encryption, access control, and auditing should be employed.
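One complementary technique, which this guideline does not itself prescribe, is to record a checksum for each image when it is stored and to recompute it later. The sketch below assumes SHA-256 and uses hypothetical function names. It can only detect that a file has changed; it does not prevent modification, so it supplements rather than replaces access control and auditing.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, recorded_digest):
    """Return True if the file still matches the digest recorded at capture."""
    return file_sha256(path) == recorded_digest
```

In use, the digest returned by `file_sha256` would be stored alongside the image's other metadata at the time of capture, and `verify_image` run periodically or before the image is relied upon.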
For large digitisation programs that may be widely distributed throughout an organisation,
and where several staff need to add to and modify the collection of digitised records, a system
such as an eDRMS may be used to manage access to the information and to provide an
audit of system access and modification. In small scale digitising implementations,
security and access control may be provided through the use of a password protected
system by a single operator, with other authorised staff given read-only access. This could
be accomplished using the built-in security features of most current computer operating
systems.
4.3 Procedures and Standards
It is important to fully document decisions made about the digitisation process, including
technical, procedural and quality considerations. This is particularly important when
seeking authorisation for early disposal. The Digitisation Disposal Policy provides
information on the particular procedures required in this situation.
A method to identify which records are to be digitised
It is unlikely that all paper records within a public authority will be digitised. Identifying
those that are to be digitised prior to the implementation of digitisation may assist in setting
the requirements for a digitisation program and also determine the parameters for
subsequent digitisation. Internal policies and procedures should be developed and
relevant staff made aware of the criteria for deciding what paper records will be digitised.
A workflow for the digitisation process
To help ensure that all of the relevant steps in the process of creating an accessible digital
copy of a paper record are carried out, a workflow should be developed and relevant staff
provided with instruction on the process. The workflow may be adapted from the
organisation’s existing records management practices or implemented as a new process.
The development of a workflow prior to implementation may assist in the estimation of
resources required for the digitisation process.
Standards
There are many international (ISO), Australian and industry standards that may be applied
to the digitisation process. These are listed at appendix 4: Related standards.
When a public authority is considering implementing a digitisation project, existing
standards should be adopted and adhered to where possible, rather than establishing in-
house standards. This will promote consistency, increase the relevancy of research and
advice from other organisations which have implemented digitisation to the same
standards, and prepare the organisation for any potential collaboration. The adoption of
published standards will also provide vendors and potential outsourcing partners with an
unambiguous specification of what is required and will also provide a means of measuring
performance.
For some aspects of a public authority’s digitisation program, a recognised standard may
not be available. For these aspects, an internal standard or service level should be
established with its details made available to appropriate staff. As with the adoption of an
external standard, this will not only assist in setting clear targets for digitisation, but will
also provide a measure of performance.
A means of determining if the records were digitised correctly
A crucial part of the digitisation workflow is the verification that the images of the paper
records have been captured effectively to the required standard. Public authorities should
implement a checklist and establish which stages of the digitisation process are evaluated
and how often or what proportion of records are checked. Effective checking and control
will help avoid time consuming redundant activities such as the re-scanning of records that
have not been scanned correctly or retrospectively adding metadata for documents that
were not correctly profiled at the first attempt. This issue is covered in depth in section
6.5: Quality control, which includes a sample checklist.
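Where only a proportion of records is checked, the selection can be made at random so that it is not biased towards particular operators or batches. The sketch below is one possible approach, not a procedure required by this guideline. The 10 per cent rate, the batch naming and the function name are assumptions for illustration only.

```python
import random


def select_qc_sample(image_ids, percent, seed=None):
    """Pick a random subset of scanned images for manual quality checking.

    percent is a whole number from 1 to 100. At least one image is always
    selected from a non-empty batch, and the sample size is rounded up.
    """
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    image_ids = list(image_ids)
    if not image_ids:
        return []
    sample_size = max(1, (len(image_ids) * percent + 99) // 100)
    return sorted(random.Random(seed).sample(image_ids, sample_size))


# Hypothetical batch of 200 page images, with 10 per cent drawn for checking.
batch = [f"batch-017/page-{n:04d}.tif" for n in range(1, 201)]
to_check = select_qc_sample(batch, percent=10, seed=1)
```

Passing a fixed `seed` makes the selection repeatable, which may help if the sample needs to be documented as part of the quality control record.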
Documentation on system maintenance
The performance of routine maintenance on the systems being used for digitisation is
crucial to the success of the program and should be documented and linked to the
digitisation workflow. This documentation should include a description of the system
backup strategy, contact details of system administrators and system technicians and
procedures for calibrating equipment. It is recommended that a disaster recovery plan
also be in place to provide for preservation of digital records and alternative access paths
to them in the event of unexpected failure of the primary systems.
Research and case studies of a variety of computer based systems suggest that
approximately one fifth of the resources required to implement a system should be
allocated annually to maintenance, upgrades and training7. Allowance for this should be
made as part of the planning for digitisation.
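The one fifth rule of thumb can be turned into a simple planning calculation. The sketch below is illustrative only: the implementation cost and the five year horizon are made-up figures, and the function names are hypothetical.

```python
from decimal import Decimal

# "Approximately one fifth" of the implementation cost, allocated each year.
ANNUAL_SUPPORT_SHARE = Decimal("0.20")


def annual_support_allowance(implementation_cost):
    """Yearly allowance for maintenance, upgrades and training."""
    cost = Decimal(str(implementation_cost))
    return (cost * ANNUAL_SUPPORT_SHARE).quantize(Decimal("0.01"))


def total_cost(implementation_cost, years):
    """Implementation cost plus the yearly allowance over a number of years."""
    cost = Decimal(str(implementation_cost))
    return cost + annual_support_allowance(implementation_cost) * years


# Hypothetical figures: a 150,000 implementation supported for five years.
yearly = annual_support_allowance(150000)
five_year_total = total_cost(150000, 5)
```

On these assumed figures, the ongoing allowance over five years equals the original implementation cost, which illustrates why funding for maintenance needs to be secured at the outset.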
4.4 Staff
Project management
Staff with business analysis and project management skills will be required to determine
the need for digitisation. They should also examine the workflow of current and new
processes to ensure that benefits of digitising are realised with minimal interruption to
business and effective communication and change management. Staff with these skills
may also manage the financial resources, negotiate with equipment and service suppliers
and prepare for the continued support, maintenance, and lifecycle management.
Technical experts
Digitisation involves the integration of computer hardware, imaging equipment and various
software packages to produce a managed collection of digitised records. Staff with
technical skills will need to investigate the various hardware and software options, and
work to bring the aims of the project to reality within time and budget constraints. These
staff may need to liaise with vendors, test different combinations and configurations of
equipment, and be responsible for acquisition, support, integration and maintenance of
equipment.
If an organisation’s IT help desk is expected to provide ongoing support to the digitisation
program, they should be made aware of any non-standard configurations required for
computers and contact details for the vendors of digitisation specific equipment.
Records management
Recordkeeping best practice applies to all records, whether their format is digital or
paper. Recordkeeping controls and processes such as registration, classification or
profiling, and appraisal and disposal will have to be applied to digitised records. Records
managers should be involved in the development of a digitisation program to ensure these
matters are addressed. On a routine basis once digitisation is underway, records
management staff will be involved in profiling and sentencing digitised images and also in
the retrieval and storage of paper records which are being digitised.
Records management staff without previous experience in managing digitised or other
technology dependent records should consult contemporary information management
research and guidelines to examine how digitisation will affect the management of records
within their organisation.
Equipment / computer operators
Personnel will be required to obtain the source paper records, operate the scanners, carry
out quality checks on scanned records and add metadata / profile information for the
digitised material. These staff should have a clear understanding of their task and a
workflow should be developed so that digitisation is regular and routine and meets
appropriate standards.

7
Revised Digital Imaging Guidelines for State of Ohio Executive Agencies and Local Governments. 2003. Ohio Electronic Records
Committee. Accessed March 2004 at http://www.ohiojunction.net/erc/imagingrevision/revisedimaging2003.html

5: Authorisation for Early Disposal


5.1 Introduction
While public authorities may digitise records for a variety of reasons, some may also wish
to dispose of the originals after digitisation to save on storage and processing costs.
Disposing of the originals after digitisation and before the authorised retention period for
that class of record has expired is referred to as ‘early destruction’. Authorisation from the
State Archivist is required for the early destruction of records after digitisation and the
Digitisation Disposal Policy sets out what records are eligible for authorisation and under
what conditions.
This section of the digitisation guidelines complements the policy statement by providing
advice on seeking authorisation for early destruction, including assessing whether the
records meet the policy criteria. Please ensure you are familiar with the Digitisation
Disposal Policy before reading this section.
5.2 Overview of process
As required by the policy, each public authority will need to seek authorisation from the
State Archivist. This is done in writing, proposing particular classes of public records, as
identified in an approved retention and disposal schedule, for early destruction. Depending
on the scope of the digitisation process, this may be a few record classes or many.
The public authority has initial and ongoing responsibility for assessing whether the
classes of records meet the criteria proposed in the policy, and for certifying that
appropriate systems and procedures are in place for generating and managing the digital
image so that it is an accurate, reliable and authentic copy of the original.
Queensland State Archives is responsible for assessing applications in accordance with
the policy and providing authorisation in accordance with the Public Records Act 2002.
5.3 Determining whether records are eligible
In accordance with the parameters of the policy statement, public authorities should
assess whether the classes being proposed for early destruction meet all of the following
requirements:
• have a total retention period of ten years or less, in accordance with an approved
retention and disposal schedule
• are not subject to format specific retention requirements, and
• have a low risk of the original records being required for legal proceedings or similar
purposes.
Record classes being proposed for early destruction must meet all three requirements.
Retention period restriction
The total length of time a record is retained for comprises two components:
• the amount of time from creation of the record, or the file it belongs to, to the disposal
trigger,8 and
• the amount of time which the record has to be retained after a disposal trigger.
It is important to note that the ten-year retention period restriction on authorisation for early
disposal applies to this total period, not simply the number of years after the disposal
trigger occurs.
A retention and disposal schedule clearly indicates the disposal trigger and the period of
time a record needs to be retained after the trigger. However, to determine the total
retention period for the purpose of authorisation under the Digitisation Disposal Policy, it
will also be necessary to determine the average period that elapses before the disposal
trigger occurs.
Determining the total retention period may involve discussions with the relevant business
areas to determine how long records remain active before the disposal trigger is activated.
As many disposal classes use ‘after last action’ as a disposal trigger, it would be
necessary to determine for how long a file is usually active. For example:
• if files relating to business planning are routinely closed at the end of the financial year
in accordance with the planning cycle, and
• the retention period is five years after last action,
• then these records would be eligible for authorisation, as the total retention period
would be six years.
In contrast, if a record is usually active for three years, and the retention period is ten
years after last action, then this record would not be eligible for early destruction.

For example:
• An application was made for a licence, and approved.
• The disposal action for approved licence records is ‘retain for five years after
expiration of licence’.
• Licences are valid for two years.
The total retention period would be approximately seven years (assuming that it is
only a short time from receipt of application to approval of licence) and this class of
records would therefore be eligible for authorisation.

Queensland State Archives acknowledges that determining the total retention period for
some record classes will rest on the ‘balance of probabilities’, as particular records within a
record class may be retained for longer. In addition, some records change class during
their life span. For example, if some records within a class are subject to a Freedom of
Information request, a different record class also applies and their retention period is
potentially extended.
Please note: if it is difficult to apply a record class with any certainty to some types of
records at creation or early in the life of the record, then these records are not eligible for
early destruction.

8
A ‘disposal trigger’ is the event or action, specified in a Retention and Disposal Schedule, from which the
disposal date is calculated. Common disposal triggers include ‘after last action’, ‘after contract / agreement
expires’ or ‘after end of financial year’. A disposal trigger may occur many years after the creation of a
record; for example, a trigger ‘after sale of building’ may occur more than 50 years after the creation of a
building purchase record.

Queensland State Archives’ Appraisal Archivists can assist in determining the total
retention period and will review these determinations in assessing any application for
authorisation.
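The worked examples in this section all follow the same arithmetic: the typical time from creation to the disposal trigger is added to the retention period after the trigger, and the total is compared with the ten-year limit. The sketch below illustrates that calculation only. The function names are hypothetical, and passing this check does not make a class eligible, because the format and risk requirements must also be met.

```python
# Limit taken from the policy criterion: "a total retention period of ten years or less".
MAX_TOTAL_RETENTION_YEARS = 10


def total_retention_years(typical_years_to_trigger, years_after_trigger):
    """Total retention: time until the disposal trigger plus time kept after it."""
    return typical_years_to_trigger + years_after_trigger


def meets_retention_criterion(typical_years_to_trigger, years_after_trigger):
    """Check only the retention period requirement, not the other two criteria."""
    total = total_retention_years(typical_years_to_trigger, years_after_trigger)
    return total <= MAX_TOTAL_RETENTION_YEARS


# Business planning files: closed after one year, kept five years after last action.
planning_total = total_retention_years(1, 5)
# Records active for three years, kept ten years after last action.
long_lived_ok = meets_retention_criterion(3, 10)
# Licence records: valid for two years, kept five years after expiry.
licence_total = total_retention_years(2, 5)
```

The first input is an estimate of typical practice, so the result carries the same ‘balance of probabilities’ uncertainty described above.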
Risk assessment
Once it has been determined that the class of records meets the retention criteria, a risk
assessment should be undertaken which examines the likelihood that the records in each
class proposed for early destruction will be required for legal proceedings and challenged
in court. In undertaking this assessment, a review of the agency’s litigation history and
consultation with legal staff are essential.

For example:
• An agency may process 1000 claims files a year.
• Of these, 2% are subject to dispute and of these only 10% proceed to full litigation.
• In the agency’s experience of litigation, the validity of the records or parts of them
has not been challenged.
Therefore, there may be a low risk of the original records being required and the Chief
Executive may be happy to propose this record class for authorisation for early
destruction.
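The figures in this example can be worked through to show how small the affected proportion is. The sketch below simply reproduces that arithmetic using exact fractions. The function name is hypothetical, and the result is one input to a risk assessment, not a substitute for one.

```python
from fractions import Fraction


def expected_litigated_files(files_per_year, dispute_rate, litigation_rate):
    """Files per year expected to reach full litigation."""
    return files_per_year * dispute_rate * litigation_rate


# 1000 claims files a year, 2% disputed, 10% of disputes fully litigated.
disputed = 1000 * Fraction(2, 100)
litigated = expected_litigated_files(1000, Fraction(2, 100), Fraction(10, 100))
share_of_all_files = litigated / 1000
```

On these figures about 20 files a year are disputed and about 2 proceed to full litigation, which is 0.2% of the files processed.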

In addition to assessing the risk of the original record being required in legal proceedings,
there may be other factors to consider, such as whether it is a vital record or whether there
are any business needs for the original format. In many cases these other risks may be
appropriately treated and minimised through good digitisation procedures and appropriate
management of the image. Table 1 (below) includes examples of some risks and potential
actions to minimise risk.
Risk: Electronic copy not legible
Mitigation / Treatment:
• Develop and implement quality assurance procedures.
• Procedure for retention of original and inclusion of metadata / explanatory notes if poor
quality original means that a legible image cannot be generated.

Risk: Loss of hand-written annotations on originals
Mitigation / Treatment:
• Quality assurance procedures.
• Raise awareness of staff that if they make extensive annotations on a print-out of an
image, the annotated copy should be rescanned as a new record.

Risk: Electronic copy cannot be found / lost
Mitigation / Treatment:
• Capture and management of image in recordkeeping system.
• Regular backups and other system maintenance.

Risk: Unauthorised access to record
Mitigation / Treatment:
• Use of access controls in recordkeeping system.

Risk: Lack of capability to retrieve digitised image over time
Mitigation / Treatment:
• Minimised through retention period restriction in policy.
• Development of agency migration strategy.

Risk: Unauthorised manipulation of image
Mitigation / Treatment:
• Only authorised staff undertake scanning.
• Develop and implement procedures specifying what manipulation is acceptable.
• Security of image through recordkeeping system.

Risk: Loss of vital records in a disaster
Mitigation / Treatment:
• Off-site backup.

Table 1: Identifying and treating risk

Formal risk assessment processes should be used to identify risks and plan mitigation
strategies. These may be either agency-endorsed procedures or AS 4360: Risk
Management. Appendix 11 of the DIRKS (Designing and Implementing Recordkeeping
Systems) manual provides advice on how the AS 4360 risk management process can be
adapted to recordkeeping risks.9
Format-specific retention requirements
Records which are subject to format-specific retention requirements which are not
overridden by the Electronic Transactions Act 2001 are not eligible for early destruction
after digitisation. Format-specific requirements relate to the need to retain a record in its
original, paper form and commonly relate to witnessed or signed documents.
Many format-specific requirements in legislation were overridden by the Electronic
Transactions Act 2001. However some requirements were specifically excluded from the
coverage of the Act and these exclusions are noted in Schedule 1 of the Act. Other
requirements may be found in regulations or standards, for example the Financial
Management Standard 1997 requires financial information to be kept in its original form for
one year after the date of the audit report for the financial year.
Other format-specific requirements may require electronic forms of documents to be
retained on a particular storage device. If this is the case, the systems and procedures for
digitisation (see next section) should comply with these requirements.
Consultation with legal staff should be undertaken to determine the extent of format-
specific requirements affecting the records of an agency.
5.4 Ensuring appropriate systems and procedures
Section 2.3 of the policy statement specifies a range of conditions public authorities must
meet before authorisation can be granted. Table 2 (below) provides references to advice
on meeting these requirements. Many of these requirements are discussed further in these
guidelines.

9
National Archives of Australia (2003) The DIRKS Manual: A Strategic Approach to Managing Business Information. Available online:
http://www.naa.gov.au/recordkeeping/dirks/dirksman/dirks.html.

Queensland State Archives: Guideline for the Digitisation of Paper Records

Requirement (with Further Advice / Information listed beneath each)

Policies and procedures covering:
• Roles and responsibilities for the selection, digitisation and management of digitised records and secure and documented destruction of originals
  Further advice: Section 3.2: Which records will be digitised? Section 4.3: Procedures and standards. Public authorities may also need to review their procedures for registering and sentencing records to ensure that disposal decisions can be made at the point of digitisation.
• Technical specifications for digitisation
  Further advice: Section 6: Technical considerations. While these guidelines provide recommendations, agencies are responsible for determining specific requirements for the type of records they intend to digitise.
• Capture of technical imaging metadata
  Further advice: Section 7: Metadata
• Quality assurance procedures
  Further advice: Section 6: Technical considerations. Quality assurance procedures must cover the calibration of equipment as well as the examination of images captured. Public authorities should also determine an appropriate ‘cooling off’ period after digitisation and before the destruction of the originals to allow for all quality assurance processes to take place (minimum 1 month).

System requirements
  Further advice: See Information Standard 40: Recordkeeping for general advice on the management of records.
• Designed with adequate physical and other security safeguards to ensure the records remain inviolate and can only be changed in an authorised manner.
  Further advice: For an explanation of recordkeeping systems, see “Characteristics and functionality of recordkeeping systems” in State Records NSW DIRKS Manual. For specific information on ensuring the security of systems, see Information Standard 18: Information Security and Best Practice Supplement.
• Appropriate metadata is retained, including:
  o Mandatory recordkeeping metadata elements are captured and maintained in accordance with the Recordkeeping Metadata Standard for Commonwealth Agencies, including the allocation of retention and disposal actions.
    Further advice: National Archives of Australia, Recordkeeping Metadata Standard for Commonwealth Agencies. Information Standard 31: Retention and Disposal of Public Records principle 2 specifies what retention and disposal information must be captured.
  o Captures appropriate audit trails and maintains these as recordkeeping metadata
    Further advice: See elements “Management History” and “Use History” in National Archives of Australia, Recordkeeping Metadata Standard for Commonwealth Agencies.
  o Captures technical imaging metadata at the point of digitisation
    Further advice: Section 7: Metadata
• Is covered by business continuity and disaster recovery plans
  Further advice: See Principle 9 in Information Standard 18: Information Security and Best Practice Supplement.
• Has a migration strategy to ensure that public records are not placed at risk of loss through technological obsolescence.
  Further advice: See Section 10: “Preserving Digital Records for the Long Term” in National Archives of Australia Digital Recordkeeping Guidelines.
• That there is no regulation requiring the electronic form of the record to be kept on a particular kind of data storage device as referred to in section 20(2)(c) of the Electronic Transactions Act 2001 OR that the regulation has been complied with.
  Further advice: For advice on identifying recordkeeping requirements, see Step C in State Records NSW DIRKS Manual. Section 8: Storage and media options.
Table 2: Sources of advice on meeting requirements

5.5 Obtaining authorisation


The Chief Executive Officer of a public authority may write to the State Archivist to seek
authorisation once the public authority has assessed that the classes of records
proposed for authorisation meet the criteria, and that appropriate systems and procedures
are in place for generating and managing the digital images. This written request should
include form 1 (a signed declaration of compliance with the policy), form 2 (a list of the
record classes, as identified in an approved retention and disposal schedule, proposed for
early disposal), and the name of an appropriate contact officer within the public authority.
The forms are in appendix 1 of the Digitisation Disposal Policy.
Following receipt of this request, Queensland State Archives’ Appraisal Archivists will
assess the request against the policy. This may involve requesting copies of the risk
assessment or determinations of total retention period for some classes where there may
be doubts regarding their eligibility for authorisation.
The Appraisal Archivists will then liaise with the contact officer to:
• develop and insert the required classes in an existing public-authority specific retention
and disposal schedule, or
• develop a new schedule if the permissions relate to a comprehensive sector-specific
schedule.
If necessary, the authorisation can be limited to specific business units rather than
applying to the whole public authority. Table 3 shows examples of disposal classes
providing authorisation for the early destruction of public records.


Where the records are covered by an agency-specific retention and disposal schedule:

Reference: Class number
Description of records: Original licence records which have been digitised
Status: Temporary
Disposal Action: Retain for 2 months after digitisation and completion of quality checks

Reference: Class number
Description of records: Digitised images of licence records
Status: Temporary
Disposal Action: Retain for 5 years after last action

Where the records are covered by General Retention or Disposal Schedule or sector-specific schedule:

Reference: Class number
Description of records: Original records sentenced under classes [insert class numbers and summary description] of the [name, number and version of schedule], where full and accurate digitised images are retained for the authorised retention period.
Status: Temporary
Disposal Action: Retain for 2 months after digitisation and completion of quality checks

Table 3: Examples of disposal classes

5.6 Monitoring and review


Public authorities are responsible for monitoring their recordkeeping practices to ensure
requirements are met. As part of this responsibility, public authorities should ensure that
digitisation policies and practices continue to meet the requirements of the Digitisation
Disposal Policy. In particular, any proposed change to digitisation procedures or
recordkeeping practices should be assessed to ensure the terms and conditions of the
authorisation continue to be met.
As with other disposal authorisations, it is recommended that digitisation disposal
authorisations be reviewed every five years, or sooner if one of the following occurs:
• Machinery of Government (MoG) change.
• Plan to extend or reduce the range of records proposed for early destruction.
• Any change affecting the risk assessment undertaken to gain authorisation (for
example, increase in litigation affecting a particular class of records).


6: Technical considerations
Prior to implementing a digitisation program, there should be a high level of understanding
of the technical aspects of scanning within the organisation. Whether an organisation
outsources its digitisation or performs the work in house, familiarity with key technical
aspects of digitising will assist relevant staff to gain an understanding of the process.
Most software and hardware that will be used in a digitisation program will provide a range
of variable parameters such as image resolution and output file format, and informed
choices need to be made on each of these. Establishing appropriate technical standards
for digitisation before implementation will promote consistency and accountability.
As detailed in section 3: Issues to consider before commencing digitisation, there are a
number of business considerations that need to be assessed by your organisation prior to
implementing a digitisation program and also when deciding which records will be
digitised. The key technical considerations of:
• resolution,
• bit depth,
• compression, and
• file format
are described in detail in the following sections with recommendations provided where
warranted. A summary table of recommendations is provided in appendix 3: Table of
technical recommendations. Also included in this chapter is a discussion of the quality
control procedures that can be put into place to check that the image files created meet
the specified standards.
6.1 Resolution
Picture elements, or pixels, can be considered the building blocks of all digital images.
They are square cells of a single colour or shade that, when arranged in a regular grid
pattern, form the digital image. The resolution of a digital image is the density of pixels
that make up the image. Pixels per inch (PPI) is used to describe image resolution.
Figure 1 shows a piece of text scanned at various resolutions.

Figure 1: 100PPI, 200PPI and 300PPI examples showing the effect of resolution on image clarity

For example, the image produced by scanning an A4 (8.27” x 11.69”) page at 100 PPI
would have 827 pixels in the horizontal by 1169 pixels in the vertical direction, or a total of
966,763 pixels. If the same A4 image were scanned at a resolution of 300 PPI, it would be
made up of 2481 x 3507 pixels or a total of 8,700,867 pixels (8.7 megapixels). Similarly, a
4” x 6” photograph digitised at 300 PPI would result in a 1200 x 1800 pixel image with a
total of 2.16 megapixels. As seen in these examples, the pixel density combined with the
dimensions of the source material provides an accurate assessment of the total number of
pixels that will make up the resultant image.
Occasionally, an image will be described by using its pixel dimensions rather than pixel
density. For example, images intended only for viewing on a computer screen may be described
as “800 x 600 pixels”, “1024 x 768 pixels”, etc.
By determining the source material dimensions in inches and using the provided horizontal and
vertical pixel totals, the pixel density of the image can be discovered. For example, a
1024 x 768 image displayed full screen on a 17” monitor (viewing size 13” x 10”) has a
resolution of approximately 80 PPI.

What about dots per inch (DPI)?
DPI is a measure of printing resolution, in particular the number of individual dots of ink
a printer or toner can produce within a linear one-inch space.
Due to the similarity with other measurements of graphical resolution, the DPI measurement
is frequently misused, for instance, to specify a scanner's sampling resolution or the number
of pixels per inch in a computer display.
Using DPI measurement in these cases is generally considered to be inaccurate and
misleading, though the intended meaning is usually clear based on context. In these
cases, a measure given in DPI can be taken as the number of pixels per inch.
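
The pixel arithmetic used in this section can be sketched in a few lines of Python. This is only an illustration of the calculations described above; the function names are hypothetical and are not part of any scanning software.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Return (horizontal, vertical) pixel counts for a scan at the given PPI."""
    return round(width_in * ppi), round(height_in * ppi)


def total_pixels(width_in, height_in, ppi):
    """Return the total number of pixels in the scanned image."""
    horizontal, vertical = pixel_dimensions(width_in, height_in, ppi)
    return horizontal * vertical


def pixel_density(pixels_wide, pixels_high, width_in, height_in):
    """Return (horizontal, vertical) PPI of an image shown at a given size."""
    return pixels_wide / width_in, pixels_high / height_in


# A4 page (8.27" x 11.69") scanned at 100 PPI and at 300 PPI
print(pixel_dimensions(8.27, 11.69, 100), total_pixels(8.27, 11.69, 100))
print(pixel_dimensions(8.27, 11.69, 300), total_pixels(8.27, 11.69, 300))

# 1024 x 768 image shown on a 13" x 10" viewing area
print(pixel_density(1024, 768, 13, 10))
```

The last call gives roughly 79 PPI horizontally and 77 PPI vertically, which is the "approximately 80 PPI" quoted for the 17” monitor example.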

Scanning hardware limitations


As the resolution of an image cannot be
increased beyond that at which it was originally
digitised without recapture, it is crucial that an
appropriate resolution is selected prior to
image capture. Even current model entry level
scanners targeted at home users can scan at a
resolution of 600 PPI, meeting or exceeding
the recommendations given in this guideline.
Many scanning devices currently available
have maximum resolutions of 4800 PPI.
Higher PPI settings will result in images which
are able to contain more detail per inch while
increasing the file size of the resultant image.
It should also be noted increasing resolution beyond certain thresholds will not provide a
more useful image with existing viewing and printing technology10.

Figure 2: A 10×10-pixel image on a computer display may require more than 10×10 printer dots to accurately
reproduce, due to limitations of available ink colours in the printer
Adapted from: http://en.wikipedia.org/wiki/Dots_per_inch
Convenience
The time and effort required to locate a paper record, prepare it for scanning, and return it
to storage need to be considered when determining what resolution will be used. To
exploit future technology, such as high resolution viewing and printing capabilities not yet
commonly available, it may be warranted to set the resolution of the capture device at its
highest level for the best possible quality11, thus avoiding the need to rescan the paper
record. In this scenario, a lower resolution version of the image, which can be derived
from the high resolution image, may be more appropriate for contemporary use. Refer to
section 6.6: Master files and derivatives, for more information.

10
General Guidelines for Scanning. 1999. Colorado Digitization Project. Accessed March 2005 at
http://www.cdpheritage.org/resource/scanning/documents/std_scanning.pdf
11
Digital Imaging for Archival Preservation and Online Presentation: Best Practices. 2001. Michigan State University. Accessed March
2004 at http://www.historicalvoices.org/papers/image_digitization2.pdf


Mode of use
How the digitised documents will be used needs to be considered when making a decision
about resolution. The resolution of the typical output should be considered. As a general
guide, source documents that are generally magnified for viewing and printing require
digitising at a higher resolution, while source documents that are reduced for viewing and
printing can be digitised at lower resolutions.
In the case of large documents, the intended viewing or reproduction size needs to be
considered, but there can be logistical and practical difficulties if using too high a resolution
for large documents. For example, digitisation of an A0-sized (33”x47”) poster at 300 PPI
could produce a file over 400Mb in size. While the storage of this sized file may be
accommodated, the processing power required to view and print such files is beyond many
systems.
For a large map or plan that is to be only ever viewed as an A4-sized image, the reduced
size of the output means that the input resolution may be quite low. However, a high
resolution may be required to legibly capture the fine line work and small text that is often
present on large format maps and plans. The resolution selected to digitise documents
may be a compromise between detail and file size.
Source documents that are typically enlarged for viewing or are of a small size and require
magnification for use should be digitised at high resolutions. The best illustration of this
would be the digitisation of a slide, microfilm or photographic negative which would
normally be viewed at several times its actual size. It is common practice for these types
of originals to be digitised at many thousands of pixels per inch to produce useable output
at viewing size.
On the other hand, considering that even the most modern computer monitors typically
have resolutions less than 100 PPI, if a document is digitised purely for on screen viewing
at the original scale, digitising at high resolutions will not provide any benefit12.
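
The relationship between enlargement and scanning resolution can be sketched as follows. The sketch assumes a simple linear rule (scan resolution = desired output resolution × enlargement factor), which is a common rule of thumb rather than a requirement of this guideline, and the function name is hypothetical. It does not account for fine line work or small text, which may call for a higher resolution than the rule suggests.

```python
def required_scan_ppi(source_width_in, output_width_in, output_ppi):
    """Estimate the scan PPI needed to reach output_ppi at the output size.

    Assumes detail scales linearly with enlargement (a rule of thumb only).
    """
    enlargement = output_width_in / source_width_in
    return output_ppi * enlargement


# A 1.5" wide negative enlarged to 12" wide at 300 PPI output
print(required_scan_ppi(1.5, 12, 300))

# A 33" wide plan reduced to A4 width (8.27") at 200 PPI output
print(required_scan_ppi(33, 8.27, 200))
```

The first case needs thousands of pixels per inch, while the second suggests a low figure that would usually be raised in practice so that small text on the plan stays legible.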

Recommended Resolutions
Table 4 shows the minimum recommended PPI resolutions for digitising paper records.
Document Type                    Page Size        Resolution
Standard text documents          Up to A3         200 PPI
Oversized documents, e.g. maps   Larger than A3   200 PPI
Photographs                      6”x4”            600 PPI
Photographs                      7”x5”            430 PPI
Photographs                      9”x6”            300 PPI
Table 4: Resolution recommendations

Digitising at a higher resolution than recommended may be necessary if there is a
requirement to enlarge the image for use or to capture highly detailed paper originals.
Public authorities should combine reference to these guidelines with their own testing
on typically digitised documents prior to selecting which resolutions to use.

12
Scanning Tips and Techniques. Jasc Software Inc. 1999. Accessed October 2004 at http://www.jasc.com/tutorials/scantip.asp


6.2 Bit Depth


A “bit” is the fundamental unit of computer information having just two possible values,
either 0 or 1. Bit depth is the number of bits used to describe the colour of each pixel.
Greater bit depth allows a greater range of colours or shades of grey to be represented by
a pixel13. Using multiple bits increases choice and variety, at the expense of increased file
size. For example, using only 1-bit pixels gives 2 colours, usually either black or white.
Using 4 bits gives 16 colour14 choices (i.e. 2 x 2 x 2 x 2). Typical bit depths are described
below.
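
As a brief illustration, the number of values a pixel can take doubles with every additional bit. The helper name below is hypothetical.

```python
def colours_for_bit_depth(bits):
    """Return how many distinct values a pixel of the given bit depth can hold."""
    return 2 ** bits


for bits in (1, 4, 8, 16, 24, 48):
    print(f"{bits}-bit: {colours_for_bit_depth(bits):,} values per pixel")
```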
Bi-tonal (1-bit)
Bi-tonal images are made up of a foreground and a background colour, typically black in
the foreground and white as the background. Because this does not allow for shading, bi-
tonal depth is recommended primarily for black and white text documents without
illustrations, or with simple line drawings which have no shading.
Palettised (4-bit: 16 colours / greyscales and 8-bit: 256 colours / greyscales)
Images with these bit depths can be classified as palettised since each pixel in the image
is assigned a value that relates to a specific colour in the palette. Colour 4- and 8-bit
images should be used to capture colour drawings and illustrations. 8-bit is more
commonly used than 4-bit. Using a palettised image for continuous colour changes such
as those found in photographs will give poor results and is not appropriate.
Greyscale 8-bit images are most useful for black and white photographs, half-tone
illustrations, other types of continuous tone illustrations, handwritten and typed manuscript
and archival materials that are nominally black and white, but which actually contain
shading and varieties of ink density and paper tonality. For older documents, where for
example the paper may be coloured with age or ink may have faded, colour rather than
greyscale may be appropriate15.
High colour (16-bit: 65536 colours / greyscales)
For continuous-tone greyscale images requiring more than 256 shades of grey, 16-bit
greyscale may be used. It provides these additional shades, which are used in such
applications as medical imaging, at the cost of increased file size.
16-bit colour is a compromise between palettised colour and true colour offered in the
higher bit-depths. 16-bit colour is used in many video and animation applications, but its
use is limited for still images. If used for continuous images such as photographs, some
colour changes may be noticed, and for discrete colour drawings, 16-bit could be
considered excessive and inefficient.
True colour (24-bit: 16.7 million colours and 48-bit: about 281 trillion colours)
When digitising full colour images and photographs, 24-bit images should be used. These
true colour images consist of three colour bands – red, green and blue (RGB) each of
which is 8-bit for 24-bit colour or 16-bit for 48-bit colour. Each pixel in a 24-bit colour
image will have a red value, a green value and a blue value, each between 0 and 255. For
example, a citrus green colour pixel would have an RGB value of 201,254,40.
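
A minimal sketch of how three 8-bit bands combine into one 24-bit pixel value is shown below. The function names are hypothetical, and the byte order (red in the highest 8 bits) is an assumption made for illustration; individual file formats may order the bands differently.

```python
def pack_rgb(red, green, blue):
    """Combine three 0-255 band values into a single 24-bit integer."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each band value must be between 0 and 255")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(value):
    """Split a 24-bit integer back into its (red, green, blue) band values."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


citrus_green = pack_rgb(201, 254, 40)
print(hex(citrus_green))
print(unpack_rgb(citrus_green))
```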

13
Creating and Managing Digital Content – Glossary. 2002. Canadian Heritage Information Network. Accessed March 2005 at
http://www.chin.gc.ca/English/Digital_Content/Small_Museum/glossary.html#c
14
Standard 4-bit colours are black, dark red, dark green, dark yellow, dark blue, dark purple, dark cyan, pale grey, mid grey, red, green,
yellow, blue, magenta, cyan and white.
15
Technical Recommendations for Digital Imaging Projects. 1997. Image Quality Working Group of ArchivesCom. Accessed March
2005 at http://www.columbia.edu/acis/dl/imagespec.html


True colour depths are recommended for any materials with colour where colour conveys
essential information. For colour photographs the minimum recommended bit depth is 24-
bit (true colour). With current viewing and printing equipment, 48-bit colour does not
provide any meaningful advantages over 24-bit colour. However, if documents are
captured using a 48-bit capture device now, the benefits may be able to be exploited in the
future as technology develops.
32-bit colour is 24-bit colour with an additional 8-bit channel providing 256 levels of
transparency and is used mainly for digital video and animation applications.
Selecting an appropriate bit-depth
The nature of the documents being digitised should be the main factor dictating the bit
depth used for the images produced. For the digitisation of black and white text
documents, bi-tonal colour depth will usually capture the information most efficiently.
However, for documents that contain greyscales or colours, a bi-tonal image will not
capture all of the information and may produce an illegible image. Palettised colour depth
is typically suitable for line drawings, colour documents and diagrams, while continuous
tone images, such as photographs, are best captured in true colour.

Figure 3: Greyscale text captured in 24-bit colour showing that using a higher than recommended colour depth may introduce extra
colours into the image

Capturing a document at a lower than recommended bit depth will possibly result in an
image that is visibly different from the original record. In some situations this visible
difference and loss of information will be acceptable – for example when digitising a
document with black and white content, but a coloured letterhead, the loss of colour in the
letterhead may be acceptable. Choosing a higher than recommended colour depth, such
as 24-bit colour for a black and white document, will not provide any benefits, but will result
in an increase in the file size of the image produced and may even introduce small areas
of extra colours not present in the original document.
The conversion of a colour drawing, such as a simple business graphic, into a 24-bit colour
image would not only result in an inefficient file size but also introduce many extra colours
into the image. For example, the original document digitised to produce the image shown
in Figure 3 had three colours – black, white and grey. However, during the process of
scanning this document as a 24-bit image, 17,898 colours including pixels with shades of
brown and pink were generated! If the image was printed using a monochrome printer, the
general appearance may be similar to the original, however, the introduction of additional
colours may affect post-digitisation image processing operations. In this case using 4- or
8-bit grey for the output image would be more appropriate.
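
The effect described above can be checked by counting the distinct colours in an image before and after reducing it to a small grey palette. The sketch below works on a plain list of RGB tuples so that it needs no imaging library; the sample pixel values, the function names and the use of a simple average as the grey value are all assumptions made for illustration.

```python
def count_colours(pixels):
    """Return the number of distinct colours in a list of RGB tuples."""
    return len(set(pixels))


def to_grey_levels(pixels, levels=16):
    """Map each RGB pixel to the nearest of `levels` evenly spaced greys."""
    step = 255 / (levels - 1)
    result = []
    for red, green, blue in pixels:
        brightness = (red + green + blue) / 3  # simple average (assumption)
        grey = round(round(brightness / step) * step)
        result.append((grey, grey, grey))
    return result


# Hypothetical scan of a black, white and grey page with slight colour casts
scanned = [(0, 0, 0), (3, 1, 2), (120, 118, 121),
           (117, 119, 122), (255, 255, 255), (252, 254, 251)]

print(count_colours(scanned))
print(count_colours(to_grey_levels(scanned, levels=16)))
```

In this small sample the six slightly different scanned colours collapse to three grey values once a 4-bit (16 level) grey palette is applied.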


As is the case when determining the resolution to use, the mode of use of the digital
images should be considered when deciding upon an appropriate bit-depth. If imaged
pages will most often be viewed on computer screens, then the use of a higher than
normal bit-depth may be warranted. As seen in Figure 4, increasing the colour depth may
enhance the on-screen readability of a low resolution image. If, however, digitised copies
of records will only ever be made available as monochrome print outs, then the use of
colour could be considered superfluous.

Figure 4: 8-bit and 1-bit versions of a 72 PPI image, showing how an increase in bit depth allows anti-aliasing which may improve
readability
Capturing a document that contains a watermark, highlighting, or hand written annotations
into a bi-tonal image may cause text to be obscured leading to a loss of information. An
example of this information loss is shown in Figure 5. Once again, a palettised grey or
palettised colour output image would capture the text of the document as well as the extra
information in the watermark or annotations.
Halftones
In printing, halftones are evenly spaced spots of varying diameter used to produce apparent
shades of grey with a single colour ink. The darker the shade at a particular point in the
image, the larger the corresponding spot in the printed halftone. In traditional publishing,
halftones are created by photographing an image through a screen. In order to simulate
variable-sized halftone dots in digital imaging, dithering is used, which creates clusters of
pixels in a "halftone cell". The more black pixels in the “cell”, the darker the grey.
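
The idea that a darker grey fills more of the cell can be sketched with ordered dithering, one common dithering technique. The 4×4 threshold matrix below is the widely used Bayer pattern; it is an assumption chosen for illustration, and scanning software may use other patterns (for example clustered-dot screens). Function names are hypothetical.

```python
# 4x4 Bayer threshold matrix: each entry ranks one position in the cell
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_to_bitonal(grey_rows):
    """Convert rows of 0-255 grey values (0 = black) to 1-bit pixels (1 = black)."""
    output = []
    for y, row in enumerate(grey_rows):
        out_row = []
        for x, value in enumerate(row):
            # Scale the rank 0..15 to a threshold between 8 and 248
            threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) * 16
            out_row.append(1 if value < threshold else 0)
        output.append(out_row)
    return output


def black_pixels_in_cell(grey_value):
    """Count black pixels produced for one uniformly grey 4x4 cell."""
    cell = [[grey_value] * 4 for _ in range(4)]
    return sum(sum(row) for row in dither_to_bitonal(cell))


for grey in (255, 192, 128, 64, 0):
    print(grey, black_pixels_in_cell(grey))
```

Running the loop shows the count of black pixels per 16-pixel cell rising as the grey value falls towards black.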
Bi-tonal images utilising halftones may be considered as an alternative to using 4- or 8-bit
grey to represent greyscales on digitised documents. This technique may provide some
advantages over using palettised images including wider format compatibility and reduced
file size.

Figure 5: 8-bit vs. 1-bit vs. 1-bit with halftone for watermarked documents

However, use of halftones may also introduce a speckled effect to areas of the image that
should be white. At too low a resolution, halftones will not be beneficial, and halftones at
high resolutions may produce a large number of halftone pixels where there should be
white space. Some other image processing, notably optical character recognition, (refer to
section 6.6: Master files and derivatives for more information) may also be negatively
affected if using halftones in text documents. Public authorities considering using
halftones for digitised records should carry out thorough testing to ensure the end results
are suitable.
When paper documents that contain halftone images are digitised, a distracting pattern of
lines called "Moire" is often produced. To avoid this unwanted effect, most scanning
systems have a “de-screen” function to remove the Moire during the scanning process.
Post-capture image processing software can also be used to correct these images.
Alternatively, halftones may be captured by scanning the source document at a high
enough resolution to isolate each of the dots making up the halftone, typically 600 PPI or
above, and then using software to reduce the image to the standard resolution16.

Recommended Bit Depths

The table below shows the recommended bit depth for digitising paper records.
Document type Bit Depth
Black and white text only 1-bit bi-tonal
Text with some colour 8-bit colour
Text with shades of grey 8-bit grey
Colour drawings / presentations / graphics 8-bit colour
Black and white photographs 8-bit grey
Colour photographs 24-bit colour
Table 5: Bit depth recommendations

If a document containing a mix of the above is being imaged, the highest colour depth
should be used to capture it. For example, an otherwise black and white page which
includes a colour photograph should be captured in 24-bit colour.
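
The "highest colour depth wins" rule for mixed pages can be sketched as a simple lookup over the document types in Table 5. The dictionary keys, the function name and the tie-break between 8-bit grey and 8-bit colour (colour is assumed to outrank grey) are illustrative assumptions, not part of the recommendation itself.

```python
# (bits per pixel, mode) for each document type listed in Table 5
RECOMMENDED_BIT_DEPTH = {
    "black and white text only": (1, "bi-tonal"),
    "text with some colour": (8, "colour"),
    "text with shades of grey": (8, "grey"),
    "colour drawings / presentations / graphics": (8, "colour"),
    "black and white photographs": (8, "grey"),
    "colour photographs": (24, "colour"),
}

# Assumed tie-break when bit depths are equal: colour outranks grey
MODE_RANK = {"bi-tonal": 0, "grey": 1, "colour": 2}


def bit_depth_for_page(content_types):
    """Return the highest recommended (bits, mode) among a page's content types."""
    candidates = [RECOMMENDED_BIT_DEPTH[name] for name in content_types]
    return max(candidates, key=lambda depth: (depth[0], MODE_RANK[depth[1]]))


print(bit_depth_for_page(["black and white text only", "colour photographs"]))
```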
Public authorities should combine reference to these guidelines with their own testing
on typically digitised documents prior to selecting which bit depths to use.

6.3 Compression and File Size


Calculating file size
As indicated in the previous sections, the total number of pixels used to make up an image
affects file size. Additionally, the colour depth of each of those pixels has a multiplying
effect on the file size. In the example used earlier, an A4 page was digitised at 300 PPI
giving a total of 8 700 867 pixels. The following table shows the number of bits that make
up this image at varying colour depths and resolutions, and shows approximate file sizes17.
Using the information in Table 6, it can be seen that if an organisation were to scan all of
their A4 sized documents as 600 PPI 24-bit colour images, even a moderate collection of
digitised documents would create large file storage requirements. In addition to choosing
appropriate image resolution and colour depth, a number of compression methods can be
adopted to reduce the file size of digital images.

16
How To Fix Bad Scans. 2004. Dixie State College of Utah. Accessed March 2005 at http://cit.dixie.edu/vt/vt2600/bad_scans.asp
17
1 byte contains 8 bits. 1024 bytes = 1Kb. 1024Kb = 1Mb


Colour depth           Resolution (PPI)   Total bits      Uncompressed file size (Mb)
1 bit bi-tonal         300                8 700 867       1.04
1 bit bi-tonal         600                34 803 468      4.15
8 bit grey or colour   300                69 606 936      8.30
8 bit grey or colour   600                278 427 744     33.19
24 bit colour          300                208 820 808     24.89
24 bit colour          600                835 283 232     99.57

Table 6: Uncompressed file sizes for an A4 page digitised at different pixel depths and resolutions
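
The file size calculation behind this table can be sketched as below, using the units given in the footnote (8 bits per byte, 1024 bytes per Kb, 1024 Kb per Mb). The function name is hypothetical, and the result ignores any file header or metadata overhead.

```python
def uncompressed_size_mb(width_in, height_in, ppi, bit_depth):
    """Estimate the uncompressed image size in Mb (1 Mb = 1024 x 1024 bytes)."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    total_bits = pixels * bit_depth
    return total_bits / 8 / 1024 / 1024


# A4 page (8.27" x 11.69") at each combination of bit depth and resolution
for bit_depth in (1, 8, 24):
    for ppi in (300, 600):
        size = uncompressed_size_mb(8.27, 11.69, ppi, bit_depth)
        print(f"{bit_depth:>2}-bit at {ppi} PPI: {size:.2f} Mb")
```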

Compression
Compression reduces storage space requirements, saves on backup and transfer media,
lessens the impact on the network of accessing image files and provides shorter file
transfer times. Mainstream compression techniques in widespread use today are tried and
tested and can be used with the confidence that images will continue to be accessible
once compressed. Compression used for images can be categorised into lossless and
lossy compression.
Lossless compression reduces the size of a file without discarding any information. An
example of a lossless compression technique is substitution. As a very simplistic example,
if the A4 page of text described in Table 6 consists of 90% white space and 10% black
text, then by simply substituting a 4-bit symbol for each white pixel’s 24-bit RGB value, the
image size would reduce from approximately 25 Mb to around 6Mb. The substitution table
is stored within the image file, allowing the exact image to be viewed and printed, while still
having a small file size.
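
The substitution example can be worked through as follows. This is a deliberately simplified sketch: the size estimate ignores the space taken by the substitution table itself, the "W" symbol stands in for the 4-bit code, and the function names are hypothetical.

```python
WHITE = (255, 255, 255)


def substituted_size_mb(total_pixels, white_fraction, symbol_bits=4, pixel_bits=24):
    """Estimate image size in Mb when white pixels are replaced by a short symbol."""
    white_bits = total_pixels * white_fraction * symbol_bits
    other_bits = total_pixels * (1 - white_fraction) * pixel_bits
    return (white_bits + other_bits) / 8 / 1024 / 1024


def encode(pixels):
    """Replace every white pixel with the short symbol 'W'."""
    return ["W" if pixel == WHITE else pixel for pixel in pixels]


def decode(symbols):
    """Restore the original pixels from the substituted form."""
    return [WHITE if symbol == "W" else symbol for symbol in symbols]


# A4 page at 300 PPI (8 700 867 pixels), 90% white
print(round(substituted_size_mb(8_700_867, 0.9), 2))

# Decoding gives back exactly what was encoded, so nothing is lost
row = [WHITE, WHITE, (0, 0, 0), WHITE]
print(decode(encode(row)) == row)
```

The estimate comes out a little over 6 Mb, consistent with the "around 6Mb" figure quoted above.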
Lossy compression, however, is irreversible; file information is lost when a lossy
compression process is applied. When the file is viewed or printed, the resultant image
will therefore be different from the original. The degree of difference between the original
and compressed files generally depends on the amount of compression applied.
When lossy compression is appropriately applied, the human eye should not be able to readily
differentiate between the original file and the compressed version.
One of the most commonly used lossy compression processes is known as quantisation.
Colour values are simplified and rounded - discarding real information. The extent of
compression is variable with the level of output quality specified governing how much
simplification occurs. Greater simplification leads to a smaller file size, but with greater
loss of information.
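
A much simplified sketch of quantisation is given below. It rounds each colour value to the midpoint of a fixed-width bucket, which illustrates why the process cannot be reversed; real lossy formats apply quantisation in more elaborate ways, and the function names and step sizes here are assumptions for illustration only.

```python
def quantise(value, step):
    """Replace a 0-255 colour value with the midpoint of its step-wide bucket."""
    return min(255, (value // step) * step + step // 2)


def quantise_pixel(pixel, step):
    """Quantise each band of an RGB pixel."""
    return tuple(quantise(channel, step) for channel in pixel)


# Two slightly different pixels become identical, so the originals
# can no longer be told apart after quantisation
print(quantise_pixel((201, 254, 40), 16))
print(quantise_pixel((200, 250, 44), 16))

# A larger step simplifies more: fewer possible values, greater loss
print(quantise_pixel((201, 254, 40), 64))
```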
The effects of file compression can depend on the file format, the file contents and the
compression method used. There is not a fixed file size reduction that can be expected
from every image that is compressed. For example, the commonly used JPEG
compression works well on colour photographic images, but poorly compresses images
containing drawings, letters or simple graphics. Therefore, if compression is to be applied,
a method appropriate to the digital image and its intended use needs to be selected.
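
The dependence of compression results on file contents can be demonstrated with Python's standard zlib module. Note that zlib is a general-purpose lossless compressor, not JPEG; it is used here only because it is readily available, and the two byte sequences are synthetic stand-ins for a blank page and a highly detailed image.

```python
import random
import zlib


def compressed_ratio(data):
    """Return compressed size divided by original size for the given bytes."""
    return len(zlib.compress(data)) / len(data)


size = 100_000
blank_page = bytes([255]) * size                        # uniform white
rng = random.Random(0)                                  # fixed seed for repeatability
noisy_page = bytes(rng.randrange(256) for _ in range(size))

print(f"blank page: {compressed_ratio(blank_page):.4f}")
print(f"noisy page: {compressed_ratio(noisy_page):.4f}")

# Lossless: decompressing returns exactly the original bytes
print(zlib.decompress(zlib.compress(noisy_page)) == noisy_page)
```

The uniform data shrinks to a tiny fraction of its size, while the random data barely compresses at all, even though the same method is applied to both.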


Recommended Compression
Some form of compression should be applied to digitised records to enable storage and
access in an efficient manner.
Lossless compression provides file size reduction while being able to reproduce an
exact, true and accurate digital copy of the image created at time of digitisation. Where
possible, lossless compression should be employed.
Lossy compression is not suitable when original paper records are authorised for early
disposal as the accuracy of the image may be called into question. However, when
originals are being retained, the additional file size reduction that lossy compression
provides can mean that a small, perhaps indistinguishable, loss of data may be
acceptable for some file types. When employing lossy compression techniques, the
resulting image should not appear noticeably different from the original paper record.

6.4 File Formats


File formats encode information into a form which is intended for processing and use by
specific combinations of hardware and software18. Fortunately, the current technology
trends of interoperability and compatibility have led to many file formats being supported
on a variety of hardware and software platforms. This trend applies to image file formats
with many image processing and viewing programs available for Windows, UNIX, and
Apple computer systems. Many free or relatively inexpensive image manipulation
programs support the creation, editing, viewing and printing of images in dozens of
different formats.
The five file formats most commonly used for digitisation are described below and
summarised in Table 7.
Joint Photographic Experts Group (JPEG) File Interchange Format (JFIF)
JFIF was developed by the Joint Photographic Experts Group and was standardised by the
International Organization for Standardization (ISO) in August 1990. The JFIF format is platform
independent and can be exchanged between a variety of different applications.
This image format is commonly used on the World Wide Web (WWW) and in digital
photographic equipment, because the JPEG compression scheme inherent to JFIF is best suited to
photographs and complex graphics with continuous tones, where it minimises file sizes19. The
JFIF format uses the JPEG lossy compression technique which, by simplifying certain
colour information, can reduce file size at the expense of some image quality. The level of
compression is adjustable by the operator at the time of creation, allowing a compromise
to be found between the amount of information loss and the file size reduction. JFIF
supports only 8-bit grey and 24-bit colour, not 1-bit bi-tonal. This format will be referred to
by its common name, JPEG, through the remainder of this document.
A new related format, JPEG2000, has recently been developed which is substantially
advanced, with new compression techniques, additional bit depths and a lossless
compression option20. While JPEG2000 is related to the original format, it is not
compatible with many mainstream image processing, scanning, and viewing programs or
web browsers. These compatibility issues may be alleviated once the format is more
established in the sector and software and hardware vendors have assessed the format
and the market’s demand for its use.

18 Brown A. Digital Preservation Guidance Note 1: Selecting File Formats for Long-Term Preservation. 2003. National Archives (UK). Accessed March 2005 at http://www.nationalarchives.gov.uk/preservation/advice/pdf/selecting_file_formats.pdf
19 Horton S. Web Style Guide 2nd Edition: JPEG Graphics. 2004. Lynch and Horton. Accessed March 2005 at http://www.webstyleguide.com/graphics/jpegs.html
20 Mendham S. JPEG 2000. 2005. IDG Communications. Accessed March 2005 at http://www.pcworld.idg.com.au/index.php/id;1170029196;fp;2;fpid;1585691688

Tagged Image File Format (TIFF)
The TIFF format was developed in 1986 by Microsoft and Aldus and is currently
maintained by Adobe21. Despite being an older file format, TIFF is widely supported and is
seen by many as a de facto standard for image files.
TIFF files are commonly used in desktop publishing, faxing, 3-D applications and medical
imaging applications. There are several sub-formats within the TIFF specification. TIFF
CCITT22 Group 3 and Group 4 are the most widely used formats in document imaging;
most fax transmissions are in TIFF Group 3 format. Other sub-formats of TIFF support
greyscale and colour depths of up to 64-bit, and offer compression choices including
uncompressed, lossless LZW, and run length compression23.
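Run length compression, mentioned above, can be sketched in a few lines. This is a simplified illustration of the general idea only; it is not the CCITT Group 3/4 or PackBits encoding used in real TIFF files. The sample row of bi-tonal pixels is invented.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse a row of pixel values into (value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode(runs):
    """Expand (value, run length) pairs back into the original row."""
    return [value for value, length in runs for _ in range(length)]


# One scan line of a bi-tonal page: 1 = white paper, 0 = black ink.
row = [1] * 40 + [0] * 3 + [1] * 12 + [0] * 5 + [1] * 40
runs = rle_encode(row)
print(len(row), "pixels stored as", len(runs), "runs")
```

Because a typed page is mostly long runs of white, a scheme of this kind is very effective for bi-tonal documents and much less so for continuous tone images.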
The most recent release, TIFF 6.0, was launched in 1992. While the baseline version of
TIFF 6.0 is fully compatible with applications designed to read earlier TIFF images, a
number of additional features were added that require software to be specifically tailored to
support the newer version. JPEG compression was included in the TIFF 6.0
specifications, and despite a technical revision in 1995 to overcome serious design flaws24,
there still remain problems with the use of this lossy compression within TIFF files, and this
option is not widely used. The TIFF version 7.0 specification, which appeared in draft
format in 1997 but is still to be released, is expected to feature a more stable
implementation of JPEG compression amongst other new features.
Various extensions to the TIFF specification have been implemented for specialised
purposes. Care should be taken when using these extended versions of TIFF, as the
application support for viewing and manipulating them may be limited.
Graphics Interchange Format (GIF)
Graphics Interchange Format (GIF) is a widely used image format introduced in 1987 by
CompuServe. In the early years of the WWW, developers adopted GIF for its efficiency
and widespread familiarity. A large proportion of the images on the Web are presented in
GIF format, and virtually all Web browsers that support graphics can display GIF files.
The GIF format supports a maximum of 256 palettised colours or shades of grey, so it is most
suited to discrete images such as illustrations, black and white images, logos and line
drawings rather than photographs. GIF files are compressed using a lossless
compression technique, LZW. Although GIF has a free and open specification, Unisys
Corporation holds patents on LZW, and its commercial use may require licensing and royalty
payments25. While GIF files can generally be generated and used without
requiring a licence, and many of the patents that relate to GIF have expired or are soon to
expire, the royalty-free PNG format (outlined below), which was developed largely because
of this patent issue, has taken over from GIF in many applications.
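The LZW technique can be illustrated with a textbook version that works on bytes and emits a list of integer codes. This is a sketch of the principle only: GIF and TIFF pack the codes into variable-width bit fields and use clear and end-of-information codes, which are omitted here. The function names are assumptions for illustration.

```python
def lzw_compress(data: bytes) -> list:
    """Textbook LZW: replace repeated byte sequences with dictionary codes."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes by regrowing the same dictionary."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the sequence that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


sample = b"ABABABABABABABAB" * 8
codes = lzw_compress(sample)
print(len(sample), "bytes ->", len(codes), "codes")
```

The decompressor needs no copy of the dictionary, since it rebuilds it from the codes. The result is byte-for-byte identical to the input, which is what makes the technique lossless.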

21 TIFF Revision 6.0. 1992. Adobe Systems Inc. Accessed March 2005 at http://partners.adobe.com/asn/developer/pdfs/tn/TIFF6.pdf
22 Comité Consultatif International Téléphonique et Télégraphique (International Telegraph and Telephone Consultative Committee)
23 Leurs L. The TIFF file format. 2001. Laurens Leurs. Accessed March 2005 at http://www.prepressure.com/formats/tiff/fileformat.htm
24 JPEG Image Coding Standard. 1998. Centre for Telecommunications and Information Engineering, Monash University. Accessed March 2005 at http://www.ctie.monash.edu.au/EMERGE/multimedia/JPEG/COMM03.HTM
25 LZW Patent Information. 2005. Unisys Corporation. Accessed March 2005 at http://www.unisys.com/about__unisys/lzw/

Portable Network Graphics (PNG)


Portable Network Graphics (PNG) is a lossless, portable, well-compressed storage format
for images. The open-source and patent-free PNG format was designed to replace the
proprietary GIF format and, to some extent, the much more complex TIFF format. The
second edition of PNG is an ISO standard, ISO/IEC 15948:2003 (E)26. The PNG format
was designed specifically for use in online viewing applications such as the WWW, and the
format offers a range of attractive features that should eventually make PNG the most
common graphic format27. It supports up to 8-bit palettised, 16-bit grey and 48-bit colour
and uses lossless compression based upon the Zip compression techniques used in PKZip
and WinZip. At this stage, however, the lack of universal support for PNG in
scanning and imaging applications is the format’s main weakness.
Portable Document Format (PDF)
Portable Document Format (PDF) is a widely used proprietary file format developed and
maintained by Adobe Systems. The PDF format was released in 1993 and is based on
the Adobe PostScript printing language. The PDF format was created by Adobe to provide
a standard for storing and editing documents.
It is important to note that PDF is an encapsulating format, not a raster image format as
are the other formats in this section. Encapsulation formats attempt to ensure that files are
displayed and used consistently across computer programs and platforms28. As such,
PDF cannot be used to capture digitised records directly, although some scanning
systems may convert to PDF seamlessly from a native image format without further user
interaction.
While the PDF format is widely used, a PDF file is not an image, so most image
processing software cannot open it. Operations including the deriving of
related files through resolution reduction, optical character recognition, and image
correction will not usually be able to be carried out on a PDF file. If the original images are
not retained separately, any of these image processing operations should be performed
prior to encapsulation into a PDF file.
The strength of the PDF format is its ability to capture text and images in their original
format, preserving fonts, graphics and layouts. Viewing PDF files requires Adobe Reader,
a free application distributed by Adobe Systems29, or other licensed software which
embeds this capability. Because documents in PDF format can easily be seen and printed
by users on a variety of computer and platform types, they are very common on the
WWW. The PDF format supports several compression methods; the most commonly used
are the CCITT, LZW and ZIP lossless compression schemes and JPEG lossy
compression.
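The PNG bit depths and Zip-style (Deflate) compression described in this section can be seen directly in the file structure. The sketch below builds a tiny 8-bit greyscale PNG in memory using only the Python standard library and then reads the bit depth and colour type back from its header chunk. The helper names are assumptions for illustration, and the reader handles only the first (IHDR) chunk.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Wrap a payload as a PNG chunk: length, type, data, CRC-32."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def make_grey_png(width: int, height: int, grey: int = 255) -> bytes:
    """Build a minimal 8-bit greyscale PNG filled with a single grey value."""
    # IHDR: width, height, bit depth 8, colour type 0 (greyscale),
    # compression 0 (Deflate), filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scan line starts with a filter-type byte (0 = no filter).
    raw = (b"\x00" + bytes([grey]) * width) * height
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))


def read_png_header(data: bytes) -> dict:
    """Return the width, height, bit depth and colour type from the IHDR chunk."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    length, kind = struct.unpack(">I4s", data[8:16])
    if kind != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    width, height, bit_depth, colour_type = struct.unpack(">IIBB", data[16:26])
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "colour_type": colour_type}


print(read_png_header(make_grey_png(4, 3)))
```

In the header, colour type 0 is greyscale, 2 is true colour and 3 is palette colour, so a check of this kind can confirm that a scanner produced the bit depth that was requested.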
Summary
As seen in Table 7, each format has its strengths and weaknesses. Identifying the best
format requires knowledge of how the digitised documents will be used, the characteristics
of the materials that will be digitised and the way digitised documents will be delivered.

26 PNG (Portable Network Graphics). 2004. World Wide Web Consortium. Accessed March 2005 at http://www.w3.org/Graphics/PNG/
27 Horton S. Web Style Guide 2nd Edition: PNG Graphics. 2004. Lynch and Horton. Accessed March 2004 at http://www.webstyleguide.com/graphics/pngs.html
28 File Formats and Compression. 2004. Technical Advisory Service for Images. Accessed March 2005 at http://www.tasi.ac.uk/advice/creating/fformat.html#ff2
29 Adobe PDF. 2005. Adobe Systems Inc. Accessed March 2005 at http://www.adobe.com/products/acrobat/adobepdf.html

The five formats described in this guideline are widely used for a number of digital imaging
applications.
JPEG and PNG are non-proprietary formats, while TIFF and PDF are proprietary formats
with freely available specifications. This provides system developers with a cost-effective
and readily available means to incorporate support for these formats. If a system
is adopted that requires the use of another format, especially a system-specific
proprietary format, the ability of that system to convert images into at least one of the
common formats discussed here is paramount.
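When checking that a system really does export one of the common formats, the file extension alone is not reliable. A minimal sketch of a check based on the leading bytes of a file is shown below. The signatures are the standard ones for the five formats discussed in this section; the function name `sniff_format` is an assumption for illustration.

```python
SIGNATURES = [
    (b"II*\x00", "TIFF"),            # little-endian TIFF
    (b"MM\x00*", "TIFF"),            # big-endian TIFF
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),       # JPEG start-of-image marker
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
]


def sniff_format(head: bytes):
    """Return the format name suggested by the first bytes of a file, or None."""
    for signature, name in SIGNATURES:
        if head.startswith(signature):
            return name
    return None


# Example with in-memory headers; in practice `head` would be the first
# few bytes read from an exported file opened in binary mode.
print(sniff_format(b"GIF89a\x10\x00\x10\x00"))
print(sniff_format(b"not an image"))
```

A check of this kind identifies only the container. It does not confirm the sub-format, compression method or bit depth inside the file.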
Because PDF is an encapsulation format, it is difficult to compare it directly with
the image formats described here. Scanning or imaging systems that provide PDF as an
output format option will always need to use a true image format for an intermediary,
usually temporary, file that is converted into PDF. Therefore, the characteristics of this
intermediary file need to be considered prior to implementation.
To take advantage of the characteristics of different file formats, multiple copies of a
digitised document may be stored. Refer to section 6.6: Master files and derivatives,
where derived files that may form part of a digitisation program are discussed in more
detail, along with suggested file formats for the different types of derived files.
Considerations for digitising multi-page documents
TIFF is the only image format described here that is able to capture more than one image
in a single file. This enables storage of individually scanned pages of a multi-page
document into a single file.
The other image formats described here can only deliver a single image per file. If these
formats are used for digitisation of multi-page documents, image management software or
other systems, such as an eDRMS, are required to provide the linkage and sequencing
required to represent the multiple images making up the pages of a digitised document as a
single entity.
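The linkage and sequencing described above can be sketched as a small data structure that maps page numbers to single-image files. This is an illustration only, not how any particular eDRMS works; the class name, field names, record identifier and file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitisedDocument:
    """Links the single-image files that make up one multi-page record."""
    record_id: str
    pages: dict = field(default_factory=dict)  # page number -> file name

    def add_page(self, number: int, file_name: str) -> None:
        if number in self.pages:
            raise ValueError(f"page {number} already registered")
        self.pages[number] = file_name

    def ordered_files(self) -> list:
        """File names in page order, regardless of the order they were added."""
        return [self.pages[n] for n in sorted(self.pages)]

    def missing_pages(self) -> list:
        """Page numbers below the highest registered page that have no file."""
        if not self.pages:
            return []
        return [n for n in range(1, max(self.pages) + 1) if n not in self.pages]


doc = DigitisedDocument("REC-0001")
doc.add_page(3, "rec0001_p3.png")
doc.add_page(1, "rec0001_p1.png")
print(doc.ordered_files(), doc.missing_pages())
```

Keeping the sequence outside the image files means that the loss of this linkage information would leave a set of unordered pages, so it needs the same protection as the images themselves.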
Some systems, including widely used eDRMS and high volume scanning software, require
that a multi-page format, typically TIFF or PDF, be used for image storage so that a single
file is capable of representing several scanned pages of a document. This should be
viewed as a software limitation rather than best practice, and if organisations are forced to
use TIFF or PDF solely due to their support for multiple paged documents, they should do
so with caution after thoroughly investigating all options.
While TIFF is widely regarded as the standard file format to use when capturing
documents as bi-tonal images, it lacks the compression and bit depth combinations to suit
other document types, particularly greyscale and colour documents. If no compression or
the inefficient packbits compression is used to capture multiple page greyscale or colour
documents as a single TIFF file, file sizes can become very large, affecting the
accessibility and storage of the file.
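The growth in file size can be estimated before scanning begins. The sketch below calculates the uncompressed size of the image data from the page size, resolution and bit depth, ignoring file headers and metadata. The function name and the choice of an A4 page at 300 dpi are assumptions for illustration.

```python
import math


def uncompressed_bytes(width_mm, height_mm, dpi, bits_per_pixel, pages=1):
    """Estimate the uncompressed image data size for a scanned document."""
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    # Each row is padded up to a whole number of bytes.
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px * pages


a4_bitonal = uncompressed_bytes(210, 297, dpi=300, bits_per_pixel=1)
a4_colour = uncompressed_bytes(210, 297, dpi=300, bits_per_pixel=24)
report_colour = uncompressed_bytes(210, 297, dpi=300, bits_per_pixel=24, pages=20)
print(a4_bitonal, a4_colour, report_colour)
```

Under these assumptions a single A4 page is about 1.1 million bytes as a bi-tonal image and about 26 million bytes in 24-bit colour, so a 20-page colour document stored uncompressed in one file exceeds 500 million bytes.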
To overcome this file size issue, some vendors have chosen to implement JPEG
compression within the TIFF file format, providing a higher rate of compression, but with
the data loss inherent in this compression scheme. As mentioned previously in this
section, there is no agreed standard for the implementation of JPEG compression within
the TIFF format. Using non-standard formats for the storage of digitised records may
create compatibility problems with other software, perhaps preventing the images from
being viewed or printed.


TIFF 6.0
Extension: .tif
Bit-depth(s): 1-bit bi-tonal; 4- or 8-bit greyscale or palette colour; up to 64-bit colour
Compression: Uncompressed. Lossless: CCITT G3/G4, LZW, Packbits. Lossy: JPEG
Standard/Proprietary: De facto standard; LZW compression may require licence
Web support: Plug-in or external application
Metadata support: Basic set of labelled tags
Strengths:
- Long history and widespread support as the premier format for high quality digital images and faxes
- Lossless compression
- Multiple pages
Limitations:
- Limited options for compression of colour and greyscale images
- Not natively supported by browsers
- LZW licensing restrictions
- JPEG version not widely used
Specification: http://partners.adobe.com/public/developer/tiff/index.html

GIF 89a
Extension: .gif
Bit-depth(s): 1-8 bit bi-tonal, greyscale, or colour
Compression: Lossless: LZW
Standard/Proprietary: De facto standard; LZW compression may require licence
Web support: Native since Microsoft® Internet Explorer 3
Metadata support: Free-text comment field
Strengths:
- Widespread support through web browsers and imaging applications
- Lossless compression
Limitations:
- Not suitable for continuous tone images or photographs due to 8-bit maximum colour depth
- LZW licensing restrictions
Specification: http://www.w3.org/Graphics/GIF/spec-gif89a.txt

JPEG JFIF
Extension: .jpg
Bit-depth(s): 8-bit greyscale; 24-bit colour
Compression: Lossy: JPEG
Standard/Proprietary: JPEG: ISO 10918-1/2. JFIF: de facto standard
Web support: Native since Microsoft® Internet Explorer 2
Metadata support: Free-text comment field
Strengths:
- Widespread use on websites
- Native support in nearly all imaging applications and web browsers
- Variable compression rates
Limitations:
- Used for photographs or images that have continuous tone, rather than text documents or graphics
- Only lossy compression
- Not suitable for bi-tonal
Specification: http://www.w3.org/Graphics/JPEG/

PNG 1.2
Extension: .png
Bit-depth(s): 1/2/4/8-bit palette colour or greyscale; 16-bit greyscale; 24/48-bit true colour
Compression: Lossless: Deflate, an LZ77 derivative
Standard/Proprietary: ISO 15948
Web support: Native since Microsoft® Internet Explorer 4
Metadata support: Basic set of labelled tags plus user-defined tags
Strengths:
- Native support in web browsers
- Royalty free
- Supports all commonly used bit depths
- Lossless compression
Limitations:
- Limited uptake
- Not supported within all imaging and scanning applications
Specification: http://www.w3.org/TR/PNG/

PDF 1.4
Extension: .pdf
Bit-depth(s): 4- or 8-bit greyscale or palette colour; up to 64-bit colour support
Compression: Uncompressed. Lossless: CCITT, LZW, JBIG. Lossy: JPEG
Standard/Proprietary: ISO 15930-1:2001. De facto standard
Web support: Plug-in or external application
Metadata support: Basic set of labelled tags
Strengths:
- Widely used
- Viewer freely available
- It can be used to distribute documents with images and text that will print consistently
- Multiple pages
Limitations:
- Not an image format
- Requires post-processing for encapsulation
- Not natively supported by browsers
- Limited support for file creation in non-proprietary applications
Specification: http://partners.adobe.com/public/developer/pdf/index_reference.html

Table 7: Image formats compared. Adapted from http://www.library.cornell.edu/preservation/tutorial/presentation/table7-1.html


TIFF has many sub-formats that have been developed from the original specification,
baseline TIFF. It is not uncommon for software to be branded as capable of viewing TIFF
files, but actually only be able to view baseline TIFF. As a “baseline TIFF reader is not
required to read any images beyond the first one”30 many image viewing applications are
unable to view multi-page TIFF files, and instead show only the first page.
Public authorities who decide to use any non-standard image format should investigate
carefully the impact this may have on the longevity of the digitised records. Migrating
non-standard images to standard formats will require additional planning and resources. In
addition, public authorities risk being locked into using products from the same vendor and
also risk the continued accessibility of their imaged records if the vendor leaves the
marketplace.
PDF may be considered an alternative to TIFF for the storage of multi-page files in a
single document. However, as PDF is an encapsulation format, images that are encapsulated
into a PDF file cannot be manipulated, easily extracted, or have other image files
derived from them. As an alternative means of accessing images, PDF is an appropriate
format. However, careful consideration should be given to the likely need to manipulate,
extract, or derive information from the original image before retaining only a PDF version
of a digitised record.

Recommended File Formats


The table below lists the file formats recommended for digitising different
document types.

Document type                        File format
Black and white text only            TIFF G3 / G4
                                     PNG bi-tonal
Text with some colour                GIF colour
                                     PNG 4- or 8-bit colour
                                     TIFF (LZW)
Text with shades of grey             GIF grey
                                     PNG 4- or 8-bit grey
                                     TIFF (LZW)
Colour drawings / presentations /    GIF
graphics                             PNG 4- or 8-bit
                                     TIFF (LZW)
Black and white photographs          JPEG 8-bit grey
                                     PNG 8- or 16-bit grey
                                     TIFF (JPEG)
Colour photographs                   JPEG 24-bit (high quality compression 10:1)
                                     PNG 24-bit
                                     TIFF (high quality JPEG compression 10:1)

Table 8: File format recommendations

Public authorities should combine reference to these guidelines with their own
testing, and refer to the specifications of the systems used to manage imaged
records, before determining which file formats are compatible and most suited to
their use.

30 TIFF Revision 6.0. 1992. Adobe Systems Inc. Accessed March 2005 at http://partners.adobe.com/asn/developer/pdfs/tn/TIFF6.pdf

6.5 Quality Control


Decisions relating to the technical aspects of the digitisation program described in the
preceding sections should be documented and subject to quality control measures.
Without this, there is no assurance that digitised records will meet stated technical
specifications or fulfil their intended use.
Appropriate quality assurance procedures and guidelines are needed to ensure that
digitised records meet the requirements for their use. All stakeholders in a digitisation
project – sponsor, staff, users, and IT support – should be consulted to determine
appropriate quality measures31.
Quality measures should be agreed on and tested before image capture commences, to
ensure that they can be implemented and produce acceptable results32. The quality
measures should be reviewed periodically to ensure they remain relevant to the project
goals and keep up with technology, legislation, and industry trends.
Quality Baselines
Baselines for acceptable and unacceptable characteristics need to be established for
digitised records so that a consistent level of quality can be maintained. These may be
general, perhaps simply requiring that each digital file be visually compared to the original
paper record, or complex, involving quantitative analysis of digital images using computer
equipment to ensure that the properties of a digital file meet accepted international
standards.
The complexity and detail of quality baselines will depend on the project’s aims and the
nature of records involved. Strict and detailed quality control should be applied to digital
images if the project intends to dispose of the original paper records, convert an important
collection for long-term access, build or be compatible with a digital archive initiative, or
make high quality reproductions of a paper document. However, quality tolerances need
to be set at a level that is achievable with the staffing, time, equipment and technology
resources available. For example, if staff are required to spend hours every day fine-
tuning a scanner to achieve quality baseline standards, then perhaps the equipment or the
baselines need to be changed.
Quality baselines should be established for the output device that a digital image is
intended for and be verified using that device33. If a digital image is intended for printing,
then the digital file should be printed and then checked against quality baselines for printed
images. If a digital image is intended for display on a computer monitor, quality baselines
should be verified on a computer monitor.
Calibration
Digitisation projects rely on correctly calibrated hardware and software to produce high
quality images that meet quality baselines. Ideally, calibration parameters should be
recorded, automatically if possible, at the beginning and end of each batch of documents
that are digitised. If the calibration parameters are outside predefined bounds, then
remedial action should take place, with the calibration process repeated until the parameters
are within limits.

31 Quality Assurance. 2004. Technical Advisory Service for Images. Accessed March 2005 at http://www.tasi.ac.uk/advice/creating/quality.html
32 Moving Theory into Practice: Digital Imaging Tutorial. 2003. Cornell University Library/Research Department. Accessed March 2005 at http://www.library.cornell.edu/preservation/tutorial/quality/quality-01.html
33 Frey F. Guides to Quality in Visual Resource Imaging: 4. Measuring Quality Of Digital Masters. 2000. Council on Library and Information Resources. Accessed March 2005 at http://www.rlg.org/visguides/visguide4.html#4.1

The calibration settings for some equipment may need to be checked and recorded at the
beginning and end of a day’s digitisation work. Other equipment may not have any
calibration settings that are user-adjustable, and may only need calibration following
servicing or maintenance. Exact parameters and suggested intervals for calibration should
be determined with input from the hardware and software suppliers, and should be
documented with other quality controls.
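The check of calibration parameters against predefined bounds can be sketched as follows. The parameter names and limits are hypothetical and do not come from any scanner vendor; the actual parameters and tolerances should be taken from the hardware and software suppliers as described above.

```python
def out_of_bounds(readings: dict, bounds: dict) -> list:
    """Return the names of parameters that are missing or outside their bounds."""
    problems = []
    for name, (low, high) in bounds.items():
        value = readings.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Hypothetical limits agreed for a scanner, as (lowest, highest) acceptable values.
BOUNDS = {"white_point": (245, 255), "black_point": (0, 10), "gamma": (2.1, 2.3)}

start_of_batch = {"white_point": 250, "black_point": 4, "gamma": 2.2}
end_of_batch = {"white_point": 238, "black_point": 5, "gamma": 2.2}

for label, readings in (("start", start_of_batch), ("end", end_of_batch)):
    failed = out_of_bounds(readings, BOUNDS)
    print(label, "recalibrate: " + ", ".join(failed) if failed else "within limits")
```

Recording the readings at the start and end of each batch, as in this sketch, makes it possible to identify which batches were scanned while the equipment was out of tolerance.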
To establish acceptable levels of quality for digital image capture, the scanning hardware
system should be tested by the use of scanner test targets or charts such as those shown
in Figure 6. These can contain a wide range of material, allowing output to be judged
in carefully measured increments for such aspects as resolution, text, fonts, line
widths, colour, tonal range, handwriting, and halftone.

Figure 6: Standard “targets” can be used to test the functionality of digitising equipment

Environment
A controlled environment is required to consistently apply quality baselines. In an
uncontrolled environment, for example with excessive glare, reflections or using an
improperly set up computer system, a high quality image may be incorrectly deemed to
have not met quality baselines34. While calibration of hardware and software is one part of
ensuring a controlled environment that is necessary to evaluate digital images, viewing
conditions also need to be considered, as the optimal level for viewing a computer monitor
is in lower light conditions than for a paper based record. The size of the viewing screen,
plus the speed, processing power and memory of the computer need to be considered to
enable the retrieval and manipulation of large image files.
Workstation monitors used for scanning or quality control should be set at appropriate
colour depth, gamma and colour temperature settings and a high refresh rate to avoid a
flickering display. These settings will need to be set for each workstation, and also within
any image manipulation software where the option to adjust settings is available.
A monitor adjustment target, such as one shown in Figure 7, can be displayed on screen
when brightness and contrast adjustments are made, so that all the relevant shades and
steps in the target are distinguishable from the adjacent similar shades.
User-perceived image quality depends on the capabilities of the display hardware being
used, including the screen size and pixel dimensions. Common pixel dimensions
supported by monitors range from a low of 640 x 480 to a high of 1600 x 1200, referring to the
number of horizontal and vertical pixels on the screen for an image.
34 Moving Theory into Practice: Digital Imaging Tutorial. 2003. Cornell University Library/Research Department. Accessed March 2005 at http://www.library.cornell.edu/preservation/tutorial/quality/quality-02.html

What area of an image can be seen on a monitor depends on the image pixel dimensions
and the desktop resolution. The area of an image displayed can be increased by
increasing the screen resolution or by decreasing the image resolution. When viewing
digitised text documents, typical checks can be made by examining the image at actual
size35. However, it may be necessary to enlarge other types of digitised records such as
photographs and maps to ensure details have been captured appropriately. As the
number of pixels displayed increases, more of the image area can be viewed, but without
also increasing the size of the monitor, details may be too small to see without zooming or
magnifying.
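The relationship between image pixel dimensions and desktop resolution can be worked through numerically. The sketch below estimates how much of an image is visible when it is shown pixel for pixel, ignoring window borders and toolbars. The function name is an assumption, and the example uses an A4 page scanned at 300 dpi (2480 x 3508 pixels) on a 1600 x 1200 desktop.

```python
def visible_fraction(image_px, screen_px, zoom=1.0):
    """Fraction of the image area that fits on screen at the given zoom factor."""
    image_w, image_h = image_px
    screen_w, screen_h = screen_px
    scaled_w, scaled_h = image_w * zoom, image_h * zoom
    shown_w = min(scaled_w, screen_w)
    shown_h = min(scaled_h, screen_h)
    return (shown_w * shown_h) / (scaled_w * scaled_h)


a4_at_300dpi = (2480, 3508)
for zoom in (1.0, 0.5, 0.25):
    share = visible_fraction(a4_at_300dpi, (1600, 1200), zoom)
    print(f"zoom {zoom:.2f}: {share:.0%} of the page visible")
```

At full size only about a fifth of the page is visible at once, so an operator checking every image at this zoom level has to scroll. Reducing the zoom shows more of the page but makes fine detail harder to judge.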

Figure 7: A monitor adjustment target can be used to ensure shades of grey can be distinguished

Scope of Quality Assessment


An important aspect of quality control is determining which characteristics will be subject to
inspection and the proportion of digitised images that will be checked. In addition to
comparing a digitised record to the original, the condition and features of the original may
need to be checked prior to digitisation. Almost all of the individual steps involved in
converting a paper record into a readable and accessible digital image can have quality
assessments placed against them. For example, following the capture of the paper record
as an image, the descriptive information or metadata should be checked for completeness
and correctness, the computer media that the images are stored on can be periodically
checked for readability and any computer security measures put into place to control
access to digitised records can be tested. These types of checks would be common to
many records managed by the organisation, not just digital images, and should be
integrated in standard records management processes.
All digital images can be tested, or a representative sample of digitised documents may be
selected. Testing of all digital images will ensure that all images meet the minimum
required quality levels, but can be very time and resource intensive. If, however, only a
sample is tested, care must be taken to ensure that the sample is representative of the
range of records digitised.
Many organisations and scanning bureaux routinely check a 10% sample of scanned
images, chosen randomly. In some cases, such as following equipment repairs, or when using
new staff or a new outsourcing vendor, each image is checked until there is confidence that the
standard is being met36. However, testing only a sample of digital images gives a lower
degree of certainty that all images have met quality baselines.
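Random selection of a sample for checking can be sketched as below. The function name, the image identifiers and the use of a seed are assumptions for illustration. Recording the seed is one possible way of allowing an auditor to reproduce the same selection later.

```python
import math
import random


def select_sample(image_ids, percent=10, seed=None):
    """Pick a random sample of image identifiers for quality checking."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ids = list(image_ids)
    # Round up so that even a very small batch has at least one image checked.
    size = math.ceil(len(ids) * percent / 100)
    return sorted(random.Random(seed).sample(ids, size))


batch = [f"IMG{number:04d}" for number in range(1, 201)]
sample = select_sample(batch, percent=10, seed=2006)
print(len(sample), "of", len(batch), "images selected")

# After equipment repairs or with new staff, every image can be checked instead.
everything = select_sample(batch, percent=100)
```

A purely random sample may under-represent unusual records in the batch, so the caution above about ensuring the sample is representative still applies.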
35 A 21” CRT monitor with a display resolution of 1280 x 1024 is able to display an A4 sized page at 1:1 scale.
36 Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed March 2005 at http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf

Qualitative and Quantitative Assessment
The quality of images may also be evaluated using software to examine technical aspects
of images. For example, noise in images is caused by random pixel fluctuations, and may
make images appear grainy37. Software can be used to measure the level of noise in
images, to check that it is minimised to an acceptable level.
Measuring technical aspects gives a consistent and repeatable measure of image quality.
For instance, if noise in an image is measured twice, the same levels should be detected.
However, software tools to measure all aspects of quality and image appearance may not
be widely used or available. Instead, some quality control relies on human judgement.
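One simple way of putting a number on noise is to measure the spread of pixel values in an area that should be a single flat tone, such as a grey patch on a scanner test target. The sketch below uses the standard deviation for this purpose. It is one possible measure rather than a method prescribed by this guideline, and the patch values are invented.

```python
from statistics import mean, pstdev


def patch_noise(patch):
    """Return (mean, standard deviation) of the grey values in a 2-D patch."""
    values = [value for row in patch for value in row]
    return mean(values), pstdev(values)


# Grey values sampled from an area that should be a uniform mid-grey.
clean_patch = [[200, 200, 200], [200, 200, 200]]
grainy_patch = [[196, 204, 199], [203, 197, 201]]

for name, patch in (("clean", clean_patch), ("grainy", grainy_patch)):
    level, noise = patch_noise(patch)
    print(f"{name}: mean {level:.1f}, noise {noise:.2f}")
```

The same patch always produces the same figure, which is the repeatability referred to above. An acceptable noise threshold would still need to be agreed as part of the quality baselines.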
Human judgement is often subjective and therefore results of visual inspections may vary
from person to person. Ideally, if a number of staff are responsible for visual inspections,
training should be provided to communicate qualitative information effectively38.

Recommended Quality Checks

Digital images should be inspected and checked against the attributes listed below, in
addition to standard records management quality checks.
Has a true and accurate copy been made of the original record? Attributes to check
include:
• image size
• image resolution
• bit depth: bitonal, greyscale or appropriate colour depth
• too light or too dark
• too low or too high contrast
• lack of sharpness
• too much sharpening, unnatural appearance and halos around dark edges
• image orientation
• skewed or not centred
• images cropped or incomplete
• missing pixels or scan lines
• poor quality dithering
• obvious use of lossy compression
Has the image file been stored correctly? Issues to check include:
• file format
• file size
• incomplete or incorrect profile information / metadata
• appropriate security applied
• necessary derivatives produced
Have procedures for the disposal of the original paper record been followed?
Has the equipment been calibrated correctly?
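Several of the attributes listed above (file format, image resolution, bit depth and image size) can be compared automatically against the agreed specification, leaving the visual attributes to an operator. The sketch below shows one way this might be structured. The class, field names and example values are hypothetical, and the properties are assumed to have already been read from the file by other software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageProperties:
    file_format: str
    dpi: int
    bit_depth: int
    width_px: int
    height_px: int


def check_against_spec(actual, spec, size_tolerance_px=0):
    """Return the names of the measurable attributes that do not match the spec."""
    failures = []
    if actual.file_format != spec.file_format:
        failures.append("file format")
    if actual.dpi != spec.dpi:
        failures.append("image resolution")
    if actual.bit_depth != spec.bit_depth:
        failures.append("bit depth")
    if (abs(actual.width_px - spec.width_px) > size_tolerance_px
            or abs(actual.height_px - spec.height_px) > size_tolerance_px):
        failures.append("image size")
    return failures


# Hypothetical specification: bi-tonal A4 page at 300 dpi stored as TIFF.
spec = ImageProperties("TIFF", 300, 1, 2480, 3508)
scanned = ImageProperties("TIFF", 300, 1, 2478, 3400)
print(check_against_spec(scanned, spec, size_tolerance_px=5))
```

In this example the image height is well short of the specification, which may indicate a cropped or incomplete scan that should be inspected against the paper original.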

37 Sharma A. Digital Noise, Film Grain. 2001. Digital Photo Techniques. Accessed March 2005 at http://www.phototechmag.com/sample/sharma.pdf
38 Moving Theory into Practice: Digital Imaging Tutorial. 2003. Cornell University Library/Research Department. Accessed March 2005 at http://www.library.cornell.edu/preservation/tutorial/quality/quality-02.html

6.6 Master Files and Derivatives


The process of initial capture of a paper record as an image is often very resource- and
time-intensive. The majority of this time and effort could be spent on such tasks
as locating the paper file, obtaining it from storage, separating the pages, and
reassembling and storing the paper file after capture. The time taken to actually scan the
record to the highest quality that the device allows and save it in a lossless compression
format may pale into insignificance when compared to the time taken to handle the paper.
Usually, if digitisation is undertaken at a high level of quality with good quality control
measures, the original paper file may not need to be routinely accessed.
A single file format may not be appropriate for all intended uses of a digitised document.
Consequently, a master file can be complemented by a number of derivatives to meet
business and user needs.
As outlined in section 6.4: File formats, different file formats have different characteristics
and applications. TIFF files, although capturing an image without compromise, may not be
suited for use on the internet. GIF and PNG files are both commonly used on the internet,
but may not be suitable for high-quality printing of certain types of images. JPEG, GIF,
and PNG files are often of a small enough size to be deliverable via a network, but the
network delivery of TIFF and PDF files depends on their content and intended use.
Master Files
The goal of a master image file is to provide a high-quality, unedited, information-rich copy
and to prevent the need for re-digitisation in the future39. A master file should capture as
much information as possible from the original document, and should be of the highest
quality possible. The quality available to derived files depends entirely on the quality of the
master files – a poor quality master file can only result in poor quality images derived from
it. The creation, use and storage of master files should be subject to strict quality control.
The image resolution is the main variable factor in controlling the quality of images from
scanning equipment. Using higher resolutions than those recommended in this document
will provide a higher quality image. However, as described earlier, colour depth should be
selected based on the characteristics of the paper record, and there is no real benefit in
increasing the colour depth beyond the recommendations provided earlier in this chapter
when creating a master file.
Master files should be protected from damage through excessive handling or overuse of
media and also kept secure from deliberate change or deletion. In many ways, the
traditional process of microfilming records is similar to this aspect of the digitisation
process. Just as several copies of microfilm are made for a variety of purposes, several
copies of an imaged record may be made to preserve the image while providing access.
Derived files
Several types of files that can be derived from the master image file are described below.
The process of creating these derivatives will vary, but should be implemented as a routine
part of the digitisation process. Derived files typically have a smaller file size than the
original images and the storage of these extra files should be considered when
determining the system requirements for storage of digitised records.
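As a rough illustration of how these storage requirements might be estimated, the sketch below calculates the uncompressed size of a scanned page from its dimensions, resolution and bit depth, then adds an allowance for derivatives. The function names, the A4 example and the 10% derivative allowance are assumptions made for illustration only; actual file sizes depend on the file format and compression used.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed image size: pixel count x bits per pixel / 8."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def estimate_storage_bytes(pages, master_bytes, derivative_ratio=0.1):
    """Total storage for masters plus their derivatives.

    derivative_ratio is an assumed fraction of the master size taken up by
    all derivatives of one page (hypothetical default of 10%).
    """
    per_page = master_bytes * (1 + derivative_ratio)
    return int(pages * per_page)


# Example: an A4 page (about 8.27 x 11.69 inches) at 300 dpi, 24-bit colour.
a4_master = uncompressed_size_bytes(8.27, 11.69, 300, 24)
print(f"One A4 master: about {a4_master / 1_000_000:.1f} MB uncompressed")
total = estimate_storage_bytes(10_000, a4_master)
print(f"10,000 pages with derivatives: about {total / 1_000_000_000:.0f} GB")
```

An estimate of this kind gives an upper bound for planning purposes, since lossless compression will usually reduce the stored size of the master files.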
It is possible for derived files to be generated by the system when they are requested. For
example, the Microsoft Windows XP operating system automatically generates thumbnails
when a user views a directory of images. However, for a multi-user system such as an
eDRMS, the one time generation and then storage of derived files will probably be more
efficient. The relatively small amount of computer storage required for the storage of

39
Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed March 2005 at
http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf


derivatives is generally preferred to using the processing required to generate derivatives
on the fly. Any stored derivatives should be managed, so that they are updated or deleted
in sync with the master file.
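One possible way of keeping stored derivatives in sync with the master is to record, against each derivative, a checksum of the master file it was generated from. The sketch below is a minimal illustration of that idea; the Derivative class, the function names and the sample data are hypothetical and are not drawn from any particular eDRMS.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Derivative:
    name: str            # e.g. an access copy or thumbnail
    master_sha256: str   # checksum of the master it was generated from


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def stale_derivatives(master_data: bytes, derivatives):
    """Return derivatives whose recorded checksum no longer matches the master."""
    current = sha256_of(master_data)
    return [d for d in derivatives if d.master_sha256 != current]


# Example with placeholder bytes standing in for image data.
old_master = b"master image, first capture"
new_master = b"master image, rescanned"
stored = [
    Derivative("access copy", sha256_of(new_master)),
    Derivative("thumbnail", sha256_of(old_master)),
]
for d in stale_derivatives(new_master, stored):
    print(f"{d.name} needs to be regenerated or deleted")
```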
Some image processing and management software may have the ability to modify the
appearance of a digitised record by adding information such as the date or organisation
name. Two such techniques are watermarking and fingerprinting. Watermarking is the
inclusion of static information on an image at time of storage, perhaps the name of the
organisation and date of capture. Fingerprinting typically includes information generated
when the image is accessed, such as login name of the end user and date / time
information.
While this information may be useful, and the inclusion of it as part of the image
convenient, it may be considered that these modified images are no longer a true and
accurate copy of the original paper records. This may be especially relevant where this
added information, such as a large watermark through the text, makes the content of the
record difficult to read. Public authorities should consider retaining the master image as
an unmodified representation of the paper record or capturing this additional information as
metadata rather than part of the image. Any allowed modifications must be documented in
policies and procedures.
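As a minimal sketch of capturing fingerprint information as metadata rather than as part of the image, the function below builds a simple access entry that a system could store alongside the record. The function name, field names and sample identifier are assumptions for illustration; they do not correspond to any metadata standard discussed in this guideline.

```python
from datetime import datetime, timezone


def access_event(record_id, user_login, when=None):
    """Build a metadata entry for one access to an image.

    The image file itself is left untouched; the login name and the
    date / time are recorded as metadata instead.
    """
    when = when or datetime.now(timezone.utc)
    return {
        "record_id": record_id,
        "accessed_by": user_login,
        "accessed_at": when.isoformat(),
    }


# Example with a hypothetical record identifier and login name.
print(access_event("99-5-4", "jsmith"))
```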
Combined with appropriate processes for accessing files, using derived copies of a master
file and leaving the master itself in secure off-line storage can help reduce the risk of
corruption, alteration or loss of the master files. Derived images should
be managed, usually in an eDRMS or other system, to ensure they are accessible with
similar constraints and permissions as the master image and paper record. Both master
and any derived versions of a scanned record should be destroyed when the paper record
is authorised for destruction, unless there is a business need for on-going retention.
• Backup copies
Exact copies of the master file may be made and stored in separate locations to avoid a
total loss of information in the event of a disaster such as a fire or computer virus attack.
• Access copies
The most commonly used version of the scanned record should be a complete and
accurate representation of the paper record, while having characteristics that allow it to be
easily distributed, printed and viewed. For digitisation programs with the aim of providing
improved access to records or integrating with electronic systems, the access version may
be the only derivative required. Access files should meet the technical recommendations
for file format and resolution provided earlier in this chapter. PDF is recommended for use
as an access file format, provided the images encapsulated in the PDF file are retained
separately.
Access images are typically of a lower quality than master images in a format that is more
accessible to a wider audience. Access copies normally compromise on quality, perhaps
with a more severe compression ratio, so that files can be easily accessed through on-line
storage, over a network connection or via a web page.
The purpose of an access image should dictate its characteristics. If an access image is
provided for viewing on a computer monitor, it should be sized to fit within the viewing area
of an average monitor. If the file is to be accessed through a low bandwidth network
connection, minimising file size would be critical.
• Thumbnails


Thumbnail images are very small images designed to display instantaneously on web
pages and file management software, allowing users to determine whether they want to
view an access image. Thumbnails are best used when dealing with a collection of
pictorial images, but they are not very useful for images of text documents due to the
difficulty in determining the textual content within a very small image40.
• Text
Using optical character recognition (OCR), the text depicted in a scanned paper document
can be extracted as a text file or word processor document. OCR software is required to
recognise the text contained in the image and usually provides search and export
capabilities. OCR is rarely a fully automated process and may require operator
intervention to assist in obtaining an accurate transcription of the scanned record’s text.
Documents containing handwriting, serif fonts41, halftones, and background text or images
or those that are damaged or dirty may not be suited to the OCR process.
As with other derived files, if OCR is used to derive a text document from an imaged
record, this additional file should be managed by the system.
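For access images intended for viewing on a monitor, as discussed above, the pixel dimensions of the derivative might be calculated as in the sketch below. The function name and the 1024 x 768 viewing area are assumptions for illustration; the calculation simply scales the image down while preserving its aspect ratio and never enlarges it.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit a viewing area, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # never enlarge
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example: an A4 page scanned at 300 dpi (2481 x 3507 pixels), sized for an
# assumed viewing area of 1024 x 768 pixels.
print(fit_within(2481, 3507, 1024, 768))
```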

Recommended Derivatives

Public authorities digitising records to improve access and integrate into other business
systems should aim to create images that meet the technical recommendations
provided earlier in this chapter.
Those organisations with additional capacity, wishing to take advantage of future
technical advances without having to rescan paper records, should consider creating a
high quality master copy in addition to an access copy. Any modification of the
appearance of the image should be performed using a copy, with an unmodified original
also retained.
The systems used to manage the digitised paper records should be tested for their
ability to manage multiple images if the decision to create derived versions of imaged
records is made.

40
Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed March 2005 at
http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf
41
A small decorative line added as embellishment to the basic form of a character. Typefaces are often described as being serif or
sans serif (without serifs). The most common serif typeface is Times New Roman. A common sans serif typeface is Arial.


7: Metadata
Metadata is often described simply as 'data about data'. A more useful definition of
metadata is “structured information that describes and/or allows us to find, manage,
control, understand or preserve other information over time”42. While metadata has always
been utilised in the recordkeeping and archival professions, it has only been described as
'metadata' for the past decade. Many routine operations such as the profiling of records
and files, cataloguing of library resources, and describing archival items can all be
described as metadata collection.
Metadata describes information objects or resources and may be used for many purposes,
including the management, control and discovery of records43. As outlined in Information
Standard 34 – Metadata (IS34), metadata must be used to maintain the context, content
and structure of records in electronic environments. The creation, retention and
preservation of metadata is integral to the concept of records as evidence44. Ideally the
collection of metadata for digitised documents will be part of an agency-wide metadata
strategy which is consistent with the requirements of IS34.
A metadata standard (also known as a schema) provides a list of the elements that define
the individual pieces of information that should be captured to describe the record. Use of
a metadata standard as opposed to a locally developed set of metadata elements will:
• encourage best practice;
• assist the end users;
• avoid redoing work that has already been done elsewhere;
• provide system vendors with certainty; and
• support interoperability between applications.45
System developers, vendors and records management staff should have a good
understanding of metadata standards to facilitate their implementation in an organisation.
Vendors and system developers providing business information systems and applications
used to manage digitised records of public authorities should ensure the capability exists
to record metadata to the appropriate standard.46 Records managers should be involved
in liaising with these parties to certify that appropriate metadata is able to be captured,
stored and managed in the system they are purchasing.
Tools such as templates and data entry forms which facilitate the entry of metadata in a
user friendly manner may be supplied by the vendor or developed in house. Additionally,
records managers should develop in-house metadata procedures and policies to suit their
particular business requirements.
Users of locally developed systems, or those who manually record metadata in a
spreadsheet or records ledger, will need to be aware of the elements that make up the
relevant metadata standards and aim to record them correctly.

42
DIRKS – Glossary. 2001. National Archives of Australia. Accessed March 2005 at
http://www.naa.gov.au/recordkeeping/dirks/dirksman/glossary.html
43
Glossary of Archival and Recordkeeping Terms. 2004. Queensland State Archives. Accessed March 2005 at
http://www.archives.qld.gov.au/downloads/GlossaryOfArchivalRKTerms.pdf
44
Information Standard 34, Metadata. 2004. Office of Government ICT. Accessed March 2005 at
http://www.governmentict.qld.gov.au/02_infostand/standards/is34.htm
45
Cunningham A. Metadata Standards in Australia – An Overview. 2005. Presentation at Queensland State Archives March 2005.
National Archives of Australia.
46
Overview of Classification Tools for Records Management. 2003. National Archives of Australia. Accessed March 2005 at
http://www.naa.gov.au/recordkeeping/control/tools.pdf


Capturing and maintaining accurate metadata for digitised paper records is essential, as
information cannot be effectively managed or used without metadata. The scope of this
document does not extend to providing a full description of recordkeeping metadata and its
use, which applies to all records and recordkeeping systems. Instead the focus will be
placed on the recording and management of specific metadata pertaining to digitised
records.
7.1 Metadata Types
There are many types of metadata tailored to data types including geographic information,
financial applications, and rights management. Three main metadata types apply to the
digitisation process and subsequent management of records, namely resource discovery,
recordkeeping, and technical imaging metadata.
Resource Discovery Metadata
Resource discovery metadata contributes to enabling the discovery and management of
online information resources. Information Standard 34: Metadata mandates the use of the
Australian Government Locator Service (AGLS) metadata standard, or other standards
compatible with it, for all information resources, including records.47
Information Standard 34: Metadata requires that public authorities must, at a minimum,
adopt metadata schemes that are interoperable with the AGLS Metadata Element set and
are consistent with the Queensland Government AGLS Element Implementation Standard.
AGLS is a resource discovery standard and does not include elements required for
records management processes, such as disposal.
Recordkeeping Metadata
Information Standard 40, Recordkeeping, recommends the use of the National Archives of
Australia’s Recordkeeping Metadata Standard for Commonwealth Agencies. This standard
is compatible with AGLS and extends AGLS by including elements required for managing
records.
The recording and maintenance of recordkeeping metadata for digitised records can assist
with:
• A means of searching and identification,
• Authentication,
• Preservation of the content and context,
• Information on retention and disposal,
• Auditing and restriction of use, and
• Interoperability with other systems.48
An Australian recordkeeping metadata standard is currently under development. It is likely
that the National Archives of Australia’s metadata standard will be revised in the future to
align with the national standard. It is expected this national standard will include guidance
on implementation. Further information on recordkeeping metadata can be found in the
Public Records Alert, Understanding and Applying Recordkeeping Metadata49.

47
For further information on AGLS including the list of elements see National Archives of Australia’s AGLS implementation manual at
http://www.naa.gov.au/recordkeeping/gov_online/agls/cim/cim_examples.html
48
Cunningham A. Metadata Standards in Australia – An Overview. 2005. Presentation at Queensland State Archives March 2005.
National Archives of Australia.
49
Public Records Alert No 2/05: Understanding and Applying Recordkeeping Metadata. 2005. Queensland State Archives. Accessed
March 2005 at http://www.archives.qld.gov.au/publications/PublicRecordsAlert/PRA205.pdf


Technical Imaging Metadata


Information about the digitisation process such as the operator’s login name, date of
scanning, technical scanning parameters and equipment used should be recorded for
digitised records. The National Information Standards Organisation (NISO) has developed
a detailed draft metadata standard for still images. The draft standard for trial use was
released in June 2002, and is awaiting approval. The draft standard contains four
categories of metadata elements, described below50.
• Image parameters: items that are fundamental to the reconstruction of the digital file
as a viewable image on electronically interfaced displays.
• Image creation: descriptive technical metadata; information with respect to the
logistics and administrative conditions surrounding digital image data capture.
• Imaging performance assessment: attributes of the image inherent to its quality.
• Change history: documenting processes applied to image data over the lifecycle of an
image.
While the above NISO draft standard would comprehensively describe a digital image, it
does not include other important information relating to recordkeeping which will be
covered by the recordkeeping metadata elements. Overlap of metadata elements when
multiple schemas are used should be addressed to ensure that the information is only
recorded once, but referenced in all schemas.
The following table shows an example of technical metadata collected by the State Library
of Queensland to facilitate management, storage and preservation of digital files51.
Element               | Example
Accession number      | 99-5-4
File name             | 2468.tif
Digital image creator | Image Production Unit
Link date             | 22 May 2002 (automatically generated by ImageServer)
Capture device        | Heidelberg Nexscan
Capture software      | Silver Fast
Manipulation          | Photoshop 6
Digital source        | Copy negative
Scan resolution       | 600dpi
File bit depth        | 8bit
Format                | tiff
Colour                | Greyscale
Image manipulation    | (level contrast unsharpmask)
Colour profiles       | No
Table 9: State Library Queensland Technical Metadata Elements
Other metadata elements that may be considered include file size, compression technique,
creation hardware, creation software and creation methodology52.
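To illustrate how technical metadata of this kind might be held in a structured form, the sketch below defines a simple record type and populates it with the example values from Table 9. The class and field names are assumptions derived from the table's element names; they are not part of the State Library of Queensland standard or the NISO draft standard.

```python
from dataclasses import dataclass, asdict


@dataclass
class TechnicalMetadata:
    # Field names are illustrative, based on the element names in Table 9.
    accession_number: str
    file_name: str
    digital_image_creator: str
    capture_device: str
    capture_software: str
    manipulation: str
    digital_source: str
    scan_resolution_dpi: int
    file_bit_depth: int
    file_format: str
    colour: str
    colour_profiles: bool


record = TechnicalMetadata(
    accession_number="99-5-4",
    file_name="2468.tif",
    digital_image_creator="Image Production Unit",
    capture_device="Heidelberg Nexscan",
    capture_software="Silver Fast",
    manipulation="Photoshop 6",
    digital_source="Copy negative",
    scan_resolution_dpi=600,
    file_bit_depth=8,
    file_format="tiff",
    colour="Greyscale",
    colour_profiles=False,
)

# asdict() gives a plain dictionary that could be exported or loaded into
# whatever system is used to manage the metadata.
print(asdict(record))
```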
Appropriate metadata for digitised paper records will depend on the agency and the
rationale of the digitisation project. Choosing a small number of metadata elements may
cut down on data entry time, but may result in vital information missing from the collection
and limit resource discovery. Deciding to record an excessive number of metadata

50
Data Dictionary—Technical Metadata for Digital Still Images. 2003. National Information Standards Organization and
AIIM International. Accessed March 2005 at http://www.niso.org/standards/resources/Z39_87_trial_use.pdf
51
Digital Standard 1 – Cataloguing and Metadata for Digital Images. 2003. State Library of Queensland. Accessed March 2005 at
http://www.slq.qld.gov.au/__data/assets/file/5449/sd1_meta_v1.2.doc
52
Metadata and Digital Images. 2004. Technical Advisory Service for Images. Accessed March 2005 at
http://tasi.ac.uk/advice/delivering/metadata.html and Suggested Technical Metadata Elements. 2004. Indiana Digital Library. Accessed
March 2005 at www.statelib.lib.in.us/www/isl/diglibin/techmeta.pdf.


elements will create a metadata database that is very large. A large number of manually
entered metadata elements will also burden staff who are required to enter the information,
potentially leading to a lack of attention to detail and a poor quality collection. Selecting
the metadata elements to record, and determining which of these are mandatory or
optional to record should be part of the early phases of the digitisation project.
7.2 Capturing Metadata
As the manual collection and entry of metadata is a mundane task, automating the entry of
as many elements as possible should be a priority. Some metadata, such as titles and
comments, needs to be manually collected and linked to the digitised file. Other metadata,
such as recording the technical properties of a scanned image, can be collected
automatically. Table 10 shows a small sample of the metadata elements that can be
automatically captured.
Element | Source | Comment
Operator name | Computer operating system | Name can be extracted from the login account of the user
Capture device | Computer operating system | Including hardware, software, driver versions
Date of capture | Computer operating system |
Device calibration results | Device driver |
Time since calibration | Device driver / Computer operating system |
Image resolution, colour depth, compression, file format & sub-formats | Imaging software |
Image file name | Imaging software |
Image’s parent collection details | Imaging software | Entered initially by the operator and retained for subsequent images until changed

Table 10: Some metadata elements that may be automatically captured


The automation of the entry of these technical details will allow staff to concentrate on
entering the details that cannot be automatically captured. All metadata entries, including
those elements captured automatically, need to be periodically or randomly checked to
ensure they are being captured correctly.
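As a hedged illustration of automatic capture, the sketch below gathers a few of the elements that a computer operating system can supply without operator input. The function name and the dictionary keys are assumptions; elements that come from the device driver or imaging software (such as calibration results and resolution) are not covered, as the way to obtain them depends on the particular product.

```python
import getpass
import os
import platform
from datetime import datetime, timezone


def capture_automatic_metadata(image_path):
    """Collect metadata that the operating system can supply for one image file."""
    try:
        operator = getpass.getuser()  # login account of the current user
    except Exception:
        operator = "unknown"          # no login name available in this environment
    info = os.stat(image_path)
    return {
        "operator_name": operator,
        "capture_host": platform.node(),
        "operating_system": platform.platform(),
        "date_recorded": datetime.now(timezone.utc).isoformat(),
        "image_file_name": os.path.basename(image_path),
        "file_size_bytes": info.st_size,
        "file_format": os.path.splitext(image_path)[1].lstrip(".").lower(),
    }
```

The function would be called once for each image as it is saved, with the result stored alongside the manually entered elements such as titles and comments.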
Creating and maintaining metadata requires on-going resources, staff and perseverance.
Collection of metadata is most cost-effective if it occurs at the time a digital file is created,
as retrospectively collecting metadata is a large and tedious task.
The processes for metadata collection should be easy to use and follow and appropriate
support (such as manuals) should be provided53. Training for staff involved in the creation
and maintenance of metadata is critical to the successful collection of metadata.
7.3 File Naming Conventions
In many systems, a file name is the primary identifier for electronic files and places
electronic records in context with other electronic records and activities54. Consideration of

53
Thornely J. The How of Metadata: Metadata Creation and Standards. 1999. 13th National Cataloguing Conference, October 1999.
Accessed March 2005 at http://www.slq.qld.gov.au/__data/assets/file/6289/How_of_Metadata.doc


file naming conventions is important in a digitisation project as consistent file naming
allows easy management, retrieval and use of digital records. Ideally, file naming within
digitisation projects should be part of an agency-wide file naming policy for electronic
records.
In larger digitisation implementations, where a system such as an eDRMS is used to
manage the imaged records, end users may not have any interaction whatsoever with the
actual image files. Unlike these large systems where the file names of the image files are
essentially transparent to the users, smaller systems and manual digitisation processes
may require the naming of files in a consistent and understandable manner.
Descriptive File Names
General guidelines for file naming conventions suggest that file names should be unique,
follow a predictable pattern, provide a meaningful title and be structured and easy to
follow. File names should use standard extensions such as .jpg and .tif. In addition, file names
should avoid spaces, tabs and characters used by the system (such as “/”, “?”, and “|”) that
might not work across platforms or might be incompatible with the operating system and storage
media used for digital files.
The creator of an electronic file is responsible for naming the file so that the file can be
communicated to other people. Inconsistent naming of files can make locating files
problematic, lead to frustrating searches and wasted time, and may result in information
being unavailable when it is needed. These principles apply to descriptive file names that
are manually generated and non-descriptive file names that may be generated
automatically by software.
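The general guidelines above could be checked automatically before a file is saved. The sketch below is one possible approach; the list of standard extensions, the rule that names contain only letters, digits, underscores and hyphens, and the function name are all assumptions that an agency would replace with its own file naming policy.

```python
import os
import re

# Assumed list of standard extensions; adjust to local policy.
STANDARD_EXTENSIONS = {".jpg", ".tif", ".png", ".gif", ".pdf"}
# Assumed rule: letters, digits, underscores and hyphens only.
SAFE_STEM = re.compile(r"[A-Za-z0-9_-]+")


def file_name_problems(name, existing_names=()):
    """Return a list of problems found with a proposed file name."""
    problems = []
    stem, extension = os.path.splitext(name)
    if extension.lower() not in STANDARD_EXTENSIONS:
        problems.append("non-standard or missing extension")
    if not SAFE_STEM.fullmatch(stem):
        problems.append("contains spaces or special characters")
    if name.lower() in {n.lower() for n in existing_names}:
        problems.append("name is not unique")
    return problems


print(file_name_problems("PN739_medkit_v1.tif"))
print(file_name_problems("media kit?.tif"))
```

An empty list indicates that the name passed every check. Uniqueness is compared without regard to case, on the assumption that files may be moved to a case-insensitive file system.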
A number of different elements may be included in a file naming convention. Some
suggested elements may include the following.55
Item | Example | Filename Component
Version Number | version 1 | v1, vers1
Date Of Creation | February 24, 2001 | 022401, 02_24_01
Name Of Creator | Rupert B. Smith | RBSmith, RBS
Description Of Content | media kit | medkit, mk
Name Of Intended Audience | general public | pub
Name Of Group Associated With The Record | Committee ABC | CommABC
Release Date | released on June 11, 2001 at 8:00 a.m. central time | 61101_0800CT
Publication Date | published on December 24, 2003 | pub122403
Project Number | project number 739 | PN739
Department Number | Department 140 | Dept140
Records Series | SeriesX | s_x
Table 11: Inclusion of file characteristics as parts of the file name
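A file naming convention built from elements such as those in Table 11 can be applied consistently by generating names rather than typing them. The sketch below assumes a hypothetical convention of project number, content description, version and creation date joined by underscores. Note that it writes the date as year, month, day (so that names sort chronologically), which differs from the month-first examples in the table.

```python
from datetime import date


def build_file_name(project_number, description, version, created, extension):
    """Assemble a file name from selected elements of a naming convention."""
    parts = [
        f"PN{project_number}",        # project number, e.g. PN739
        description,                  # short description of content, e.g. medkit
        f"v{version}",                # version number, e.g. v1
        created.strftime("%Y%m%d"),   # date of creation as YYYYMMDD (assumed format)
    ]
    return "_".join(parts) + "." + extension.lstrip(".").lower()


print(build_file_name(739, "medkit", 1, date(2001, 2, 24), "tif"))
```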
Another consideration in file naming conventions is how derivatives are distinguished from
master files. The State Library of Queensland56 provides detailed naming conventions
where a qualifier is added to the filename to indicate if the file is a master, preview,
research or thumbnail image, as shown in Table 12.

54
Electronic Records Management Guidelines: File Naming. 2004. Minnesota State Archives. Accessed March 2005 at
http://www.mnhs.org/preserve/records/electronicrecords/erfnaming.html
55
Electronic Records Management Guidelines: File Naming. 2004. Minnesota State Archives. Accessed March 2005 at
http://www.mnhs.org/preserve/records/electronicrecords/erfnaming.html
56
Digital Standard 2 – Digital capture, format & preservation. 2003. State Library of Queensland Accessed March 2005 at
http://www.slq.qld.gov.au/__data/assets/word_doc/12788/sd2_digcapture1.doc


If an embedded file naming convention is used, the derivative information could be
included in the folder structure. In some cases, the file format extension may be sufficient
to distinguish between derivatives. It may be beneficial for derived versions of a file to be
able to be linked back to the master file through the file name57. If the file name of a
derivative is going to be used to retrieve higher quality versions of a file, then the derivative
file name must include enough information to link it to other versions as well as the master
file.
File Type | Filename Suffix
Master    | .tif
Preview   | p.jpg
Research  | r.jpg
Thumbnail | b.jpg
Table 12: Derivative Qualifiers used by the State Library of Queensland
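The qualifiers in Table 12 could be applied, and reversed to find the master file, along the lines of the sketch below. It assumes that the qualifier is appended directly to the base name of the master (so that 2468.tif yields 2468p.jpg) and that every master is a .tif file; both are assumptions for illustration rather than a statement of the State Library of Queensland's actual practice.

```python
import os

# Suffixes from Table 12; how they attach to the base name is an assumption.
QUALIFIERS = {"preview": "p.jpg", "research": "r.jpg", "thumbnail": "b.jpg"}


def derivative_name(master_name, kind):
    """Return the derivative file name for a master file name."""
    stem, _extension = os.path.splitext(master_name)
    return stem + QUALIFIERS[kind]


def master_name_for(derivative):
    """Return the master file name that a derivative file name links back to."""
    for suffix in QUALIFIERS.values():
        if derivative.endswith(suffix):
            return derivative[: -len(suffix)] + ".tif"
    raise ValueError(f"not a recognised derivative name: {derivative}")


print(derivative_name("2468.tif", "preview"))
print(master_name_for("2468b.jpg"))
```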
Non-descriptive File Names
The file naming conventions discussed so far have provided meaningful names for digital
files. An alternative to meaningful names is to have non-descriptive file names. Non-
descriptive file names are often computer or machine generated numerical strings, which
have little or no correspondence to the content of the file58. Many software packages
designed for the rapid digitisation of a large amount of paper records will use sequential
numeric or alphanumeric filenames – the operator setting the starting point for the file
name and the software automatically incrementing as subsequent files are digitised.
If non-descriptive file names are used, the files must be associated with file metadata to
allow the file to be identified. It is crucial that the link between the image file and the
descriptive information is maintained, as the file names will probably be meaningless to the
human eye.
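The sketch below illustrates sequential, non-descriptive file names and the link to descriptive metadata that makes them usable. The starting number, the eight-digit width and the sample titles are hypothetical; in practice the index would be held in the system managing the records rather than in a dictionary.

```python
from itertools import count


def sequential_names(start, extension="tif", width=8):
    """Yield zero-padded sequential file names from an operator-set starting point."""
    for number in count(start):
        yield f"{number:0{width}d}.{extension}"


# Link each generated file name to its descriptive metadata.
index = {}
names = sequential_names(1000)
for title in ["Minutes of meeting, 3 March 2005", "Correspondence register"]:
    index[next(names)] = {"title": title}

for file_name, metadata in index.items():
    print(file_name, "->", metadata["title"])
```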

Recommended Metadata

In terms of metadata, imaged records should be described in a similar fashion to their
paper originals, with the additional recording of information about the digitisation
process. A standard, such as the Recordkeeping Metadata Standard for
Commonwealth Agencies, should be used, which also fulfils the Information Standard
34 requirement for compliance with AGLS. Additional metadata should be recorded to
document the image using an appropriate technical image metadata schema. In
addition, disposal metadata should capture information about the disposal of both the
original paper record and the image.
If acquiring a new system for the management of digitised records, system vendors
should be approached regarding the capability of their system to record metadata to
defined standards in an unobtrusive and convenient manner.
File naming conventions need only be considered for small systems where the image
files are accessible to the end users. When descriptive filenames are used, they should
avoid incompatibilities with operating systems and be understood by appropriate staff.

57
Technical Guidelines for Digitizing Archival Materials for Electronic Access. 2004. National Archives and Records Administration
(US). Accessed March 2005 at http://www.archives.gov/research_room/arc/arc_info/techguide_raster_june2004.pdf
58
Guidelines for management, appraisal and preservation of electronic records. 1999. Public Record Office, The National Archives
(UK). Accessed March 2005 at http://www.nationalarchives.gov.uk/electronicrecords/advice/pdf/procedures2.pdf


8: Storage and Media Options
Digital records can be stored in a variety of ways, each having an impact on the
convenience, security and longevity of the records. Paper records may be secure and
preserved if stored in a climate controlled archive repository, but can be inconvenient to
access. On the other hand, while a file sitting on a desktop can be accessed almost
instantly, its security and integrity can easily be compromised. As the characteristics of
access, security and preservation are often given as the impetus of a record digitisation
program, the appropriate storage of digital records is crucial to the program’s success.
Figure 8: The relationship between access time, cost and capacity of various computer storage types.
(Figure not reproduced. It compares cache, hard disks, secondary hard disks / optical library, and
tapes in terms of cost per Mb, access time and capacity.)
8.1 On-line, Near-line, and Off-line Storage
For computer files, including digitised paper records, options for storage can be
categorised as on-line, near-line and off-line. A digitisation program will normally make
use of a combination of storage options, for instance off-line storage for master copies of
digitised images with online storage for derived versions of a file. Ideally, at least the
master file will be duplicated in a number of locations to ensure against accidental
damage, corruption or media deterioration, and to provide disaster recovery options.
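Where master files are duplicated across locations, each copy can be checked against the master by comparing checksums. The sketch below uses SHA-256 for this purpose; the choice of algorithm, the function names and the idea of running the check as a routine task are assumptions for illustration rather than requirements of this guideline.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(master_path, copy_paths):
    """Return the paths of copies whose content differs from the master file."""
    expected = file_sha256(master_path)
    return [path for path in copy_paths if file_sha256(path) != expected]
```

An empty result from mismatched_copies indicates that every copy is identical to the master. A copy that appears in the result may have been corrupted or altered and should be investigated before it is relied upon.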
On-line storage refers to the storage of files on networks or hard disks so that files are
immediately available to a computer system. On-line storage provides the fastest and
easiest access to files, but is often expensive. If the files are large, high bandwidth
networks may be needed to provide fast and easy access. On-line storage is normally the
most limited in the amount of storage space that is offered.
Near-line storage offers lower costs than on-line storage at the expense
of access speed. Near-line storage media includes magnetic disks, magneto-optical
storage and robotically accessed optical media and tape libraries. Files that are stored
using near-line storage should still be accessible without human intervention, but take
longer to access than on-line files, so are best suited to infrequently accessed materials.
Near-line storage capacity can normally be easily increased to meet demand.
On- and near-line storage allows digital files to be quickly supplied to users. Increased
accessibility does increase the risk of unauthorised access to or tampering with digital
files. Bandwidth and security issues need to be considered for on- and near-line storage
to be successfully deployed59.
Off-line storage is often used to store rarely accessed records or backup copies. It is
typically inexpensive, but the cost of accessing files can be very high since the media
requires manual location and loading60. Access is normally delayed while the media is
located and loaded, and the sequential nature of tape storage adds further delay.

59
Frey F. Guides to Quality in Visual Resource Imaging: 5. File Formats for Digital Masters. 2000. Council on Library and Information
Resources. Accessed March 2005 at http://www.rlg.org/visguides/visguide5.html

8.2 Media Types
Magnetic Media
Magnetic media includes magnetic tapes and magnetic disks such as hard drives and
floppy drives61. Magnetic media usually has a lifespan of 10 to 20 years, although the
lifespan of magnetic media can be extended if appropriate storage conditions are used62.
Magnetic media can be used to provide offline, near-line or on-line access to files.
Magnetic tapes have a relatively low cost with large storage capacities of up to several
hundred gigabytes per tape. Tapes are sequential media and as such cannot be
considered as an alternative to random access media such as disks for routine access to
information. Instead magnetic tapes are widely used for long term, offline file storage or
backup63, where their low cost per megabyte is most appropriate. Robotically controlled
tape libraries can make a huge volume of information available near-line.
Magnetic disks have a higher cost than magnetic tape, but with the benefit of providing
faster access to information. Personal computer hard disks and server disk arrays are
examples of magnetic disks. Magnetic disks are generally used for on-line storage;
however, it is becoming commonplace for cheaper, lower performance disks to be used as
an on-line backup or spare in a near-line capacity.
Optical Media
Optical media use lasers to read data from a metallic coating on a disk. Optical media
include Compact Disks (CDs) and Digital Versatile Discs (DVDs). CDs and DVDs may be
read only (e.g. CD-ROM), writeable once (such as CD-Rs) or writeable many times (e.g.
CD-RW, DVD+RW). Optical disks are a common media used for digital file storage,
transportation and publication.
The main advantage that optical media have over magnetic media is that their life
expectancy is more predictable, as their longevity is determined by the properties of the
optical material rather than wear and tear on the media64. Optical media can provide
near-line or off-line storage.
It is not possible to alter or delete information from write once, read many (WORM) optical
media. This provides an assurance that the imaged records that they store have not been
altered or deleted. Conversely, implementing disposal decisions for digitised
records stored on WORM media can be complicated if imaged records with differing
retention periods are stored on the same disk. Because this type of media cannot be
rewritten, it cannot be reused.
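
One way to reduce this complication is to batch imaged records onto WORM discs by disposal date, so that every record on a disc becomes due for destruction at the same time. The following is a minimal Python sketch of that idea only. The record identifiers, the `plan_worm_discs` function and the grouping by year are illustrative assumptions and are not prescribed by this guideline.

```python
from collections import defaultdict


def plan_worm_discs(records):
    """Group records by the year in which their retention period ends.

    `records` is an iterable of (record_id, disposal_year) pairs, a
    hypothetical structure used only for this illustration.
    Returns a dict mapping disposal_year -> sorted list of record ids,
    so that each WORM disc holds records that can be destroyed together.
    """
    discs = defaultdict(list)
    for record_id, disposal_year in records:
        discs[disposal_year].append(record_id)
    return {year: sorted(ids) for year, ids in sorted(discs.items())}


# Example data (invented for illustration).
example_records = [("A-001", 2013), ("B-104", 2031), ("A-002", 2013)]
plan = plan_worm_discs(example_records)
```

In this sketch the two records due in 2013 share a disc and the record due in 2031 is written to a separate one, so destroying the 2013 disc does not affect records that must still be retained.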

60
Creating and Managing Digital Content – Capture Your Collections. 2002. Canadian Heritage Information Network. Accessed March
2005 at http://www.chin.gc.ca/English/Digital_Content/Capture_Collections/maintenance.html
61
The Preservation Management of Digital Material Handbook, Chapter 5: Media and Formats. 2002. Digital Preservation Coalition.
Accessed March 2005 at http://www.dpconline.org/graphics/medfor/media.html
62
Frey F. Guides to Quality in Visual Resource Imaging: 5. File Formats for Digital Masters. 2000. Council on Library and Information
Resources. Accessed March 2005 at http://www.rlg.org/visguides/visguide5.html
63
Electronic Records Management Guidelines: Digital Media. 2004. Minnesota State Archives. Accessed March 2005 at
http://www.mnhs.org/preserve/records/electronicrecords/erdigital.html
64
Frey F. Guides to Quality in Visual Resource Imaging: 5. File Formats for Digital Masters. 2000. Council on Library and Information
Resources. Accessed March 2005 at http://www.rlg.org/visguides/visguide5.html


Optical media is normally a cost effective option for storage. Media may have a unit cost
of only a few cents when purchased in reasonable quantities. However, recording and
retrieving files from optical media can often be slow, and locating the disk that a file is
stored on may complicate access to digitised documents. Optical media work well as a
storage option for small projects. Using optical media in large projects may lead to high
storage and retrieval costs65.
8.3 Media Lifecycle
A range of media is available for storing digital files in on-, near- or off-line capacities.
Choosing an appropriate medium for storage of digital files is important to ensure on-going
accessibility to the file. The rate of media obsolescence, and the reliance on hardware
and software for access to media, mean that careful consideration must be given to the
media used in digitisation projects.
Over time, media types may lose popularity and become difficult to read because the
equipment needed is no longer available. For example, reading Beta video cassettes or
5.25” disks is problematic, because the hardware required is no longer readily available.
Digitised documents will need to be copied to new media if they are to remain accessible.
Generally, the lifetime of hardware and software is shorter than the lifetime of digital
media. A five year timeframe has been suggested for data refreshing (copying of files to
new media)66.
Media Life and Deterioration
Unlike paper documents, digital files cannot be easily
examined to determine if the file is still legible. Most
digital media becomes obsolete or loses information
faster than words produced on paper. Hardware and
software are also required to interpret and display a
digital file so that the file’s legibility can be checked.
Hence the storage of digital files requires on-going,
regular maintenance to ensure that files remain
readable by contemporary hardware and software,
and to ensure that the media the files are stored on
does not decay.
Different media have varying life expectancies. For example, microfilm is expected to have
a shelf life of 500 years, whereas Compact Disc (CD) life may be as short as 2 years.
Recently some manufacturers have released “archive quality” CD and Digital Versatile Disc
(DVD) media which are said to have a shelf life of up to 50 years, but many CDs, DVDs and
tapes lose data within a very short period after their creation (2-30 years67).

Figure 9: Media Life Expectancy.
From http://www.caps-project.org/cache/DigitalMediaLifeExpectancyAndCare.html

65
Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed March 2005 at
http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf
66
Rothenberg, J. Ensuring the Longevity of Digital Information. 1999. Council on Library and Information Resources. Accessed March
2005 at http://www.clir.org/pubs/archives/ensuring.pdf
67
The Preservation Management of Digital Material Handbook, Chapter 5: Media and Formats. 2002. Digital Preservation Coalition.
Accessed March 2005 at http://www.dpconline.org/graphics/medfor/media.html


Often, longevity of storage media is less important than adequate plans to migrate and
refresh files for compatibility on contemporary hardware and software68.
Media Refreshing
To ensure the continued accessibility of digitised records and to prevent information loss, a
testing and re-mastering schedule should also be implemented and a strategy drawn up
for migrating the images and metadata to new media and new formats when necessary.69
Known as refreshing, this may involve copying the contents from one type of technology to
another in order to prevent records from being left on media which can no longer be read.
Alternatively, refreshing may be from one piece of media to another of the same
technology ensuring that pieces of media are replaced before they fail.70 The records
should be verified following the refresh process.
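
Verification after a refresh is commonly done by comparing checksums of the old and new copies. The following Python sketch shows one possible approach using SHA-256 from the standard library. The function names, file names and directory layout are assumptions made for illustration; this guideline does not prescribe a particular verification method.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_file(source, destination):
    """Copy `source` to `destination` and confirm the copy is identical."""
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    expected = sha256_of(source)
    shutil.copy2(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"Verification failed for {destination}")
    return actual


# Demonstration using throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    old_copy = Path(workdir) / "old_media" / "image0001.tif"
    old_copy.parent.mkdir()
    old_copy.write_bytes(b"example image data")
    new_copy = Path(workdir) / "new_media" / "image0001.tif"
    checksum = refresh_file(old_copy, new_copy)
    copy_matches = new_copy.read_bytes() == old_copy.read_bytes()
```

Recording the checksum returned for each file would also allow later integrity checks of the new media without needing the old media.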
Expungement of digitised records
When the retention period of a record has been reached and it is scheduled for
destruction, any digital copies should also be destroyed at the same time as the paper
record. Care should be taken to destroy, overwrite, or carry out secure deletion on
computer storage media and devices used in the storage of records. As discussed above,
there can be some difficulty in achieving this when using WORM media – in this case files
to be retained could be copied onto new media, with the old disc destroyed.
Organisational IT policies, such as system backup, should also be examined to ensure
that digital copies of records are no longer preserved following their destruction.

Recommended Storage Options

If digitisation programs are established to improve access to information, online
storage is the most appropriate option. In addition, disposal of digitised records needs
to occur in an online eDRMS environment.
In other circumstances, digitised documents may be stored using magnetic or optical
media.
Regardless of storage arrangements, media must always be handled appropriately
and stored in environmental conditions recommended by manufacturers.
When choosing storage media, it is important to determine the manufacturer’s life
expectancy for the particular media. Where possible, the highest quality media should
be used. The choice of digital media will be influenced by factors such as media
lifespan, cost and ease of access to files stored using the media.
It is also important to ensure that digital files, particularly master copies, are safe from
tampering so the original image cannot be changed. If the system used to manage
the digitised records does not provide adequate assurance of the authenticity and
integrity of the record, consideration should be given to the use of WORM media, such
as some compact discs, to prevent alteration of the original file. The use of WORM
media must be planned carefully to ensure that retention and disposal decisions can
still be implemented.

68
Western States Digital Imaging Best Practices Version 1.0. 2003. Western States Digital Standards Group. Accessed March 2005 at
http://www.cdpheritage.org/resource/scanning/documents/WSDIBP_v1.pdf
69
Digital Preservation and Storage. 2004. Technical Advisory Service for Images. Accessed March 2005 at
http://www.tasi.ac.uk/advice/delivering/digital.html


Appendix 1: Glossary of Terms and Acronyms

Acronyms
AGLS Australian Government Locator Service
BMP Bitmap
CD Compact Disc
DPI Dots Per Inch
DVD Digital Versatile Disc
GIF Graphics Interchange Format
JFIF JPEG File Interchange Format
JPEG Joint Photographic Experts Group
PCX PC Paintbrush Format
PDF Portable Document Format
PNG Portable Network Graphics
PPI Pixels Per Inch
RAM Random Access Memory
ROM Read Only Memory
TIFF Tagged Image File Format
WORM Write Once Read Many

Glossary
Anti-Aliasing Improves the appearance of grey scale images by adding grey pixels at
the border of black and white areas, smoothing the transition from black
to white. Also used in colour images to smooth transitions between
colours.
Bit Depth The number of bits used to describe the colour of each pixel. Greater bit
depth allows more colours to be used in the colour palette for the image.
Bi-tonal Images containing only black and white pixels. Bi-tonal images are often
used to represent modern, non-illustrated text documents.
Colour Depth The colour or bit depth of an image refers to the number of bits used to
describe the colour of each pixel. Greater bit depth allows more colours to
be displayed in an image. Colour depths can range from 1 bit per pixel
for bi-tonal images to 24 bits per pixel or greater in high quality colour
images.
Continuous Colour An image, such as an original photographic transparency or print, in
which the tones or colours blend smoothly from one to another.
Continuous colour images have a virtually unlimited range of colour or
shades of grey.
Discrete Colour Instances when the colours in an image are separate and distinct.
Discrete colour images do not blend smoothly from one colour to the next
and lack the many shades of colour seen in photographs.

70
VERS Advice 10: System Requirements for Preserving Electronic Records. 2004. Public Record Office Victoria. Accessed March
2005 at http://www.prov.vic.gov.au/vers/standard/advice_10/3-8.htm

Dithering The computer graphics equivalent to printed halftones, this technique
creates the illusion of colour depth in images with a limited colour palette.
This is done by interspersing pixels of different colours over the required
area to give the appearance of a third colour. For example, white and
black pixels allocated over an area will provide a grey appearance to that
area.
Dots per Inch A measure of the resolution of a printer. It refers to the number of dots
the printer is able to place in a linear one-inch space. The more dots per
inch, the higher the resolution and the higher the printing quality.
File format The specific way that data is arranged in a file. Some file formats can be
used by a range of applications (such as text files or some image files)
while others may only be used by a specific application (usually the same
application used to create the file).
Most applications can save documents in one or more standard formats
as well as in their native format (i.e. a document produced in Microsoft
Word can be saved as a Word document, or in rich text format, or in
WordPerfect format). File formats may be proprietary or non-proprietary.
Greyscale Greyscale images use only black, white and a range of shades of grey.
The number of grey shades available depends on the colour depth of the
image.
Half-tone A printed image in which the density and pattern of black and white dots
are varied, giving the appearance of a continuous tone image when
viewed from an appropriate distance. Half-tone images are used
extensively in magazines and newspapers.
Lossless compression The compression of data that guarantees the original data can be
restored exactly. A file that is compressed using a lossless method and
then retrieved is exactly the same as the original, uncompressed file.
Lossy compression The compression of data that may result in some data being changed or
lost. A file that is compressed using a lossy method and then retrieved
may be different from the original file, but is "close enough" to be useful in
some way.
LZW Lempel-Ziv-Welch, a proprietary lossless data-compression algorithm developed
by Abraham Lempel, Jacob Ziv and Terry Welch and used in GIF files. The patent to
the LZW algorithm is owned by Unisys Corporation.
Naming conventions A standardised approach to naming computer files.
Near-line storage Storage of files, normally on magnetic or optical media, so that files can
be accessed if needed. The accessing of files in near-line storage
should not require human intervention, as in the case of off-line storage,
but will usually be slower to access than on-line storage. Robotically
controlled tape libraries and CD/DVD jukeboxes are applications of
near-line storage.
Non-proprietary Refers to a technological design or architecture whose configuration is
available for use by the public. Use of non-proprietary technology is not
restricted by licences or patents. Software is considered non-proprietary
once it is released with a license that would permit others to modify the
software and release their own versions without restrictions.
Non-proprietary technology allows individuals or organisations to copy, modify
and study the technology.
Off-line storage Storage of files, normally on magnetic or optical media, in a manner
where the files are separate from and not directly accessible by the
computer system. Human intervention, such as loading a tape into a tape
drive, is required for the file to be accessed by the computer system.
On-line storage Storage of files, normally on networks or hard disks, so that files are
immediately available to the computer system.
Palette A palette is the set of available colours that may be used to display an
image. Each pixel in the image is assigned a value that relates to a
specific colour in the palette. The number of entries in the palette is the
total number of colours which can appear simultaneously on screen.
Palettised A type of image that is composed of a distinct set of colours from a
palette. Standard palettised images are made up of 16 or 256 colour
palettes.
Pixel The smallest element of a digital image; short for picture element. Pixels
are the many tiny squares that make up the representation of a digital
picture. Usually the squares are so small and so numerous that, when
displayed on a computer monitor or printed, they appear to merge into a
smooth image. Pixels per inch (PPI) is a commonly used measure for
digital images. Each pixel can represent a number of different shades or
colours, depending on how much storage space is allocated for it.
PPI A measure of the resolution of an image. The more pixels per inch, the
finer the resolution. PPI is used to describe the resolution of an image in
a virtual state, or on a monitor. ‘PPI’ is often confused with ‘DPI’, which is
used to describe the resolution of a printing device.
Proprietary A technological design or architecture whose configuration is unavailable
to the public and may not be duplicated without permission from the
designer or architect. Proprietary technology is created for a given
company's purposes. For example, Microsoft Word stores documents in a
proprietary format, namely Microsoft Word format. Proprietary technology
may be legally used only by a person or entity purchasing an explicit
license. Proprietary means "privately owned and controlled", and hence
software can remain proprietary even when source code is made publicly
available, if control over use, distribution, or modification is retained.
Raster A category of digital still images. Raster images are the most common
images created and used within digitisation projects. Raster images take
the form of a grid or matrix of pixels. Each pixel has a defined value that
precisely identifies its specific colour, size and place within the image.
Examples of raster image file formats are TIFF, GIF and JPEG. The
other category of digital still images is vector images.
Resolution Resolution is the amount of picture data in a specific area of an image.
Resolution is usually measured in pixels per inch (PPI). The higher the
resolution, the sharper and clearer an image will be.
Vector A category of digital still images. Vector images are defined by
mathematical equations and are used for drawing and diagrams that can
be constructed from points, lines and area shapes. Vector images are
resolution independent, meaning they can be scaled up to large sizes with
no loss of quality. Examples of vector file formats are CAD drawings,
Corel Draw files, and SVG files. The other category of digital still images
is raster images.
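
Two of the glossary definitions above, bit depth and lossless compression, can be illustrated with a few lines of Python. This is a sketch for illustration only. It uses the `zlib` module from the Python standard library, which implements DEFLATE rather than the LZW algorithm described in the glossary, and the sample data is invented.

```python
import zlib


def colours_for_bit_depth(bits):
    """Number of distinct values one pixel can take at a given bit depth."""
    return 2 ** bits


# A repetitive run of bytes standing in for uncompressed image data
# (invented sample data, not a real image).
original = bytes([0, 0, 0, 255, 255, 255] * 100)

compressed = zlib.compress(original)    # lossless (DEFLATE, not LZW)
restored = zlib.decompress(compressed)  # exactly the original bytes
```

Under these definitions a 1-bit image has 2 possible values per pixel, an 8-bit image 256, and a 24-bit image 16,777,216. Because the compression is lossless, `restored` is byte-for-byte identical to `original`.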


Appendix 2: Scanner Types


Flatbed scanner – Provides a flat glass area for scanning which
allows for thick documents or books to be scanned. Flatbed
scanners are at the entry level end of the market. Some models
have sheet-feeders, to increase the throughput of single sheets,
and transparent media adaptors, for scanning slides or
negatives, available as accessories.

Sheet-fed scanner – Dedicated to scanning separated pages, typically at a much higher
rate than a flatbed scanner with a sheet feeder attached. This type of scanner is designed
for a high paper throughput and most models can scan both sides of the page. Because
pages are fed through the scanner, it is not suited to scanning fragile or outsized
documents or books.

Slide scanner – Designed specifically for digitising transparent materials such as slides
and negatives. This scanner type typically provides a higher throughput and improved
quality over a flatbed scanner with a transparent media adaptor, particularly for scanning
high volumes of slides and negatives.

Drum scanner – Used in graphic design and in publication, these expensive scanners use
different technology from the other scanner types described here to produce a higher
quality image. The page being scanned is attached to a high speed rotating drum which
makes this type of scanner unsuitable for scanning fragile
documents or large volumes of records.

Wide-Format scanner – Used for scanning maps, plans and other paper documents that are
larger than typical office documents. While standard sized documents are usually
accommodated, these scanners are usually manually fed, meaning that they are not suited
for high volume scanning of standard office documents.

Overhead scanner – Also known as a book eye scanner, this type of device captures the
reader’s eye view of a book or document. These are often used for capturing documents of
historical or cultural significance that are not able to be laid face down on a flatbed
scanner or fed through a sheet-fed scanner. Overhead scanners are not suited for high
volume scanning.

Digital Camera – It is possible for a standard digital camera to capture a digital copy of a
paper record. Digital cameras should be used in macro mode for photographing objects
that are close to the camera. A stable mount should be used to ensure the camera is steady
enough to accurately capture the object. It should be noted that photographic effects,
such as barrel distortion and fall-off, which affect the edges of objects captured by a
camera lens at close range, will be present in records captured using a digital camera.


Appendix 3: Table of Technical Recommendations


Document type                  Resolution [1]         Bit Depth               File Format
-----------------------------  ---------------------  ----------------------  ------------------
Text document with only one    No less than 200 PPI   1-bit (bi-tonal)        TIFF G3/G4
colour of text                                                                PNG

Document with watermarks,      No less than 200 PPI   4-bit or 8-bit grey     PNG
grey shading, grey graphics,                                                  GIF [3]
etc                                                                           TIFF (LZW) [2, 3]

Document with discrete colour  No less than 200 PPI   4-bit or 8-bit colour   PNG
used in text or diagrams,                                                     GIF [3]
etc                                                                           TIFF (LZW) [2, 3]

Black and white photographs    9” x 6” - 300 PPI      8-bit grey              PNG
                               7” x 5” - 430 PPI                              JPEG [4]
                               6” x 4” - 600 PPI                              TIFF (JPEG) [2, 4]

Colour photographs             9” x 6” - 300 PPI      24-bit colour           PNG
                               7” x 5” - 430 PPI                              JPEG [4]
                               6” x 4” - 600 PPI                              TIFF (JPEG) [2, 4]

Numbers in square brackets refer to the notes below.

Notes:
1. Resolution may be reduced for images only used for on-screen viewing and should be
increased for documents that require enlargement for use. For documents larger than
A3, a resolution of 200PPI is generally accepted to provide a reasonable file size. The
clarity of fine line work and small text at this reduced resolution should be assessed.
2. For storing multi-page documents as a single file, TIFF may be considered as an
alternative.
3. The ability of software to manage any licensing required for LZW compression should
be checked.
4. The compression ratio used for JPEG compressed images should not exceed 10:1.
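
The recommendations above can be translated into approximate pixel dimensions and uncompressed file sizes. The following Python sketch is illustrative only: the function names are hypothetical, and the calculation ignores file headers, metadata and any row padding that a particular file format may add.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Pixel width and height of a scan of the given size (in inches) at `ppi`."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in, height_in, ppi, bit_depth):
    """Approximate size in bytes of the raw image data, before compression."""
    width_px, height_px = pixel_dimensions(width_in, height_in, ppi)
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8  # round up to whole bytes


# A 9" x 6" colour photograph at 300 PPI and 24-bit colour.
photo_px = pixel_dimensions(9, 6, 300)
photo_raw = uncompressed_bytes(9, 6, 300, 24)

# Approximate size at the 10:1 JPEG compression limit given in note 4.
photo_jpeg_at_limit = photo_raw // 10
```

Under these assumptions the photograph is 2700 x 1800 pixels and about 14.6 million bytes uncompressed, or roughly 1.5 million bytes at 10:1 compression. The three photograph sizes in the table yield long edges of 2700, 3010 and 3600 pixels respectively.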


Appendix 4: Related Standards


Queensland Information Standards
Available from http://www.governmentict.qld.gov.au/02_infostand/infostand.htm
Information Standard 18: Information Security
Information Standard 31: Retention and Disposal of Public Records
Information Standard 34: Metadata
Information Standard 40: Recordkeeping
Information Standard 41: Managing Technology Dependent Records
File Formats
ISO 12639:2004: Graphic technology – Prepress digital data exchange – Tag image file
format for image technology (TIFF/IT)
International Organization for Standardization
ANSI/AIIM MS53-1993: Standard Recommended Practice - File Format for Storage and
Exchange of Images - Bi-Level Image File Format: Part 1
American National Standards Institute
AS ISO/IEC 15444.2-2004: Information technology - JPEG 2000 image coding system –
Extensions
Standards Australia
ISO/IEC 15948:2004: Information technology -- Computer graphics and image processing
-- Portable Network Graphics (PNG): Functional specification
International Organization for Standardization
ISO/DIS 19005-1: Document management -- Electronic document file format for long-term
preservation -- Part 1: Use of PDF 1.4 (PDF/A-1)
International Organization for Standardization
Image Management
ISO 10196:2003: Document imaging applications -- Recommendations for the creation of
original documents
International Organization for Standardization
ISO/TS 12029(ATS 5084-2003): Electronic Imaging – Forms design for optimisation for
electronic image management
International Organization for Standardization
ISO/TS 12033:2001(ATS 5083-2003): Electronic imaging -- Guidance for selection of
document image compression methods
Standards Australia
ISO/TR 14105(HB 177-2003): Electronic imaging - Human and organizational issues for
successful Electronic Image Management (EIM) implementation
International Organization for Standardization
Imaging
ISO 12651-1999: Electronic imaging – Vocabulary
Standards Australia
JIS Z 6016:2003: Electronic imaging process of paper documents and microfilmed
documents
Japanese Standards Association
ANSI/AIIM TR26-1993: Resolution as it Relates to Photographic and Electronic Imaging
American National Standards Institute
ANSI/AIIM TR2-1998: Glossary of Document Technologies
American National Standards Institute
Metadata
ISO/TS 23081-1:2004: Information and documentation - Records management processes
- Metadata for records -- Part 1: Principles
International Organization for Standardization
ISO 15836:2003: Information and documentation - The Dublin Core metadata element set
International Organization for Standardization
AS 5044.1-2002: AGLS Metadata element set - Part 1: Reference description
Standards Australia
AS 5044.2-2002: AGLS Metadata element set – Usage guide
Standards Australia
Records Management
AS ISO 15489.1-2002: Records management - General
Standards Australia
AS ISO 15489.2-2002: Records management – Guidelines
Standards Australia
AS ISO 23081.1-2004: Information and documentation - Records management processes
- Metadata for records – Principles
Standards Australia
Scanners
ISO 12653-1:2000: Electronic imaging -- Test target for the black-and-white scanning of
office documents -- Part 1: Characteristics
International Organization for Standardization
ISO 12653-2:2000: Electronic imaging -- Test target for the black-and-white scanning of
office documents -- Part 2: Method of use
International Organization for Standardization
ISO/IEC 14473:1999: Information technology -- Office equipment -- Minimum information
to be specified for image scanners
International Organization for Standardization
ISO 16067-1:2003: Photography -- Spatial resolution measurements of electronic
scanners for photographic images -- Part 1: Scanners for reflective media
International Organization for Standardization
ISO 16067-2:2004: Photography - Electronic scanners for photographic images - Spatial
resolution measurements -- Part 2: Film scanners
International Organization for Standardization
ANSI/AIIM MS44-1988: Recommended Practice for Quality Control of Image Scanners
American National Standards Institute
ANSI/AIIM MS52-1991: Recommended Practice for the Requirements and Characteristics
of Original Documents Intended for Optical Scanning
American National Standards Institute
Storage
ISO/TR 15801:2004: Electronic imaging -- Information stored electronically --
Recommendations for trustworthiness and reliability
International Organization for Standardization
ISO/TR 12037:1998(HB 179-2003): Electronic imaging -- Recommendations for the
expungement of information recorded on write-once optical media
Standards Australia
ISO/TR 12654:1997(HB 178-2003): Electronic imaging - Recommendations for the
management of electronic recording systems for the recording of documents that may be
required as evidence, on WORM optical disk
International Organization for Standardization
ISO 18927:2002: Imaging materials -- Recordable compact disc systems -- Method for
estimating the life expectancy based on the effects of temperature and relative humidity
International Organization for Standardization
ANSI/AIIM MS59-1996: Media Error Monitoring and Reporting Techniques for Verification
of Stored Data on Optical Digital Data Disks
American National Standards Institute
ISO 22028-1:2004: Photography and graphic technology -- Extended colour encodings for
digital image storage, manipulation and interchange -- Part 1: Architecture and
requirements
International Organization for Standardization
ANSI/AIIM TR25-1995: The use of Optical Disks for Public Records
American National Standards Institute
There are a total of 80 standards covering Optical Media and a further 36 on the topic of
Data Storage available from Standards Australia


Appendix 5: Reference List


Queensland State Archives Documents
Digitisation Disposal Policy: Policy on the authorisation of the early disposal of original
paper records after digitisation. 2006. Queensland State Archives. Accessed April 2006 at
www.archives.qld.gov.au/government/ddp.asp.
Glossary of Archival and Recordkeeping Terms. 2004. Queensland State Archives.
Accessed March 2005 at
http://www.archives.qld.gov.au/downloads/GlossaryOfArchivalRKTerms.pdf
Public Records Alert No 1/05: Day batching of records. 2005. Queensland State Archives.
Accessed March 2005 at
http://www.archives.qld.gov.au/publications/PublicRecordsAlert/PRA105.pdf
Public Records Alert No 2/05: Understanding and applying recordkeeping metadata. 2005.
Queensland State Archives. Accessed March 2005 at
http://www.archives.qld.gov.au/publications/PublicRecordsAlert/PRA205.pdf

Other Documents
Adobe PDF. 2005. Adobe Systems Inc. Accessed March 2005 at
http://www.adobe.com/products/acrobat/adobepdf.html
Brown A. Digital Preservation Guidance Note 1: Selecting File Formats for Long-Term
Preservation. 2003. National Archives (UK). Accessed March 2005 at
http://www.nationalarchives.gov.uk/preservation/advice/pdf/selecting_file_formats.pdf
Cunningham A. Metadata Standards in Australia – An Overview. 2005. Presentation at
Queensland State Archives March 2005. National Archives of Australia.
Creating and Managing Digital Content. 2002. Canadian Heritage Information Network.
Accessed March 2005 at http://www.chin.gc.ca/English/Digital_Content
Data Dictionary—Technical Metadata for Digital Still Images. 2003. National Information
Standards Organization and AIIM International. Accessed March 2005 at
http://www.niso.org/standards/resources/Z39_87_trial_use.pdf
Digital Imaging for Archival Preservation and Online Presentation: Best Practices. 2001.
Michigan State University. Accessed March 2004 at
http://www.historicalvoices.org/papers/image_digitization2.pdf
Digital Preservation and Storage. 2004. Technical Advisory Service for Images. Accessed
March 2005 at http://www.tasi.ac.uk/advice/delivering/digital.html
Digital Standard 1 – Cataloguing and Metadata for Digital Images. 2003. State Library of
Queensland. Accessed March 2005 at
http://www.slq.qld.gov.au/__data/assets/file/5449/sd1_meta_v1.2.doc
Digital Standard 2 – Digital capture, format & preservation. 2003. State Library of
Queensland Accessed March 2005 at
http://www.slq.qld.gov.au/__data/assets/word_doc/32645/sd2_current.doc.
The DIRKS Manual: A Strategic Approach to Managing Business Information. 2003.
National Archives of Australia. Accessed December 2005 at
http://www.naa.gov.au/recordkeeping/dirks/dirksman/dirks.html.

Electronic Records Management Guidelines. 2004. Minnesota State Archives. Accessed
March 2005 at
http://www.mnhs.org/preserve/records/electronicrecords/erguidelinestoc.html
File Formats and Compression. 2004. Technical Advisory Service for Images. Accessed
March 2005 at http://www.tasi.ac.uk/advice/creating/fformat.html#ff2
Frey F. Guides to Quality in Visual Resource Imaging. 2000. Council on Library and
Information Resources. Accessed March 2005 at http://lyra2.rlg.org/visguides/
General Guidelines for Scanning. 1999. Colorado Digitization Project. Accessed March
2005 at
http://chnm.gmu.edu/digitalhistory/links/cached/chapter3/link3.45.CDPscanningguidelines.html
Gillespie J., Fair P., Lawrence A., Vaile D. Coping when Everything is Digital? Digital
Documents and Issues in Document Retention – White Paper. 2004. Baker & McKenzie
Cyberspace Law and Policy Centre, University of New South Wales. Accessed March
2005 at http://www.bakercyberlawcentre.org/ddr/
Guidelines for management, appraisal and preservation of electronic records. 1999. Public
Record Office, The National Archives (UK). Accessed March 2005 at
http://www.nationalarchives.gov.uk/electronicrecords/advice/pdf/procedures2.pdf
Hilton D. & Warr P. Unlocking Queensland’s Picture Heritage – Picture Queensland Digital
Imaging Workshop Course Notes. 2004. State Library of Queensland.
Horton S. Web Style Guide 2nd Edition: PNG Graphics. 2004. Lynch and Horton. Accessed
March 2004 at http://www.webstyleguide.com/graphics/pngs.html
How To Fix Bad Scans. 2004. Dixie State College of Utah. Accessed March 2005 at
http://cit.dixie.edu/vt/vt2600/bad_scans.asp
Imaging Best Practices. 2003. University of California, Berkeley. Accessed March 2005 at
http://www.lib.berkeley.edu/digicoll/bestpractices/image_bp.html
JPEG Image Coding Standard. 1998. Centre for Telecommunications and Information
Engineering, Monash University. Accessed March 2005 at
http://www.ctie.monash.edu.au/EMERGE/multimedia/JPEG/COMM03.HTM
Leurs L. The TIFF file format. 2001. Laurens Leurs. Accessed March 2005 at
http://www.prepressure.com/formats/tiff/fileformat.htm
Ling T. Taking it to the streets: why the National Archives of Australia embraced
digitisation on demand. 2002. National Archives of Australia. Accessed March 2005 at
http://www.naa.gov.au/Publications/corporate_publications/digitising_TLing.pdf
LZW Patent Information. 2005. Unisys Corporation. Accessed March 2005 at
http://www.unisys.com/about__unisys/lzw/
Management of Electronic Records PROS 99/007 (Version 2). 2004. Public Record Office
Victoria. Accessed March 2005 at http://www.prov.vic.gov.au/vers/standard/standard
Manuscript Digitization Demonstration Project. 1998. Library of Congress.
Mendham S. JPEG 2000. 2005. IDG Communications. Accessed March 2005 at
http://www.pcworld.idg.com.au/index.php/id;1170029196;fp;2;fpid;1585691688
Metadata and Digital Images. 2004. Technical Advisory Service for Images. Accessed
March 2005 at http://tasi.ac.uk/advice/delivering/metadata.html

Moving Theory into Practice: Digital Imaging Tutorial. 2003. Cornell University
Library/Research Department. Accessed March 2005 at
http://www.library.cornell.edu/preservation/tutorial/quality/quality-01.html
Moving Theory into Practice: Digital Imaging Tutorial. 2003. Cornell University
Library/Research Department. Accessed March 2005 at
http://www.library.cornell.edu/preservation/tutorial/quality/quality-02.html
PNG (Portable Network Graphics). 2004. World Wide Web Consortium. Accessed March
2005 at http://www.w3.org/Graphics/PNG/
The Preservation Management of Digital Material Handbook, Chapter 5: Media and
Formats. 2002. Digital Preservation Coalition. Accessed March 2005 at
http://www.dpconline.org/graphics/medfor/media.html
Quality Assurance. 2004. Technical Advisory Service for Images. Accessed March 2005 at
http://www.tasi.ac.uk/advice/creating/quality.html
Recordkeeping in Brief No. 11: Digital Imaging and Recordkeeping. 2003. State Records
New South Wales. Accessed March 2005 at
www.records.nsw.gov.au/publicsector/rk/rib/rib11.htm
Revised Digital Imaging Guidelines for State of Ohio Executive Agencies and Local
Governments. 2003. Ohio Electronic Records Committee. Accessed March 2004 at
http://www.ohiojunction.net/erc/imagingrevision/revisedimaging2003.html
Roelofs G. Multiple-image Network Graphics. 2005. Greg Roelofs. Accessed March 2005
at http://www.libpng.org/pub/mng
Rothenberg J. Ensuring the Longevity of Digital Information. 1999. Council on Library and
Information Resources. Accessed March 2005 at
http://www.clir.org/PUBS/archives/ensuring.pdf
Scanning Tips and Techniques. 1999. Jasc Software Inc. Accessed October 2004 at
http://www.jasc.com/tutorials/scantip.asp
Sharma A. Digital Noise, Film Grain. 2001. Digital Photo Techniques. Accessed March
2005 at http://www.phototechmag.com/sample/sharma.pdf
Suggested Technical Metadata Elements. 2004. Indiana Digital Library. Accessed March
2005 at www.statelib.lib.in.us/www/isl/diglibin/techmeta.pdf
Tanner S. From Vision to Implementation – strategic and management issues for digital
collections. 2000. The Electronic Library – strategic, policy and management issues
seminar. Accessed March 2005 at http://heds.herts.ac.uk/resources/papers/Lboro2000.pdf
Technical Guidelines for Digitizing Archival Materials for Electronic Access. 2004. National
Archives and Records Administration (US). Accessed March 2005 at
http://www.archives.gov/research_room/arc/arc_info/techguide_raster_june2004.pdf
Technical Recommendations for Digital Imaging Projects. 1997. Image Quality Working
Group of ArchivesCom. Accessed March 2005 at
http://www.columbia.edu/acis/dl/imagespec.html
Thornely J. The How of Metadata: Metadata Creation and Standards. 1999. 13th National
Cataloguing Conference, October 1999. Accessed March 2005 at
http://www.slq.qld.gov.au/__data/assets/file/6289/How_of_Metadata.doc
TIFF Revision 6.0. 1992. Adobe Systems Inc. Accessed March 2005 at
http://partners.adobe.com/asn/developer/pdfs/tn/TIFF6.pdf


2003. Western States Digital Standards Group. Accessed March 2005 at
http://www.cdpheritage.org/digital/scanning/documents/WSDIBP_v1.pdf
