
UNIT-I

FUNDAMENTALS OF IMAGE PROCESSING

I. Digital Image and Digital Image Processing:


An image/picture/photo is a representation of some form of visual scene. A
black and white photograph records the brightness of each point in the scene,
whereas a color photograph records each point's color as well as its brightness. In
conventional photography the recording mechanism is chemical, whether we use
black-and-white or color film.
To a computer, however, an image/picture/photograph is a large array of
numbers. Such an array is termed a "Digital Image". "Digital Image Processing"
is the application of various forms of arithmetic to this array of numbers,
which is the computer's representation of the image.

II. Different types of Digital Images:


The different digital images are
1. Binary Images
2. Monochrome Images/Grey Images/Intensity Images
3. Color Images
1. Binary Images:
In binary images the array of values consists of only 0's and 1's, so a
binary image can encode only two levels, namely bright and dark.
Binary images encode the least imaginable amount of information in each
pixel, so it is common to represent binary images using just 1 bit of
information per pixel. In each byte of storage we can therefore store 8 pixels of the
image. By convention the pixels are grouped along the rows of the image:
the first byte holds the first 8 pixels of the first row of the image, the
second byte holds the next 8 pixels of that row, and so on. For
practical reasons, if the number of pixels per image row is not a multiple of
8, the remaining spare bits in the last byte are not used to start the next row;
each new row of the image starts with a new byte.
1 1 1 0 0 1 1
0 0 1 0 0 1 1
Ex:  
1 1 1 0 0 1 1
 
1 1 1 0 1 1 1
2. Monochrome Images/Grey Images/Intensity Images:
A gray-scale image records the brightness of each pixel. In a gray-scale image
there is some range of brightness values, with some minimum value, say 15,
and some maximum value, say 250. Generally the minimum value is the darkest and the
maximum value is the brightest pixel of the image. If each pixel is
represented by 1 byte (8 bits), then the range for each pixel is from 0
to 255 (0 is dark and 255 is the brightest).
2 4 6 1 7
5 2 7 6 8 
Ex: 
1 5 7 8 4
 
2 5 4 1 6

3. Color Images
A color image records both the brightness and the color of each pixel, i.e. for
each pixel in the image it records the red content, the green content and the
blue content.
The way the data is stored depends on the data structure that we
define. For example, in JPEG images it is stored one way and in BMP it might be stored some
other way. The main point here is that three color brightnesses (i.e.
Red, Green, Blue) are recorded per pixel.
The data structure can be in any of the following:

5 5 5 5 5 3 3 3 3 3 7 7 7 7 7
6 7 7 7 7  3 3 6 6 6  7 7 2 2 2
  
7 8 8 8 8 6 6 6 7 7 2 2 2 2 1
     
9 9 9 9 9 7 7 7 9 9 1 1 1 1 6
9 9 9 9 9  9 9 9 9 9  6 6 6 6 6 

Red values Green values Blue values

OR

R G B R G B …………..
5 5 5 5 5 3 3 3 3 3 7 7 7 7 7
6 7 7 7 7 3 3 6 6 6 7 7 2 2 2

7 8 8 8 8 6 6 6 7 7 2 2 2 2 1
 
9 9 9 9 9 7 7 7 9 9 1 1 1 1 6
9 9 9 9 9 9 9 9 9 9 6 6 6 6 6 

So we can say that if we scan an image and save it as a monochrome
image that takes 100KB of memory, it will occupy about 300KB if we save it as a
color image, roughly 100KB for each of R, G and B.
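As a small sketch of accessing the two layouts above (the array names and the 3-bytes-per-pixel interleaving are illustrative assumptions, not a fixed file format):

/* Layout 1 (planar): three separate width*height arrays, one per channel. */
unsigned char red_planar(const unsigned char *r_plane, int width, int x, int y)
{
    return r_plane[y * width + x];
}

/* Layout 2 (interleaved): one array holding R G B R G B ..., 3 bytes per pixel. */
unsigned char red_interleaved(const unsigned char *rgb, int width, int x, int y)
{
    return rgb[(y * width + x) * 3 + 0];   /* +1 for green, +2 for blue */
}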

III. Different Image examples:

IV. Image Processing Applications:

Image Processing techniques are used in a number of fields like


i). Space applications
ii). Medical Imaging
iii). Earth resource observation (where is water, mountains, forest, etc.)
iv). Astronomy
v). Geography
vi). Archeology (study of ancient civilizations by scientific analysis of
physical remains found in the ground).
vii). Biology
viii). Law enforcement
ix). Defense
x). Industrial applications
xi). Finger print matching
xii). Document reading.

Space applications include acquiring images of the moon and other celestial bodies and
applying processing algorithms to them. The first image of the moon was taken by "Ranger
7" on July 31, 1964.
Computerized axial tomography (CAT), also called computerized
tomography (CT), is one of the most important events in the application of image
processing to medical diagnosis; it was invented in the early 1970s.
Computer procedures are used to enhance the contrast, and can code the
intensity levels into color. Geography uses the same or similar techniques to study
pollution patterns from aerial and satellite imagery.
In Archeology, image-processing methods have successfully restored
blurred pictures that were the only available records of rare artifacts lost or
damaged after being photographed. (‘Ayodhya’ Digging).

V. Some definitions:

Pixel / Picture Element / Pel: The smallest individual part of an image, which
can be assigned a single brightness or color, is known as “Pixel”.

Image Coordinates: By convention, digital images have their starting position or
origin at the top left corner. Position is specified by an x-coordinate, which
increases from 0 as we travel from left to right, and a y-coordinate, which
increases from 0 as we travel from top to bottom.

Contrast: A measure of the variation in brightness between the lightest and
darkest portions of a given image is known as the Contrast of that image. An
image with a maximum pixel value of 100 and a minimum pixel value of 50 has less
contrast than one with a maximum value of 200 and a minimum value of 42.

Resolution: This may refer to the screen size, i.e. the number of horizontal and
vertical pixels, or to the number of bits that are used to represent each pixel, giving
the number of gray levels or color levels.
(OR)

A measure of how accurately a sampled image represents the actual scene.
Resolution can be used as a spatial measure or as an indicator of how faithfully
brightness or color is represented.

VI. Fundamental steps in Image Processing:

The processing methods of an image are broadly categorized into two.


i) The methods whose I/P and O/P are images.
ii) The methods whose I/P may be images but whose outputs are attributes
extracted from the images.

This organization is summarized in Fig.3. Note that the diagram does not imply
that every process is applied to every image. Here, the intention is to convey an idea
of all the methodologies that can be applied to images for different purposes,
possibly with different objectives.

i) Image acquisition: is the first process shown in the figure.

ii) Image enhancement is one of the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail
that is obscured, or simply to highlight certain features of interest in an image.
Ex:- increasing the contrast of an image, "so that it looks better".

iii) Image restoration also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective,
in the sense that restoration techniques tend to be based on mathematical or
probabilistic models of image degradation. Enhancement, on the other hand, is
based on human subjective preferences regarding what constitutes a good
enhancement result.

iv) Color Image Processing is an area that has been gaining importance
because of the significant increase in the use of digital images over the internet. There
are a number of color standards (models) such as RGB, CMY, HSI, HLS and so on.

v) Wavelets are the foundation for representing images at various degrees of
resolution. Although the Fourier transform has been the mainstay of transform-
based image processing since the late 1950s, a more recent transformation, called
the "Wavelet Transform", is now making it even easier to compress, transmit and
analyze many images. Unlike the Fourier transform, whose basis functions are
sinusoids, wavelet transforms are based on small waves, called "wavelets", of
varying frequency and limited duration.

vi) Compression: Reducing the storage required to save an image, or the bandwidth
required to transmit it, is known as image compression. For example, an image saved
with the extension .bmp occupies more memory than the same image saved with the
extension .jpg. In the images of Fig.4, the first one (a bmp file) occupies 800KB
whereas the second one (a jpg file) occupies only 28KB, yet there is not much
difference in clarity.
vii) Morphological Processing: deals with tools for extracting image components
that are useful in the representation and description of shape. This is the stage at which
the output of the processing changes from an image to image attributes.

viii). Segmentation procedures partition an image into its constituent parts or
objects. In general, automatic segmentation is one of the most difficult tasks in
D.I.P. A rugged segmentation procedure brings the process a long way towards the
successful solution of imaging problems that require objects to be identified
individually.

ix). Representation and Description: almost always follow the output of a
segmentation stage, which usually is raw pixel data, constituting either the
boundary of a region (i.e. the set of pixels separating one image region from
another) or all the points in the region itself. Boundary representation is appropriate
when the focus is on external shape characteristics, such as corners and
inflections, whereas regional representation is appropriate when the focus is on
internal properties such as texture or skeletal shape.
Description, also called 'feature selection', deals with extracting attributes
that result in some quantitative information of interest or that are basic for
differentiating one class of objects from another.

x). Recognition is the process that assigns a label (ex: a car, a scooter) to an object
based on its description.

Knowledge Base: Knowledge about the problem domain is coded into an image
processing system in the form of a knowledge database.

Fig.4: bmp image and jpeg image; emf image and jpeg image.

VII. Components of an image processing system:

Fig. 6

The elements of a general-purpose system capable of performing image
processing operations are shown in Fig.6. This type of system generally performs
image acquisition, image storage, image processing, communication and
image display.

Image Acquisition:
Two elements are required to acquire digital images. The first is a physical
device that is sensitive to a band in the electromagnetic energy spectrum (such as
x-ray, ultraviolet, visible or infrared bands) and that produces an electrical signal
output proportional to the level of energy scanned. The second, called a digitizer,
is a device for converting the electrical output of the physical sensing device into
digital form.
The different image acquisition equipment includes video cameras, scanners, etc., as
shown in Fig.2. Although the various devices used for image acquisition vary
greatly in precision, speed and cost, many of the principles on which they operate
are common to them all. Fig.7 illustrates the general arrangement of a digitization
system, which is shown below.

At the heart of any image recording device is an optical system, which
brings an image of the scene or photograph to be digitized into sharp focus on a
sensor. The sensor converts the light falling on it into a small voltage or current.
sensor. The sensor converts the light falling on it into a small voltage or current.
The sensor output is buffered and amplified and if necessary converted into a
voltage suitable for sampling by the Signal Separator. This voltage is converted
into a numeric value by an analog to digital converter (ADC). The timing and
address logic ensures that the incoming signal is sampled at the appropriate time
and that the resulting value is stored at the correct position in the array that
represents the image.

Storage:
An 8-bit image of size 1024X1024 pixels requires one million bytes of
storage. Thus, providing adequate storage is usually a challenge in the design of
image processing system. Digital storage for image processing applications falls

into three categories: i. Short-term storage for use during processing. ii. Online
storage for relatively fast recall, and iii. Archival storage, characterized by
infrequent access. Storage is measured in bytes (8-bits), Kbytes, MB,GB and
TB(Tera bytes).
One method of providing short-term storage is computer memory. Another
is by specialized boards, called frame buffers, that store one or more images and
can be accessed rapidly, usually at video rates (30 complete images per second).
On-line storage generally takes the form of magnetic disks. Finally, archival storage is
characterized by massive storage requirements, but infrequent need for access.
Magnetic tapes and optical disks are the usual media for archival applications.

Processing:
Processing of digital images involves procedures that are usually expressed
in algorithmic form. Thus, with the exception of image acquisition and display,
most image processing functions can be implemented in software. The only reason
for specialized image processing hardware is the need for speed in some
applications or to overcome some fundamental computer limitations.
Although large-scale image processing systems are still being sold for
massive imaging applications, such as processing of satellite images, the trend
continues toward miniaturizing and merging general purpose small computers
equipped with image processing hardware.

Communication:
Communication in digital image processing primarily involves local
communication between image processing systems and remote communication
from one point to another, typically in connection with the transmission of image
data. Hardware and software for local communication are readily available for
most computers. Most books on computer networks clearly explain standard
communication protocols.
Communication across vast distances presents a more serious challenge if
the intent is to communicate image data rather than abstracted results. As should
be evident by now, digital images contain a significant amount of data. A voice-
grade telephone line can transmit at a maximum rate of 9,600-bit/sec. Thus to
transmit a 512 X 512, 8-bit image at this rate would require nearly five minutes.
Wireless links using intermediate stations, such as satellites, are much faster, but
they also cost considerably more. The point is that transmission of entire images
over long distances is far from trivial.
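As a rough check of that figure (assuming, as older modem links did, about 10 transmitted bits per 8-bit pixel to allow for start and stop bits): 512 x 512 x 10 = 2,621,440 bits, and 2,621,440 / 9,600 ≈ 273 seconds, i.e. roughly four and a half minutes.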

Display:
Monochrome and color TV monitors are the principal display devices used
in modern image processing systems. Printing image display devices are useful
primarily for low-resolution image processing work. One simple approach for
generating gray-tone images directly on paper is to use the overstrike capability of
a standard line printer. The gray level of any point in the printout can be controlled
by the number and density of the characters overprinted at that point.

VIII). Different types of sensors to acquire an image:

The types of images in which we are interested are generated by the
combination of an "illumination" source and the reflection or absorption of energy
from that source by the elements of the "scene" being imaged.
Figure 8 shows the three principal sensor arrangements used to transform
illumination energy into digital images. The idea is simple: incoming energy is
transformed into a voltage by the combination of input electrical power and sensor
material that is responsive to the particular type of energy being detected. The
output voltage waveform is the response of the sensor(s), and a digital quantity is
obtained from each sensor by digitizing its response.

The three different types of sensors are


1. Single imaging sensor
2. Line sensor
3. Array sensor

1. Image Acquisition using a Single Sensor:


Figure 8(a) shows the components of a single sensor. Perhaps the most
familiar sensor of this type is the photodiode, which is constructed of silicon
materials and whose output voltage waveform is proportional to light. The use of a
filter in front of a sensor improves selectivity. For example, a green (pass) filter in
front of a light sensor favors light in the green band of the color spectrum. As a
consequence, the sensor output will be stronger for green light than for other
components in the visible spectrum.

In order to generate a 2-D image using a single sensor, there has to be
relative displacements in both the x- and y- directions between the sensor and the
area to be imaged. Figure9 shows an arrangement used in high-precision scanning,
where a film negative is mounted onto a drum whose mechanical rotation provides
displacement in one dimension. The single sensor is mounted on a lead screw that
provides motion in the perpendicular direction. Since mechanical motion can be
controlled with high precision, this method is an inexpensive (but slow) way to
obtain high-resolution images. Other similar mechanical arrangements use a flat
bed, with the sensor moving in two linear directions. These types of mechanical
digitizers sometimes are referred to as microdensitometers.

2. Image Acquisition using Sensor Strips:


A geometry that is used much more frequently than single sensors consists
of an in-line arrangement of sensors in the form of a sensor strip, as Fig8 (b)
shows. The strip provides imaging elements in one direction. Motion perpendicular
to the strip provides imaging elements in the other direction, as shown in Fig.
10(a). This is the type of arrangement used in most flat bed scanners. Sensing
devices with 4000 or more in-line sensors are possible. In-line sensors are used
routinely in airborne imaging applications, in which the imaging system is
mounted on an aircraft that flies at a constant altitude and speed over the
geographical area to be imaged. One-dimensional imaging sensor strips that
respond to various bands of the electromagnetic spectrum are mounted
perpendicular to the direction of flight. The imaging strip gives one line of an
image at a time, and the motion of the strip completes the other dimension of a
two-dimensional image. Lenses or other focusing schemes are used to project the
area to be scanned onto the sensors.

Sensor strips mounted in a ring configuration are used in medical and
industrial imaging to obtain cross-sectional ("slice") images of 3-D objects, as
Fig.10(b) shows. A rotating X-ray source provides illumination, and the portion
of the sensors opposite the source collects the X-ray energy that passes through the
object (the sensors obviously have to be sensitive to X-ray energy). This is the
basis for medical and industrial computerized axial tomography (CAT) imaging, as
indicated earlier. It is important to note that the output of
the sensors must be processed by reconstruction algorithms whose objective is to
transform the sensed data into meaningful cross-sectional images. In other words,
images are not obtained directly from the sensors by motion alone; they require
extensive processing. A 3-D digital volume consisting of stacked images is
generated as the object is moved in a direction perpendicular to the sensor ring.
Other modalities of imaging based on the CAT principle include magnetic
resonance imaging (MRI) and positron emission tomography (PET). The
illumination sources, sensors, and types of images are different, but conceptually
they are very similar to the basic imaging approach shown in Fig.10(b).

3. Image Acquisition Using Sensor Arrays:

Figure 8(c) shows individual sensors arranged in the form of a 2-D array.
Numerous electromagnetic and some ultrasonic sensing devices frequently are
arranged in an array format. This is also the predominant arrangement found in
digital cameras. A typical sensor for these cameras is a CCD array, which can be
manufactured with a broad range of sensing properties and can be packaged in
rugged arrays of 4000 X 4000 elements or more. CCD sensors are used widely in
digital cameras and other light sensing instruments. The response of each sensor is
proportional to the integral of the light energy projected onto the surface of the
sensor, a property that is used in astronomical and other applications requiring low
noise images. Noise reduction is achieved by letting the sensor integrate the input
light signal over minutes or even hours. Since the sensor array shown in Fig.8(c)
is two-dimensional, its key advantage is that a complete image can be obtained by
focusing the energy pattern onto the surface of the array. Motion obviously is not
necessary, as is the case with the sensor arrangements discussed in the preceding
two sections.

The principal manner in which array sensors are used is shown in Fig.11.
This figure shows the energy from an illumination source being reflected from a
scene element, but, as mentioned at the beginning of this section, the energy also
could be transmitted through the scene elements. The first function performed by
the imaging system shown in Fig.11(c) is to collect the incoming energy and focus
it onto an image plane. If the illumination is light, the front end of the imaging
system is a lens, which projects the viewed scene onto the lens focal plane, as
Fig.11(d) shows. The sensor array, which is coincident with the focal plane,
produces outputs proportional to the integral of the light received at each sensor.
Digital and analog circuitry sweep these outputs and convert them to a video
signal, which is then digitized by another section of the imaging system. The
output is a digital image, as shown diagrammatically in Fig.11 (e).

IX). Scanners: -
Scanners are used to capture the image. Scanners may be hand-held or
fixed, with either the paper being fed through the scanner or the scanner moving
across the paper. Resolution varies from 100 dpi(dots per inch) to 1000 dpi.

The main problems associated with scanners include the following:

1. Only a still image can be captured.


2. Mechanical operation may not be reliable.
3. Hand-held operation depends on maintaining pressure and position.

X). Image Model:

The term image refers to a two-dimensional light-intensity function,


denoted by f(x, y), where the value or amplitude of f at spatial coordinates (x, y)
gives the intensity (brightness) of the image at that point. As light is a form of
energy f(x, y) must be nonzero and finite, that is,

0 < f(x, y) < ∞. ----------------1

The images people perceive in everyday visual activities normally consist
of light reflected from objects. The basic nature of f(x, y) may be characterized by
two components: (1) the amount of source light incident on the scene being viewed
and (2) the amount of light reflected by the objects in the scene. Appropriately,
they are called the illumination and reflectance components, and are denoted by
i(x, y) and r(x, y), respectively. The functions i(x, y) and r(x, y) combine as a
product to form f(x, y):

f(x, y) = i(x, y) r(x, y) --------------2

where

0 < i(x, y) < ∞ ---------------3


and

0 < r(x, y) < 1. ---------------4

Equation (4) indicates that reflectance is bounded by 0 (total absorption)
and 1 (total reflectance). The nature of i(x, y) is determined by the light source,
and r(x, y) is determined by the characteristics of the objects in a scene.
The values given in Eqs. (3) and (4) are theoretic bounds. The following
average numerical figures illustrate some typical ranges of i(x, y). On a clear day,
the sun may produce in excess of 9000 foot-candles of illumination on the surface
of the earth. This figure decreases to less than 1000 foot-candles on a cloudy day.
On a clear evening, a full moon yields about 0.01 foot-candle of illumination. The
typical illumination level in a commercial office is about 100 foot-candles.
Similarly, the following are some typical values of r(x, y): 0.01 for black velvet,
0.65 for stainless steel, 0.80 for flat-white wall paint, 0.90 for silver-plated metal,
and 0.93 for snow.
Throughout this book, we call the intensity of a monochrome image f at
coordinates (x, y) the gray level (l) of the image at that point. From Eqs.(2) through
(4), it is evident that l lies in the range

Lmin ≤ l ≤ Lmax --------------5

In theory, the only requirement on Lmin is that it be positive, and on Lmax that
it be finite. In practice, Lmin = imin rmin and Lmax = imax rmax. Using the preceding
values of illumination and reflectance as a guideline, the values Lmin ≈ 0.005 and
Lmax ≈ 100 for indoor image processing applications may be expected.

The interval [Lmin, Lmax] is called the gray scale. Common practice is to
shift this interval numerically to the interval [0, L], where l = 0 is considered black
and l = L is considered white on the scale. All intermediate values are shades of
gray varying continuously from black to white.
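A minimal C sketch of this shift, assuming a simple linear mapping (the notes only state that the interval is shifted numerically):

/* Map an intensity l lying in [l_min, l_max] linearly onto the scale [0, L].
   The linear form is assumed here for illustration. */
double to_gray_scale(double l, double l_min, double l_max, double L)
{
    return (l - l_min) * L / (l_max - l_min);
}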

Sampling and Quantization:

To be suitable for computer processing, an image function f(x, y) must be
digitized both spatially and in amplitude. Digitization of the spatial coordinates (x,
y) is called image sampling, and amplitude digitization is called gray-level
quantization.
Suppose that a continuous image f(x, y) is approximated by equally spaced
samples arranged in the form of an N x M array, where each element of the array is
a discrete quantity:

f(x, y) ≈ [ f(0,0)      f(0,1)      . . .   f(0,M-1)
            f(1,0)      f(1,1)      . . .   f(1,M-1)
              .            .                   .
            f(N-1,0)    f(N-1,1)    . . .   f(N-1,M-1) ]   ----------------1

The right-hand side of Eq.(1) represents what is commonly called a digital image.
Each element of the array is referred to as an image element, picture element,
pixel, or pel. The terms image and pixels will be used throughout the following
discussions to denote a digital image and its elements.
This digitization process requires decisions about values for N, M and the
number of discrete gray levels allowed for each pixel. Common practice in digital
image processing is to let these quantities be integer powers of two; that is,

N = 2^n,  M = 2^k --------------2

and

G = 2^m -----------------3

where G denotes the number of gray levels. The assumption in this section is that
the discrete levels are equally spaced between 0 and L in the gray scale. Using Eqs.
(2) and (3) yields the number, b, of bits required to store a digitized image:

b = N x M x m. ---------------4

If M = N,

b = N^2 x m. --------------------5

For example, a 128 x 128 image with 64 gray levels requires 98,304 bits of
storage.

XI). Intensity Images:
Light intensity can be translated into an electrical signal most simply by
using photosensitive cells or photosensitive resistive devices. One of these devices
can be used to make a primitive camera that generates a series of signals
representing levels of light intensity for each ‘spot’ on the picture. A system of
directing the light onto the sensitive cell is required so that the cell is looking at
each spot on the picture in turn until the whole picture has been ‘scanned’.

This was the principle behind Baird's first television (T.V.) system. The
scanning device was a circular disc with a number of holes drilled in it in a spiral
fashion. Each hole would allow the light from only one spot of the picture to reach
the light sensor. As the disc rotated, the hole would scan across the picture in a
spiral (arc) so that the sensor would register the light intensities in one line of the
picture. When that hole had passed the sensor, another hole would appear,
presenting a slightly different arc to the sensor. Thus with 8 holes it is possible to
create an eight-line image, each line being an arc. This is shown in Figure 12.
This principle of repeatedly measuring light intensity until a whole image has been
scanned is still used, though with considerable sophistication.

XII). Some basic relations between pixels:

1. Neighbors of a pixel and adjacency:

The neighborhood of a pixel can be defined in three ways:
i). 4-neighbors
ii). Diagonal-neighbors
iii). 8-neighbors

i). 4-neighbors:
A pixel ‘p’ at coordinates (x, y) has four horizontal and vertical neighbors
whose coordinates are given by
(x+1, y), (x-1, y), (x, y+1) and (x, y-1).
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a
unit distance from (x, y).

ii). Diagonal-neighbors:
The four diagonal neighbors of ‘p’ have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1) and (x-1, y-1),
and are denoted by ND(p).

iii). 8-neighbors:
The above 4-neighbors and diagonal-neighbors together are called the 8-
neighbors of ‘p’ and is denoted by N8(p).

Adjacency:
Let V be the set of values used to define adjacency. In a binary image,
V={1} if we are referring to adjacency of pixels with value 1. In a gray scale
image, the idea is the same, but set V typically contains more elements. For
example, in the adjacency of pixels with a range of possible gray- level values 0 to
255, set V could be any subset of these 256 values.

We consider three types of adjacencies:


i).4-adjacency
ii).8-adjacency
iii).m-adjacency
i). 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in
the set N4(p).
ii). 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in
the set N8(p).
iii). m-adjacency (mixed adjacency): Two pixels p and q with values from V are
m-adjacent if
a). q is in N4(p), or
b). q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
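These neighborhood tests reduce to simple coordinate arithmetic; a minimal C sketch (membership of the pixel values in the set V is left to the caller):

#include <stdlib.h>   /* abs() */

/* q at (xq, yq) is a 4-neighbor of p at (xp, yp)? */
int in_n4(int xp, int yp, int xq, int yq)
{
    return abs(xp - xq) + abs(yp - yq) == 1;
}

/* q is a diagonal neighbor of p? */
int in_nd(int xp, int yp, int xq, int yq)
{
    return abs(xp - xq) == 1 && abs(yp - yq) == 1;
}

/* q is an 8-neighbor of p? */
int in_n8(int xp, int yp, int xq, int yq)
{
    return in_n4(xp, yp, xq, yq) || in_nd(xp, yp, xq, yq);
}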

2. Different types of Distances Measures:

If p and q are two pixels with co-ordinates (x1, y1) and (x2, y2), then we can
define three types of distances between p and q.

i. Euclidean distance i.e. ‘De’-distance.


ii. City block distance or ‘D4’-distance.
iii. Chess board distance or ‘D8’- distance.

where,
De(p, q) = √[(x2 − x1)² + (y2 − y1)²]
D4(p, q) = |x2 − x1| + |y2 − y1|
D8(p, q) = max(|x2 − x1|, |y2 − y1|)

Note1:- In the case of pixels having a ‘D4’ distance from (x, y) less than or equal to
some value ‘r’, form a diamond, centered at (x, y).

Ex:- The pixels with D4 distance ≤ 2 from (x, y) is shown below.

2
2 1 2
2 1 0 1 2
2 1 2
2

Note2:- In the case of pixels with ‘D8’ distance from (x, y) less than or equal to
some value ‘r’ forms a square, centered at (x, y).

Ex:- the Pixels with D8 distance ≤ 2 from (x, y) is shown below

2 2 2 2 2
2 1 1 1 2

2 1 0 1 2
2 1 1 1 2
2 2 2 2 2
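A minimal C sketch of the three distance measures between p = (x1, y1) and q = (x2, y2):

#include <math.h>     /* sqrt */
#include <stdlib.h>   /* abs  */

/* Euclidean distance De. */
double de_distance(int x1, int y1, int x2, int y2)
{
    double dx = x2 - x1, dy = y2 - y1;
    return sqrt(dx * dx + dy * dy);
}

/* City-block distance D4. */
int d4_distance(int x1, int y1, int x2, int y2)
{
    return abs(x2 - x1) + abs(y2 - y1);
}

/* Chessboard distance D8. */
int d8_distance(int x1, int y1, int x2, int y2)
{
    int dx = abs(x2 - x1), dy = abs(y2 - y1);
    return dx > dy ? dx : dy;
}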

3. Arithmetic/Logic Operations:
Arithmetic and logic operations between pixels are used extensively in most
branches of image processing. The arithmetic operations between two pixels 'p'
and 'q' are denoted as follows:

Addition: p + q
Subtraction: p − q
Multiplication: p * q
Division: p ÷ q

Arithmetic operations on entire images are carried out pixel by pixel.

• The main use of image addition is image averaging to reduce noise.

• Image subtraction is a basic tool in medical imaging, where it is used to
remove background information.

• The main use of image multiplication (or division) is to correct gray-level
shading resulting from non-uniformities in illumination or in the sensor
used to acquire the image.

Logical operations:
The principal (main) logical operations used in image processing are AND,
OR and COMPLEMENT. These three operations are "functionally complete" in
the sense that they can be combined to form any other logical operation.

Note that the logical operations apply only to binary images, whereas arithmetic
operations apply to multivalued pixels.

Logical operations are basic tools in binary image processing, where they
are used for tasks such as masking, feature detection and shape analysis. Logical
operations on entire images are performed pixel by pixel. The different types of
logical operations are shown in Fig.5 below.
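A minimal C sketch of pixel-by-pixel logical operations on binary images stored as flat arrays of 0/1 values (the flat-array representation is an assumption for illustration):

/* Pixel-by-pixel AND, OR and COMPLEMENT on binary images of n pixels. */
void image_and(const unsigned char *a, const unsigned char *b, unsigned char *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = a[i] & b[i];
}

void image_or(const unsigned char *a, const unsigned char *b, unsigned char *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = a[i] | b[i];
}

void image_not(const unsigned char *a, unsigned char *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = 1 - a[i];
}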

4. Window operations:
In addition to pixel-by-pixel processing on entire image, arithmetic and
logical operations are used in neighborhood-oriented operations. Neighborhood
processing typically is formulated in the context of so called ‘mask operation’ (or
template, window, filter operation).

The idea behind the mask operation is to let the value assigned to a pixel be
a function of its gray level and the gray level of its neighbors. For instance
consider the subimage area shown in Fig.13, and suppose that we want to replace
the value of z5 with the average value of the pixels in a 3x3 region centered at the
pixel with value z5. To do so entails performing an arithmetic operation of the
form

z = (1/9)(z1 + z2 + ........ + z9) = (1/9) Σ (i=1 to 9) zi

and assigning the value of z to z5.

With reference to the mask shown in Fig.13(b), the same operation can be
obtained in more general terms by centering the mask at z5, multiplying each pixel
under the mask by the corresponding coefficient, and adding the results; i.e.,

z = w1 z1 + w2 z2 + ........ + w9 z9 = Σ (i=1 to 9) wi zi ------2.4-5

Equation (2.4-5) is used widely in image processing. Proper selection of the
coefficients and application of the mask at each pixel position in an image makes
possible a variety of useful image operations, such as noise reduction, region
thinning and edge detection. However, applying a mask at each pixel location in
an image is a computationally expensive task. For example, applying a 3x3 mask to a
512x512 image requires nine multiplications and eight additions at each pixel
location, for a total of 23,59,296 multiplications and 20,97,152 additions.
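A minimal C sketch of applying a 3x3 mask of coefficients w1..w9 to a gray image (border pixels are simply skipped here; how borders are treated in practice varies):

/* Apply a 3x3 mask w (w[0]..w[8], row by row) to every interior pixel of img
   (size rows x cols, stored row-major) and write the result to out. */
void apply_mask_3x3(const double *img, double *out, int rows, int cols, const double w[9])
{
    for (int y = 1; y < rows - 1; y++) {
        for (int x = 1; x < cols - 1; x++) {
            double z = 0.0;
            int k = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    z += w[k++] * img[(y + dy) * cols + (x + dx)];
            out[y * cols + x] = z;   /* with all w = 1/9 this is the 3x3 average */
        }
    }
}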

5. Coordinate operations:
Finally, operations like moving co-ordinates along an axis (usually known
as translation), rotating them about an axis, and scaling dimensions along an axis
are very common in 3D computer graphics. These operations are usually
represented as matrices. The following matrices are used for translation, scaling
and rotation.

i). Translation by tx, ty, tz:


[ 1   0   0   tx ]
[ 0   1   0   ty ]
[ 0   0   1   tz ]
[ 0   0   0   1  ]

ii). Scaling by sx, sy & sz:

[ sx   0    0    0 ]
[ 0    sy   0    0 ]
[ 0    0    sz   0 ]
[ 0    0    0    1 ]

Then the rotations about the x-, y- and z-axes by θ are, respectively:

Rotation about the x-axis:            Rotation about the y-axis:
[ 1      0       0      0 ]           [ cosθ   0   −sinθ   0 ]
[ 0    cosθ    sinθ     0 ]           [ 0      1    0      0 ]
[ 0   −sinθ    cosθ     0 ]           [ sinθ   0    cosθ   0 ]
[ 0      0       0      1 ]           [ 0      0    0      1 ]

Rotation about the z-axis:
[ cosθ    sinθ   0   0 ]
[ −sinθ   cosθ   0   0 ]
[ 0       0      1   0 ]
[ 0       0      0   1 ]

6. Image Zooming:
Image zooming means increasing the displayed size of the given image.
Zooming can be
i). Horizontal zooming
ii). Vertical zooming
iii). Uniform zooming.

i). Horizontal zooming:


If the given image is

4 6 2 5
2 2 4 6

3 4 6 7
 
5 4 4 5

the logic to zoom this image twice horizontally is to place each gray value twice
in the horizontal direction, as shown below.

4 4 6 6 2 2 5 5
2 2 2 2 4 4 6 6 

3 3 4 4 6 6 7 7
 
5 5 4 4 4 4 5 5

so that the size of the resulting array is 4x8, i.e. if mxn is the size of the given
image, then zooming it twice horizontally makes the resulting image size mx2n.

ii). Vertical Zooming:


For the given image, the resulting array after vertical zooming is shown
below. Here, the size of the resulting array becomes 8x4, i.e. if mxn is the size of
the given image, then the size of the image after zooming twice vertically becomes
2mxn.

4 6 2 5
4 6 2 5 

2 2 4 6
 
2 2 4 6
3 4 6 7
 
3 4 6 7
5 4 4 5
 
5 4 4 5 

iii). Uniform zooming:


For the given image, the resulting array after uniform zooming twice (i.e.
the image is zoomed both horizontally and vertically) is shown below. Here, the size of
the resulting array becomes 8x8, i.e. if mxn is the size of the given image then the
size of the image after zooming twice uniformly becomes 2mx2n. If we zoom it
thrice it becomes 3mx3n.

4 4 6 6 2 2 5 5
4 4 6 6 2 2 5 5 

2 2 2 2 4 4 6 6
 
2 2 2 2 4 4 6 6
3 3 4 4 6 6 7 7
 
3 3 4 4 6 6 7 7
5 5 4 4 4 4 5 5
 
5 5 4 4 4 4 5 5 

Fig.: Original image, horizontally zoomed image and vertically zoomed image.
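A minimal C sketch of zooming by replication that covers all three cases through separate vertical and horizontal factors (integer factors are assumed):

/* Zoom img (rows x cols) by integer factors zy (vertical) and zx (horizontal)
   using pixel replication; out must hold (rows*zy) x (cols*zx) values. */
void zoom_replicate(const unsigned char *img, unsigned char *out,
                    int rows, int cols, int zy, int zx)
{
    int out_cols = cols * zx;
    for (int y = 0; y < rows * zy; y++)
        for (int x = 0; x < out_cols; x++)
            out[y * out_cols + x] = img[(y / zy) * cols + (x / zx)];
}
/* zy = 1, zx = 2 : horizontal zooming (mxn -> mx2n)
   zy = 2, zx = 1 : vertical zooming   (mxn -> 2mxn)
   zy = 2, zx = 2 : uniform zooming    (mxn -> 2mx2n) */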

7. Different ways of converting a Gray image to a Binary image:

To convert a gray image into a binary image, we have to choose a threshold;
then, if the gray value of a pixel is greater than that threshold, we assign one value,
and if it is less than or equal to the threshold, we assign some other value, so that the
resultant image has only two values (binary).
The different ways to choose the threshold are:
i). Threshold = average of all the gray values of the given image.
ii). Threshold = (minimum value + maximum value)/2
iii). Threshold = median of the given gray values.
Ex: If the given image is of size 64x64, in img[64][64], then the following is the
logic to convert it into a binary image:

for(i = 0; i < 64; i++)
    for(j = 0; j < 64; j++)
        if(img[i][j] > th)
            img[i][j] = 1;
        else
            img[i][j] = 0;
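Here th can be chosen by method (i) above, i.e. the average of all the gray values; a minimal sketch:

/* Method (i): threshold = average of all gray values in img[64][64]. */
int average_threshold(unsigned char img[64][64])
{
    long sum = 0;
    for (int i = 0; i < 64; i++)
        for (int j = 0; j < 64; j++)
            sum += img[i][j];
    return (int)(sum / (64 * 64));
}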

The images produced in this way are binary images.

XIV). Linear and Non-Linear Operations:


Let H be an operator whose input and output are images. H is said to be a
linear operator if, for any two images f and g and any two scalars a and b,

H (af + bg )= aH(f) + bH(g).

In other words, the result of applying a linear operator to the sum of two
images (that have been multiplied by the constants shown) is identical to applying
the operator to the images individually, multiplying the results by the appropriate
constants, and then adding those results. For example, an operator whose function
is to compute the sum of K images is a linear operator. An operator that computes
the absolute value of the difference of two images is not. An operator that fails the
test of the above equation is by definition nonlinear.

XV). Color Images and Color models:

Color image processing is divided into two major areas


i). Full color image processing
ii). Pseudo color image processing

In the first category, the images are acquired with a full-color sensor, such
as a color T.V. camera or color scanner.
In the second category, the problem is one of assigning a color to a
particular monochrome intensity or range of intensities.

Color fundamentals:
In 1666, Sir Isaac Newton discovered that when a beam of sunlight passes
through a glass prism, the emerging beam of light is not white but consists instead
of a continuous spectrum of colors ranging from violet at one end to red at the
other. This is shown in Fig.6.1.

As we know, the cones are the sensors in the eye responsible for color
vision. Approximately 65% of all cones are sensitive to red light, 33% are
sensitive to green light, and only 2% are sensitive to blue. Due to these absorption
characteristics of the human eye, these three colors (i.e. Red, Green and Blue) are
known as the primary colors. These colors are generally identified by their
wavelengths: the wavelength of red is 700nm, of green 546.1nm and of blue 435.8nm.
The primary colors can be added to produce the secondary colors of light. The
secondary colors are Yellow (red plus green), Cyan (green plus blue) and Magenta
(blue plus red).
In printing technology, however, Yellow, Cyan and Magenta are known as the primary
colors and Red, Green and Blue as the secondary colors. If you look at the
cartridge used in your color printer, you will find it consists of Cyan,
Magenta and Yellow.
(See the diagrams on the next page.)

It is observed from Fig.6.4(b) that the mixture of Yellow, Cyan and Magenta
gives a muddy black, which is not pure black. So, color printers generally use two
cartridges: one consisting of the Cyan, Magenta and Yellow colors, and the other a
cartridge of black color.

Color Models: (HSI is the best model)


The purpose of a color model is to facilitate the specification of color in some
standard, generally accepted way. In essence, a color model is a specification of a
coordinate system and a subspace within that system where each color is
represented by a single point.

A method of specifying the exact color assigned to a given pixel is known
as a color model.
The different color models are:
i. RGB
ii. CMY
iii. CMYK
iv. HSI or HSB or HSV (Hue, Saturation and Intensity/Brightness/Value)
v. HLS
vi. CIE

(i) RGB color model:

A color model (a means by which color can be mathematically described) in
which a given color is specified by the relative amounts of the three primary colors
red, green and blue. This model is based on the Cartesian co-ordinate system. The
RGB color model is shown below, in which the R, G and B values are at three corners;
Cyan, Magenta and Yellow are at three other corners; Black is at the origin and
White is at the corner furthest from the origin. In this model, the gray scale extends
from black to white along the line joining these two points. The different colors in
this model are points on or inside the cube, and are defined by vectors extending
from the origin. The amount of each color is specified by a number from 0 to 255;
0,0,0 is black, while 255,255,255 is white. But for convenience, all color values
have been normalized as shown in the cube, i.e. all values are assumed to be in the
range [0,1].

Consider an RGB image in which each of the red, green and blue images is
an 8-bit image. Under these conditions each RGB color pixel is said to have a
depth of 24 bits (3 image planes times the number of bits per plane). The term full-
color is used to denote a 24-bit RGB color image. The total number of colors in a
24-bit color cube is (2^8)^3 = 1,67,77,216. Fig.6.8 shows the RGB color cube
corresponding to the above diagram.

ii). CMY and CMYK Color Models:

We know that Cyan, Magenta and Yellow are the secondary colors of light,
or alternatively, the primary colors of pigments. For example when a surface

coated with cyan pigment is illuminated with white light, no red light is reflected
from the surface i.e. cyan subtracts red light from the reflected white light, which
itself is composed of equal amounts of red, green and blue light.
Most devices that deposit colored pigments on paper, such as color
printers and copiers require CMY data input or perform an RGB to CMY
conversion internally. This conversion is performed using the simple operation as
shown below:

[ C ]   [ 1 ]   [ R ]
[ M ] = [ 1 ] − [ G ]   ------------------1
[ Y ]   [ 1 ]   [ B ]

i.e. white light contains equal (normalized) amounts of red, green and blue, and
yellow is an equal mixture of red and green with no blue, so yellow is obtained by
removing the blue component:
Y = (R + G + B) − B
  = R + G

From equation 1, we can get the RGB values from a set of CMY values by subtracting
the individual CMY values from 1.
As noted, equal amounts of the pigment primaries, i.e. Cyan, Magenta and
Yellow, should produce black. In practice, combining these colors for
printing produces a muddy-looking black. So, in order to produce true black (which
is the predominant color in printing), a fourth color, black, is added, giving rise to the
CMYK model. Thus, whenever we speak of four-color printing, we are referring to the
three colors of the CMY color model plus black, i.e. CMYK.
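A minimal C sketch of Eq.1, applied to normalized values in [0, 1]:

/* Convert a normalized RGB triple (each in [0, 1]) to CMY using Eq.1. */
void rgb_to_cmy(double r, double g, double b, double *c, double *m, double *y)
{
    *c = 1.0 - r;
    *m = 1.0 - g;
    *y = 1.0 - b;
}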

(optional extra information)


XVI). Satellite Imagery:
Satellite Imagery is widely used in military, meteorological, geological and
agricultural applications. Weather satellite pictures (such as from NOAA) are
commonplace. Surveying satellites such as LANDSAT and ERTS (Earth
Resources Technology Satellites) have similar mechanical scanners, typically
scanning six horizontal scan lines at a time (with six sensors) and producing an
image of very high quality. The rotation of the earth produces the scan in the
vertical direction as show in the fig.14. Such a picture covers approximately 100 X
100 miles. Later versions of LANDSAT included a Thematic Mapper system.
This is an optical-mechanical sensor that records emitted and reflected energy
through the infrared and visible spectra with a resolution of 30 X 30 meters of land
surface per pixel.
Jensen (1986) includes a table of pixel size versus aircraft flight altitude.
Using the equipment then available, an aircraft at 100 m could capture an image with pixel
sizes of 2.5 m².
According to the available information, photographs from a Soviet
KFA-1000 satellite, 180 miles up, show a resolution of less than 5x5 per pixel, and
this suggests that current technology may be able to identify, from the same
altitude, the name on a lunch-box.

Images acquired by the satellite's sensors are transmitted via a radio
frequency signal, which is picked up at a ground station and relayed to the control
station for the satellite. Signals sent from the control station via the ground station
to the satellite can be used to alter its position or the position of its sensors, to allow
the desired images to be acquired.

One interesting aspect of remote sensing is that satellites often carry several
sensors. Each of these detects radiation in a different part of the spectrum, for
example infrared, visible or ultraviolet. Each detector is said to record a specific
band of the electromagnetic spectrum. The reason for recording data from multiple
bands is to increase the amount of data available for each point on the earth’s
surface. This aids the task of discriminating between the various kinds of ground
cover which can occur, making it possible to distinguish between, for example
forests, urban areas, deserts and arable farm land. Once distinguished, various
areas can be monitored to see how they are changing with time.

There is, unfortunately, little standardization in the way in which remotely
sensed data is made available by the various agencies which collect it. There is a
wide variety of formats for the files of computer data which are distributed. This is
made worse by the fact that each file usually contains not only the image but also a
great deal of data related to the image. For example, the date and time when the
image was acquired, the location of the satellite and the direction in which the
sensor was pointing are vital pieces of information which will allow the image to
be analyzed later. Perhaps, eventually, some standard method of making this
information available will appear. For the time being, it is often necessary to write
special programs to allow these different data formats to be imported into various
image processing systems.

XVII). Range Images:
An example of a range image is one captured by radar. In this type of
image, light intensity is not captured; instead, the object's distance from the sensor is
modeled. Range images are particularly useful for navigation, much
more so than light intensity images. A moving vehicle needs to know the distances
of objects from it, or the distance to a gap where the continuous floor ends; otherwise it
will bump into or fall off something. Light intensity images cannot give that
information unless distance can be estimated from prior knowledge about size, or
from a stereoscopic vision system.
Ranging Devices
The different ranging devices are
i). Ultrasound radar
ii). Laser radar.

i). Ultrasound radar:
Ultrasound is widely used for short-range (up to 40 m) image
collection. It is not practical for long-range image collection for a number of
reasons:
a). Insufficient transmitted energy would be detected by the receiver.
b). There is a tendency for the sound to bounce more than once (double echo) if the
terrain is not conducive to ultrasound radar.
c). There may be a significant amount of ambient ultrasound present, which
makes collection of the returning signal 'noisy'.

For more distant objects the electromagnetic spectrum has to be used. This may
mean classical radar technology or laser range finding.

ii). Laser radar:

Laser radar is unaffected by ambient noise and unlikely to be reflected
twice back to the source, thus making it a valuable ranging system. Pulses of light
are transmitted at points equivalent to pixels on the image, in the direction of the
perspective ray. The transmitter is switched off and the receiver sees the increase
in light intensity and calculates the time taken for the beam to return. Given the
speed of light, the round-trip distance can be calculated. Clearly, with such short
times, the time for the sensor's own reaction is significant; this has to be eliminated by
calibration hardware.
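In other words, if t is the measured round-trip time of the pulse, the range is simply d = c·t/2, with c ≈ 3 × 10^8 m/s. For example (an illustrative figure, not from the text), an object 40 m away returns the pulse after only about 267 ns, which is why the sensor's own reaction time must be calibrated out.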
An imaging laser radar is available commercially for airborne hydrographic
surveying. This system can measure water depths down to 40 m with an accuracy
of 0.3 m from an aerial standoff of 500 m.

