
DIGITAL IMAGE PROCESSING

WORKSHOP
DAY 1
What is Computer Vision
• Computer Vision is the field of Computer Science that works
on enabling computers to see, identify and process images
in the same way that human vision does.
• The final objective is to extract some meaningful information
or output a decision.
• Examples:
– Is there an obstacle in front of the robot?
– Does the image contain a shark?
– How many humans are present in the given image?
– Are there moving cars in the image?
What is Image Processing
• Image processing is a method to perform some operations on an
image, in order to get an enhanced image or to extract some useful
information from it.
• It is a type of signal processing in which input is an image and
output may be image or characteristics/features associated with
that image.
• Image processing basically includes the following three steps:
– Importing the image via image acquisition tools;
– Analysing and manipulating the image;
– Producing the output, which may be an image or features extracted from it.
What’s the difference
• Computer vision is related to image processing in the sense
that the computer vision front-end is comprised of image
processing techniques such as noise reduction, whitening or
image enhancement.
Image Processing :
• The goal of image processing is to enhance or compress image/video
information.
• Uses pixel-wise operations such as transforming one image into another,
for example applying a rotation to the pixels.
• There is no extraction of meaningful information from those pixel-wise
operations.

Computer Vision :
• The goal of computer vision is to extract meaningful information from
images/videos.
• Computer vision is not limited to pixel-wise operations; it can be complex,
like image classification, object detection etc.
• The basic step of computer vision is processing of the image. In other words,
image processing is a subset of computer vision.
Components of an Image Processing System
Fundamental steps in Image Processing

Reference: Gonzalez, Woods


Image Sampling and Quantization
• The output of most sensors is a continuous voltage waveform
whose amplitude and spatial behavior are related to the
physical phenomenon being sensed.
• To create a digital image, we need to convert the continuous
sensed data into digital form.
• This involves two processes: sampling and quantization.
• Digitizing the coordinate values is called sampling.
• Digitizing the amplitude/ intensity values is called quantization.
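The two steps above can be sketched on a one-dimensional "scan line". This is a minimal illustration, assuming NumPy is installed; the function name `sample_and_quantize` and the sinusoidal test signal are made up for this example and are not part of the workshop material.

```python
import numpy as np

def sample_and_quantize(signal, num_samples, num_levels):
    """Sample a continuous 1-D signal on [0, 1] and quantize its amplitude.

    signal      : function mapping positions in [0, 1] to values in [0, 1]
    num_samples : number of equally spaced sample positions (sampling)
    num_levels  : number of discrete intensity levels (quantization)
    """
    positions = np.linspace(0.0, 1.0, num_samples)   # digitize the coordinate
    values = signal(positions)
    # Digitize the amplitude: map each value to the nearest integer level.
    levels = np.round(values * (num_levels - 1)).astype(int)
    return positions, levels

def scan_line(x):
    """Hypothetical continuous intensity profile: one period of a sinusoid."""
    return 0.5 + 0.5 * np.sin(2 * np.pi * x)

positions, levels = sample_and_quantize(scan_line, num_samples=9, num_levels=4)
print(positions)
print(levels)
```

Increasing `num_samples` gives finer spatial detail, while increasing `num_levels` gives finer intensity steps.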
Figure: Generating a digital image. (a) Continuous image. (b) A scan line from A to B in the continuous image, used to
illustrate the concepts of sampling and quantization. (c) Sampling and quantization. (d) Digital scan line.

© 2002 R. C. Gonzalez & R. E. Woods

  0   0   0  75  75  75 128 128 128 128
  0  75  75  75 128 128 128 255 255 255
 75  75  75 200 200 200 255 255 255 200
128 128 128 200 200 255 255 200 200 200
128 128 128 255 255 200 200 200  75  75
175 175 175 225 225 225  75  75  75 100
175 175 100 100 100 225 225  75  75 100
 75  75  75  35  35  35   0   0   0  35
 35  35  35   0   0   0  35  35  35  75
 75  75  75 100 100 100 200 200 200 200


Sampling

[Figure: the same image shown at sampling resolutions of 1024, 512, 256, 128, 64 and 32.]
PYTHON INTRODUCTION
What is Python
• A readable, dynamic, flexible, fast and powerful language
• Multi-purpose (Web, GUI, Scripting)
• Object oriented
• Cross Platform
• Python cares about Indentation
Indentation
One of the distinctive features of Python is its use of indentation to
mark blocks of code

In the first snippet, the else part belongs to the 1st if statement.
In the second snippet, the else part belongs to the 2nd if statement.
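The original snippets are not reproduced in this text, so the following is a reconstruction of the idea rather than the slide's own code. The function names and return strings are made up for illustration; only the indentation of `else` differs between the two functions.

```python
def else_on_first_if(a, b):
    """The else lines up with the 1st if, so it runs whenever a is false."""
    result = "nothing"
    if a:
        if b:
            result = "a and b"
    else:
        result = "not a"
    return result

def else_on_second_if(a, b):
    """The else lines up with the 2nd if, so it runs when a is true and b is false."""
    result = "nothing"
    if a:
        if b:
            result = "a and b"
        else:
            result = "a but not b"
    return result

print(else_on_first_if(True, False))     # nothing
print(else_on_second_if(True, False))    # a but not b
print(else_on_first_if(False, False))    # not a
print(else_on_second_if(False, False))   # nothing
```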
Comments
• Single-line comments are created simply by adding a hash
character (#) at the beginning of the line.
• Comments that span multiple lines are created by adding a
delimiter (""") on both ends of the comment.
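A short sketch of both comment styles follows; the variable names are arbitrary. Note that a triple-quoted block is strictly a string literal that is never used, which is why it can serve as a multi-line comment.

```python
# This is a single-line comment.
width = 640  # A comment can also follow code on the same line.

"""
This text spans several lines.
Strictly speaking it is a string literal that is never used,
which is why it behaves like a comment.
"""
height = 480

print(width * height)  # 307200
```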
Data Types in Python
Python has five standard data types:
• Numbers
• String
• List
• Dictionary
• Tuple
Python sets the variable type based on the value that is assigned to it.
The variable type will be changed if the variable is set to another
value.
Data Types in Python
• Numbers - integers, floating point numbers, complex numbers
eg. Year = 2010
pi = 3.1415
• Strings - eg. Name = "Ramesh Gokhle"

Strings are immutable in Python, i.e. after defining a string you cannot replace or change
its characters in place:
s = "Hello"
s[0] = "A" is not permitted in Python (it raises a TypeError).
Strings can be concatenated using the '+' operator,
for example:
if a = "Hello" and b = "World", then c = a + b is "HelloWorld" (no space is inserted automatically)
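The following sketch shows string immutability and concatenation in practice. The variable names follow the text above; building a new string with slicing is one common workaround, not the only one.

```python
s = "Hello"

try:
    s[0] = "A"          # strings do not support item assignment
except TypeError as err:
    print("not permitted:", err)

# To "change" a string, build a new one instead.
t = "A" + s[1:]
print(t)                # Aello

a = "Hello"
b = "World"
print(a + b)            # HelloWorld
print(a + " " + b)      # Hello World
```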
Data Types in Python
• List – A list can contain a series of values. It can be heterogeneous i.e. it can store values of different
types, such as [“A”, 1, 2, “3”, 4.0]

a) Operation b) Result
Data Types in Python
• All lists are zero-indexed, i.e. the first
element is at index 0.
• In a two dimensional array, the first number is the number of
rows; the second number is the number of columns.
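The table of list operations is missing from this text, so here is a small sketch of a few common ones, together with zero-based and row/column indexing. The list contents reuse the heterogeneous example above; `grid` is a made-up nested list standing in for a two-dimensional array.

```python
values = ["A", 1, 2, "3", 4.0]   # heterogeneous list from the text

values.append("new")      # add one element at the end
values.insert(1, "B")     # insert "B" at index 1
values.remove(2)          # remove the first element equal to 2
last = values.pop()       # remove and return the last element

print(values)             # ['A', 'B', 1, '3', 4.0]
print(last)               # new
print(len(values))        # 5
print(values[0])          # A  (the first element is at index 0)
print(values[1:3])        # ['B', 1]

grid = [[1, 2, 3],
        [4, 5, 6]]        # 2 rows, 3 columns
print(grid[1][2])         # 6  (row 1, column 2)
```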
Data Types in Python
• Dictionaries : They are collections of <Key : Value> pairs. They can be used
to hold related information. Values are looked up by
key name, unlike lists, where index numbers are used.
Dictionaries can be used to sort, iterate and compare data.
Dictionaries are created by using braces { } with pairs separated by a comma (,)
and each key associated with its value by a colon (:). Keys must be unique.
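A minimal dictionary sketch follows. The name `room_num` is borrowed from the Control Flow slide later in this deck; the names and room numbers stored in it are invented for illustration.

```python
room_num = {"john": 425, "liz": 212}   # hypothetical data

print(room_num["john"])        # 425, looked up by key
room_num["isaac"] = 345        # add a new key : value pair
room_num["liz"] = 213          # keys are unique, so this overwrites

for key in sorted(room_num):   # iterate over the keys in sorted order
    print(key, room_num[key])

print("liz" in room_num)       # True
```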
Data Types in Python

• Tuple : Tuples are a group of values like a list and are
manipulated in similar ways. But tuples are fixed in size once
they are assigned.
• Tuples are defined by parentheses ().
• eg. myGroup = ('Rhino', 'Grasshopper', 'Flamingo', 'Bongo')
• Tuples have no append or extend method. Elements cannot
be removed from a tuple.
• Tuples are faster than lists.
• It makes your code safer if you “write-protect” data that does
not need to be changed.
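The behaviour described above can be checked with a short sketch. It reuses the `myGroup` tuple from the text; the replacement value `'Zebra'` is arbitrary.

```python
myGroup = ('Rhino', 'Grasshopper', 'Flamingo', 'Bongo')

print(myGroup[0])        # Rhino
print(len(myGroup))      # 4
print(myGroup[1:3])      # ('Grasshopper', 'Flamingo')

try:
    myGroup[0] = 'Zebra'     # tuples are write-protected
except TypeError as err:
    print("cannot modify:", err)

print(hasattr(myGroup, "append"))   # False
```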
Control Flow
• Conditionals : if, if else, elif
• For loop :
    for x in range(10):
        print(x)
• Expanded for loop :
    for key, value in room_num.items():
        print(key, value)
• While loop :
    x = 0
    while x < 100:
        print(x)
        x += 1
Functions
• Basic function :
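The slide's own example is not reproduced in this text. As a stand-in, here is a minimal sketch of a function definition with a default argument; `rectangle_area` is a made-up name chosen for illustration.

```python
def rectangle_area(width, height=1):
    """Return the area of a rectangle; height defaults to 1."""
    return width * height

print(rectangle_area(3, 4))   # 12
print(rectangle_area(5))      # 5
```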


Input via keyboard
• Some programs need input from the keyboard. For this purpose Python provides the input() function:
as soon as input() is encountered, program execution pauses until input is entered at the keyboard.
• It takes one optional argument, the prompt text, which appears on the screen when the program
requests input.
• input() always returns a string, so to get another type, explicit conversion is required.

a) Program b) Result
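The program and result shown on the slide are missing here, so this is an illustrative sketch only. The function `ask_for_age` and its `read` parameter are assumptions made so the example can run without a keyboard; a real program would simply call `input()` directly.

```python
def ask_for_age(read=input):
    """Prompt for an age and return it as an int.

    `read` defaults to the built-in input(); it is a parameter only so the
    function can be exercised without a keyboard.
    """
    text = read("Enter your age: ")   # input() always returns a string
    return int(text)                  # explicit conversion to int

# Simulate a user typing 21 (a real program would just call ask_for_age()).
age = ask_for_age(read=lambda prompt: "21")
print(age + 1)   # 22
```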
NUMPY INTRODUCTION
What is NumPy
• NumPy is a library for the Python language; the name stands for Numerical Python.
• It adds support for large multi-dimensional arrays and matrices, and a large
collection of high-level mathematical functions to operate on these arrays.
• Operations using NumPy :
– Mathematical and logical operations on arrays.
– Fourier transforms and routines for shape manipulation.
– Operations related to linear algebra. NumPy has inbuilt functions for linear algebra and random
number generation.
• NumPy is often used along with packages like SciPy (Scientific Python) and
Matplotlib (plotting library).
How to use Numpy with Python
• The standard Python distribution doesn't come bundled with
the NumPy module.
• We need to install NumPy using the Python package installer :
pip install numpy

• In order to use NumPy, we need to import it first :
import numpy as np

• The default data type used by NumPy array-creation functions such as np.zeros is floating point (float64).


NumPy Arrays
• A numpy array is a grid of values, all of the same type, and is
indexed by a tuple of non-negative integers.
• The number of dimensions is the rank of the array; the shape of
an array is a tuple of integers giving the size of the array along
each dimension.
• We can initialize numpy arrays from nested Python lists, and
access elements using square brackets:
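The example that followed on the slide is not present in this text. The sketch below, assuming NumPy is installed, shows initialization from nested lists, the shape tuple, and element access with square brackets.

```python
import numpy as np

a = np.array([1, 2, 3])                 # rank 1 array
print(a.shape)                          # (3,)
print(a[0], a[1], a[2])                 # 1 2 3
a[0] = 5                                # change an element
print(a)                                # [5 2 3]

b = np.array([[1, 2, 3], [4, 5, 6]])    # rank 2 array from nested lists
print(b.ndim)                           # 2
print(b.shape)                          # (2, 3)
print(b[0, 0], b[0, 1], b[1, 0])        # 1 2 4
```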
Functions to create special types of arrays
• np.arange([start,] stop[, step], dtype=None)
• This function is used to build a vector containing an arithmetic progression.
>>> print(np.arange(0.0, 0.4, 0.1))
[0.  0.1 0.2 0.3]

• np.mgrid is an nd_grid instance which returns a dense multidimensional "meshgrid".
• The dimensions and number of the output arrays are equal to the
number of indexing dimensions.
• If the step length is not a complex number, then the stop is not inclusive.
>>> np.mgrid[0:3, 0:3]
array([[[0, 0, 0],
        [1, 1, 1],
        [2, 2, 2]],

       [[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]]])
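The slide listing the array-creation functions did not survive in this text. The sketch below shows a few NumPy constructors that are commonly grouped under this heading; which of them the slide actually covered is an assumption.

```python
import numpy as np

print(np.zeros((2, 2)))        # 2x2 array of zeros
print(np.ones((1, 2)))         # 1x2 array of ones
print(np.full((2, 2), 7))      # 2x2 array filled with 7
print(np.eye(2))               # 2x2 identity matrix
print(np.arange(0, 10, 2))     # [0 2 4 6 8]

rows, cols = np.mgrid[0:3, 0:3]
print(rows.shape, cols.shape)  # (3, 3) (3, 3)
print(rows[2, 0], cols[2, 0])  # 2 0

# A complex "step" is read as a number of points, and the stop is included.
points = np.mgrid[0:1:5j]
print(points)                  # 5 points from 0 to 1 inclusive
```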
Array Methods

>>> arr.mean()
>>> np.median(arr)
>>> arr.sum()
>>> arr.std()
>>> arr.max()
>>> arr.min()
>>> arr.sort()
>>> arr.argmin()    # index of min
>>> arr.argmax()    # index of max
>>> div_by_3.all()
>>> div_by_3.any()
>>> div_by_3.sum()
>>> div_by_3.nonzero()
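The methods above are listed without data, so here is a runnable sketch. The contents of `arr` are invented, and `div_by_3` is assumed to be a boolean mask marking the elements divisible by 3, as its name suggests.

```python
import numpy as np

arr = np.array([8, 3, 9, 1, 6])
print(arr.mean())                    # 5.4
print(np.median(arr))                # 6.0
print(arr.sum())                     # 27
print(arr.max(), arr.min())          # 9 1
print(arr.argmin(), arr.argmax())    # 3 2

div_by_3 = (arr % 3 == 0)            # boolean mask, one entry per element
print(div_by_3.all())                # False: not every element is divisible
print(div_by_3.any())                # True: at least one element is
print(div_by_3.sum())                # 3: True counts as 1
print(div_by_3.nonzero())            # indices where the mask is True

arr.sort()                           # sorts in place
print(arr)                           # [1 3 6 8 9]
```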
Element by element functions
>>> np.add(x1, x2)
>>> np.subtract(x1, x2)
>>> np.absolute(x)
>>> np.multiply(x1, x2)
>>> np.divide(x1, x2)
>>> np.logical_and(x1, x2)
>>> np.logical_or(x1, x2)
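A small sketch of these element-by-element functions on concrete arrays follows; the input values are arbitrary. Each binary function takes two inputs and combines them position by position.

```python
import numpy as np

x1 = np.array([1, -2, 3])
x2 = np.array([4, 5, -6])

print(np.add(x1, x2))         # [ 5  3 -3]
print(np.subtract(x1, x2))    # [-3 -7  9]
print(np.multiply(x1, x2))    # [  4 -10 -18]
print(np.absolute(x1))        # [1 2 3]
print(np.divide(x1, x2))      # element-wise true division, gives floats

p = np.array([True, True, False])
q = np.array([True, False, False])
print(np.logical_and(p, q))   # [ True False False]
print(np.logical_or(p, q))    # [ True  True False]
```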
Indexing in NumPy
• The items of an array can be accessed and assigned in the
same way as Python sequences (lists) :
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[0], a[2], a[-1]
(0, 2, 9)
• In a 2D array, the 1st dimension represents rows and the 2nd represents
columns :
>>> b[2, 1]    # third row, second column

• For multidimensional arrays, indices are tuples of integers :
a = c[1, 2, 3]
or
a = c[(1, 2, 3)]
Slicing Arrays in NumPy
• We use the term 'rank' to represent the number of dimensions of an array.
• Suppose we have a rank 2 array of shape (3, 4) :
>>> a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Use slicing to pull out the subarray consisting of the first 2 rows and
# columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
>>> b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
>>> print(a[0, 1])    # prints "2"
>>> b[0, 0] = 77      # b[0, 0] is the same piece of data as a[0, 1]
>>> print(a[0, 1])    # prints "77"
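Because a slice shares its data with the original array, it can help to see the contrast with an explicit copy. The sketch below uses the same array as above; the names `view` and `independent` are chosen for this example.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]                  # slicing returns a view of a
view[0, 0] = 77                    # writes through to a[0, 1]
print(a[0, 1])                     # 77

independent = a[:2, 1:3].copy()    # an explicit copy owns its own data
independent[0, 0] = -1             # does not touch a
print(a[0, 1])                     # 77

print(np.shares_memory(a, view))          # True
print(np.shares_memory(a, independent))   # False
```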
Advanced Indexing
• There are two types of advanced indexing in NumPy :
– Integer indexing
– Boolean indexing
• Integer array indexing allows you to construct arbitrary arrays using
the data from another array :
>>> a = np.array([[1,2], [3, 4], [5, 6]])    # creates an array of shape (3, 2)
>>> print(a[[0, 1, 2], [0, 1, 0]])           # prints "[1 4 5]"

• Boolean array indexing lets you pick out the elements that satisfy a condition :
>>> a = np.array([[1,2], [3, 4], [5, 6]])
>>> bool_idx = (a > 2)    # Find the elements of a that are bigger than 2;
>>> print(bool_idx)       # Prints "[[False False]
                          #          [ True  True]
                          #          [ True  True]]"
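The boolean mask becomes useful once it is used as an index. The sketch below continues with the same array and shows selection and assignment through a mask.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Integer array indexing: picks a[0, 0], a[1, 1] and a[2, 0].
print(a[[0, 1, 2], [0, 1, 0]])   # [1 4 5]

bool_idx = (a > 2)
selected = a[bool_idx]           # rank 1 array of the elements where the mask is True
print(selected)                  # [3 4 5 6]
print(a[a > 2])                  # same result in a single expression

a[a > 2] = 0                     # a mask can also be used for assignment
print(a)
```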
Matplotlib

• Matplotlib is a 2D plotting package. We can import
its functions as below:
>>> import matplotlib.pyplot as plt
• 1D plotting :
>>> x = np.linspace(0, 3, 20)
>>> y = np.linspace(0, 9, 20)
>>> plt.plot(x, y) # line plot
>>> plt.plot(x, y, 'o') # dot plot
>>> plt.show() # shows the plot
Matplotlib
• 2D plotting (such as image) :

>>> image = np.random.rand(30, 30)


>>> plt.imshow(image, cmap=plt.cm.gray)
>>> plt.colorbar()
>>> plt.show()
Matplotlib
• 3D plotting :
Matplotlib has only 2D display. The mplot3d toolkit adds simple 3D plotting
capabilities to matplotlib by supplying an axes object that can create a 2D projection
of a 3D scene. The resulting graph will have the same look and feel as regular 2D
plots.

One can rotate the 3D scene by simply
clicking-and-dragging the scene.

If you want to create advanced 3D scenes,
you can use MayaVi2, which is a very powerful
and full-featured 3D graphics library.
Mplot3d Example Code
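The example code from the slide is missing in this text, so the following is a stand-in sketch rather than the original. It assumes NumPy and Matplotlib are installed, uses the off-screen "Agg" backend so it runs without a display, and plots an arbitrary surface chosen for illustration.

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")            # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older versions

# Arbitrary example surface: z = sin(r), with r the distance from the origin.
x, y = np.meshgrid(np.linspace(-4, 4, 40), np.linspace(-4, 4, 40))
z = np.sin(np.sqrt(x**2 + y**2))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")    # axes object supplied by mplot3d
ax.plot_surface(x, y, z, cmap=plt.cm.viridis)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")

buf = io.BytesIO()
fig.savefig(buf, format="png")           # render the 2D projection to memory
print(buf.getbuffer().nbytes > 0)
```

To rotate the scene with the mouse, remove the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving.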
OPENCV INTRODUCTION
What is OpenCV
• OpenCV is an image processing library created by Intel and later maintained
by Willow Garage.
• Available on Mac, Windows, Linux
• Available for C, C++, and Python
• Open source and free
• In Python, all the OpenCV array structures are converted to and from NumPy arrays.
Color Spaces in OpenCV
• The purpose of a color model (color space) is to facilitate specification of colors
in some standard way.
• Each color model specifies a coordinate system and a subspace within it, in which
each color is represented by a single point.

• RGB (red, green, blue) model is for color monitors and a broad class of color
video cameras.
• CMY (cyan, magenta, yellow) and CMYK (cyan, magenta, yellow, black) models
are for color printing.
• HSI (hue, saturation, intensity) model corresponds closely with the way
humans describe and interpret color. The HSI model also has the advantage that
it decouples the color and gray-scale information in an image, making it suitable
for many of the gray-scale techniques.
Color Spaces in OpenCV
• Color space, also known as the color model, is an abstract mathematical model which simply
describes the range of colors as tuples of numbers.
• Some of the basic color spaces available in OpenCV are :
– BGR : In BGR, an image is treated as an additive result of three base colors (blue, green and red). The BGR ranges (for 8-bit images) are:
• 0 ≤ B ≤ 255
• 0 ≤ G ≤ 255
• 0 ≤ R ≤ 255
Color Spaces in OpenCV
– HSV : HSV stands for Hue, Saturation and Value (Brightness). We can say that HSV is a
rearrangement of RGB in a cylindrical shape, where HUE gives the angular dimension.
– SATURATION: It tells how saturated a colour is. For example, a crimson red will have a higher
saturation value than a light red.
– VALUE: It is a component which varies with the lighting conditions. A particular colour placed in
bright light will have a higher VALUE component than the same colour placed in dimmer
lighting.
– The HSV ranges are:
• 0 ≤ H < 360 ⇒ OpenCV range = H/2 (0 ≤ H < 180)
• 0 ≤ S ≤ 1 ⇒ OpenCV range = 255*S (0 ≤ S ≤ 255)
• 0 ≤ V ≤ 1 ⇒ OpenCV range = 255*V (0 ≤ V ≤ 255)
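The range mapping above is plain arithmetic, so it can be sketched without OpenCV. The helper `hsv_to_opencv` is a made-up name for this example and is not an OpenCV function; it assumes hue in degrees and saturation and value as fractions.

```python
def hsv_to_opencv(h_degrees, s, v):
    """Convert HSV (H in degrees, S and V in [0, 1]) to OpenCV's 8-bit scale."""
    h_cv = int(h_degrees / 2) % 180   # hue is halved so it fits in 0..179
    s_cv = int(round(s * 255))        # saturation scaled to 0..255
    v_cv = int(round(v * 255))        # value scaled to 0..255
    return h_cv, s_cv, v_cv

print(hsv_to_opencv(120, 1.0, 1.0))   # (60, 255, 255), pure green
print(hsv_to_opencv(240, 0.5, 0.2))   # (120, 128, 51)
```

This is handy when a colour picker reports hue in degrees but a threshold has to be written in OpenCV's 8-bit units.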
Basic Operations in OpenCV
• Reading an image
• Displaying an Image
• Writing an Image
• Reading a video stream
• Pixel operations
• Thresholding
Reading an image
• Use the function cv2.imread() to read an image.
• The image should be in the working directory or a full path of image
should be given in the first argument.
• Second argument is a flag which specifies the way image should be read.
– cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be
neglected. It is the default flag.
– cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode
– cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel
– Instead of these three flags, you can simply pass integers 1, 0 or -1 respectively.
• eg.:
>>> import numpy as np
>>> import cv2

# Load a color image in grayscale
>>> img = cv2.imread('messi5.jpg', 0)
Displaying an Image
• Use the function cv2.imshow() to display an image in a
window. The window automatically fits to the image size.
• First argument is a window name which is a string.
• Second argument is our image.
• You can create as many windows as you wish, but with
different window names.
>>> cv2.imshow('image',img)
>>> cv2.waitKey(0)
>>> cv2.destroyAllWindows()
Displaying an Image
• cv2.waitKey() is a keyboard binding function. Its argument is the time in
milliseconds.
• The function waits for specified milliseconds for any keyboard event.
• If you press any key in that time, the program continues. If 0 is passed, it
waits indefinitely for a key stroke.
• It can also be set to detect specific key strokes, like:
k = cv2.waitKey(0)
if k == 27:    # ESC key pressed
    cv2.destroyAllWindows()

• If you are using a 64-bit machine, you will have to modify the k = cv2.waitKey(0)
line to k = cv2.waitKey(0) & 0xFF
• cv2.destroyAllWindows() simply destroys all the windows we created.
– If you want to destroy any specific window, use the function
cv2.destroyWindow() where you pass the exact window name as the argument.
Writing an Image
• Use the function cv2.imwrite() to save an image.
• First argument is the file name.
• Second argument is the image you want to save.

>>> cv2.imwrite('messigray.png',img)
Capturing Video from Camera
• You need to create a VideoCapture object.
• Its argument can be either the device index or the name of a video file.
• Device index is just the number to specify which camera. Normally one
camera will be connected. So pass 0 (or -1).
• You can select the second camera by passing 1 and so on.
Playing Video from File
Pixel Operations
• You can access a pixel value by its row and column coordinates.
• For BGR image, it returns an array of Blue, Green, Red values.
• For grayscale image, just corresponding intensity is returned.
>>> px = img[100,100]
>>> print(px)
[157 166 200]

# accessing only blue pixel
>>> blue = img[100,100,0]
>>> print(blue)
157
• You can modify the pixel values the same way.
>>> img[100,100] = [255,255,255]
>>> print(img[100,100])
[255 255 255]
Splitting and Merging Channels
• Sometimes you will need to work separately on B,G,R channels
of image.
>>> b,g,r = cv2.split(img)
>>> img = cv2.merge((b,g,r))
• or, using NumPy indexing,
>>> b = img[:,:,0]

• Suppose you want to set the red channel of every pixel to zero:
>>> img[:,:,2] = 0
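Since OpenCV images in Python are NumPy arrays, the pixel and channel operations above can be tried on a synthetic array without OpenCV at all. This sketch assumes only NumPy; the uniform image filled with (157, 166, 200) is invented to mirror the values printed earlier, and `np.dstack` stands in for `cv2.merge`.

```python
import numpy as np

# Stand-in for an image returned by cv2.imread: rows x columns x (B, G, R).
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:] = (157, 166, 200)

print(img[100, 100])        # [157 166 200]
print(img[100, 100, 0])     # 157, the blue value only

# Split into channels by slicing, then stack them back into one image.
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]
merged = np.dstack((b, g, r))
print(np.array_equal(merged, img))   # True

img[:, :, 2] = 0            # set the red channel to zero everywhere
print(img[100, 100])        # [157 166   0]
```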
Additional Resources:

• https://docs.opencv.org/3.1.0/d6/d00/tutorial_py_root.html - OpenCV
• https://python-course.eu/ - Python
• Digital Image Processing – Rafael C. Gonzalez, Richard E. Woods
