Markus Vincze
Automation and Control Institute, Technische Universität Wien, vincze@acin.tuwien.ac.at
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Grouping
Industrial Projects
FESTO Checkbox: inspection of parts and part recognition
AVL Engine Videoscope: temperature measurement in the engine during operation
IAEA/FAO: separation of male and female tsetse flies to fight sleeping sickness
Rauscher: inspection of tampons, quality control
Prototypes
Holzer Training-Optimisation-System (TOS)
Research Projects
CARAK: measuring the 3D shape of the retina
FlexPaint: spray painting of arbitrary industrial parts
FibreScope: automatic finding of bore holes
REDUX: following the seam of carbon mats (EADS)
Research Projects
RobVision: navigation in ships
ActIPret: interpretation of a human handling objects
MOVEMENT: reliable vision indoors
Kognitives Sehen (cognitive vision): understanding seeing, hide and seek
XPERO: learning by experimentation
RobVision
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Grouping
Computer Vision
Computer Vision is a subfield of AI concerned with processing of images from the real world. Purpose: program a computer to "understand" a scene or features Methods: detection, segmentation, tracking, pose estimation, mapping to 3D model, recognition of objects (e.g., human faces) Achieved by means of pattern recognition, statistical learning, projective geometry, image processing, graph theory and other fields.
[1] Dana H. Ballard, Christopher M. Brown, (1982) Computer Vision (2nd edition), Prentice Hall.
Pattern Recognition
"the act of taking in raw data and taking an action based on the category of the data" [1] Methods: statistics, machine learning, ... Problem: how to determine category of data
[1] Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern classification (2nd edition), Wiley, New York.
[1] Rafael C. Gonzalez, Richard E. Woods (2002) Digital Image Processing (2nd edition), Prentice Hall.
Machine Vision
Machine vision is the application of computer vision to factory automation. A machine vision (MV) system is a computer that makes decisions based on the analysis of digital images.
Figure: Components of a machine vision system: light, lighting, media, object, reflection, sensor, image processing, data, results, control.
Machine Vision
Problem 1: narrow applications
Problem 2: understanding?
Lesson learned: more options to control
Lesson learned: consider the complete system
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Grouping
  Stereo images
Many different influencing factors
Complete modelling is not yet possible
Solutions only under very constrained conditions
Electromagnetic Spectrum
[Encarta]
Radiometry Photometry
Radiometry is the science of the measurement of illumination (light, electromagnetic radiation)
Photometry includes the aspect of how light appears to the human observer
Photometric units are radiometric units weighted by the response of the human eye
The average human eye response was standardized in trials by the CIE (Commission Internationale de l'Eclairage)
Standard Observer
Fig.: Spectral luminous efficiency function of the human eye as defined by the CIE for night vision (1951) and daylight [B1.17]
Units: lumen (lm), watt (W), candela (cd), steradian (sr), lux (lx)
Fig.: Spectral sensitivity of the human eye as defined by the CIE for daylight, 1931, 1964
Φv weighs the radiant flux according to the luminous efficiency function V(λ) of the standard observer:

Φv = 683 · ∫[380 nm, 750 nm] V(λ) Φe(λ) dλ   [lm]

Φe = ∫[0, ∞] Φe(λ) dλ   [W]
Example: a source with a radiant flux of 1 W at a wavelength of 555 nm emits 683 lumen
Question: how many lumens does an infrared 0.7 mW LED emit?
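The lumen calculation above can be sketched numerically. This is a minimal pure-Python illustration, not a full implementation: the `V` table below holds only a handful of rounded sample values of the CIE photopic luminous efficiency function (the real CIE table is densely tabulated), and the function name is ours.

```python
import math

# Approximate CIE photopic luminous efficiency V(lambda) at a few wavelengths.
# Values are rounded and purely illustrative; 880 nm (infrared) is invisible.
V = {450e-9: 0.038, 555e-9: 1.000, 650e-9: 0.107, 700e-9: 0.0041, 880e-9: 0.0}

def luminous_flux(radiant_flux_w, wavelength_m):
    """Luminous flux in lumen for a monochromatic source: 683 * V(lambda) * Phi_e."""
    return 683.0 * V.get(wavelength_m, 0.0) * radiant_flux_w

print(luminous_flux(1.0, 555e-9))     # 683.0 lm for 1 W at 555 nm
print(luminous_flux(0.7e-3, 880e-9))  # 0.0 lm: infrared carries no luminous flux
```

This also answers the quiz question: since V(λ) is essentially zero in the infrared, the 0.7 mW IR LED emits about 0 lumen.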
Solid angle (Raumwinkel) of the full unit sphere: 4π. Fig.: Definition of the solid angle in steradians [sr] [B1.8]
Irradiance (Bestrahlungsstärke) Ee
Amount of radiation falling onto a surface [W/m²]
Illuminance (Beleuchtungsstärke) Ev
Visible illumination; unit: lux [lx]
Radiance / Luminance
Radiance (Strahldichte) Le: emitted intensity per unit area (e.g. from an object) [W/m²/sr]
Luminance (Leuchtdichte) Lv: response of the human eye or a sensor to radiance [cd/m²]
Light Source
Point source
Fore-shortening
Fig.: The effect of fore-shortening is handled using the cosine of the tilt angle [H10.3].
Characteristics of Surfaces
Lambertian surface
Appears equally bright from all viewing directions All incident light is reflected
Specular surface
Mirror: all incident light is reflected into a single direction
Albedo
Coefficient between 0 and 1 indicating how much light a surface reflects relative to an ideal surface with no absorption
Ideal Lambertian surface: albedo = 1
Reflection Models
Geometry of a Scene
Image Formation
Ee = Le · (π/4) · (d/f)² · cos⁴α

d ... aperture diameter, f ... focal length
Assumption: α is small → Ee ≈ const · Le
Le ... radiance
Fig.: Relation between image irradiance Ee and radiance of the surface Le, taking into account the size of regions in the image corresponding to the patch on the object surface [H10.4].
shape of surface
Orthographic Projection
Normal or parallel projection: x = X, y = Y. A simplification when the object is far away.
Perspective Projection
Pinhole model (Lochmodell) of a camera

u = f·x / z
v = f·y / z
Fig.: Pinhole model.
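The pinhole equations above are a two-line computation. A minimal sketch (function name and example numbers are ours, not from the slides):

```python
def project(x, y, z, f):
    """Pinhole model: project a 3D point (camera frame, z = optical axis)
    onto the image plane at focal length f: u = f*x/z, v = f*y/z."""
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return f * x / z, f * y / z

# An 8 mm lens looking at a point 2 m away, offset 20 cm right and 10 cm up.
u, v = project(0.2, 0.1, 2.0, 0.008)
print(u, v)  # ≈ (0.0008, 0.0004): image-plane coordinates in metres
```

Note that depth z divides out of both coordinates, which is exactly why a single image cannot recover absolute scale.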
pc = R pi + p
Pose estimation: possible from 3 (or more) points or lines of known object size
Lenses
Gaussian lens equations: 1/f = 1/g + 1/b and B/G = b/g
Focal length: f = g·B / (G + B)
(G ... object size, g ... object distance, B ... image size, b ... image distance)
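The lens equations can be checked with a short computation. A minimal sketch assuming the symbol conventions above (function names and the 50 mm / 1 m example are ours):

```python
def image_distance(f, g):
    """Image distance b from the Gaussian lens equation 1/f = 1/g + 1/b."""
    return 1.0 / (1.0 / f - 1.0 / g)

def magnification(g, b):
    """Lateral magnification B/G = b/g."""
    return b / g

# 50 mm lens focused on an object 1 m away.
b = image_distance(0.050, 1.0)
print(b)                    # ≈ 0.0526 m: sensor sits slightly behind the focal plane
print(magnification(1.0, b))  # ≈ 0.053: the image is about 1/19 of object size
```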
Depth of Field
Is increased by:
Small aperture
Short focal length
Large object distance
Fig.: Field of view angle [DBS].
Image Blur
Blur circle (Unschärfekreis): u = d · (b' − b) / b'
d ... aperture opening, b ... in-focus image distance, b' ... actual image distance
Illumination
Dark field
Bright field, diffuse: bell or ring light
Backlight (through light): for transparent objects
Cast shadow
CCD Area Sensor (Flächensensor)
Colour Cameras
3-chip camera
[DCO]
1-chip camera
Bayer colour filter mask (Bayer Farbmaske)
[DCO]
Aliasing
Artifacts at colour transitions: colour and intensity edges at different locations
[DCO]
original image
even field
[DCO]
Images
Information in images: Intensity, Colour, multispectral images, depth/distance/disparity, ...
Fig.: Intensity image (left) and range/depth image extracted from three intensity images [PointGrey]
Fig.: Image pyramid (Bildpyramide): the images above shown at the same size [G. Sandini]
Space-variant Images
Fig.: Images where resolution and field of view change so that each image contains the same number of pixels
[IBIDEM retina]
Advantage: similar to characteristics of human eye Efficient coding, e.g. video conferencing (64 kPixel)
Fig.: Transformation of a log-polar image into a uniform, rectangular pixel tessellation [G. Sandini]
Fig.: When focusing on the vanishing point the radial lines (floor and ceiling to walls) project to horizontal lines [Peters, Bishay, 1996].
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Grouping
Segmentation
Partitioning the image into its constituent parts Constituent parts depend on the task
Detect object, object class, foreground/background
Gaussian Convolution
Better smoothing than mean
G(x, y) = 1/(2πσ²) · exp(−(x² + y²) / (2σ²))

σ=1: 5×5 kernel
σ=2: 9×9 kernel
σ=4: 15×15 kernel
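The kernels above can be built directly from the formula. A minimal pure-Python sketch (function names are ours; a real system would use an optimized library routine):

```python
import math

def gaussian_kernel(sigma, size):
    """size x size sampled Gaussian, normalised so the weights sum to 1."""
    c = size // 2
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]

def convolve(img, k):
    """Valid-mode 2D convolution (no padding) of a grayscale image."""
    n = len(k)
    h, w = len(img) - n + 1, len(img[0]) - n + 1
    return [[sum(img[y + j][x + i] * k[j][i]
                 for j in range(n) for i in range(n))
             for x in range(w)] for y in range(h)]

k = gaussian_kernel(1.0, 5)
print(sum(map(sum, k)))  # 1.0 (up to rounding): smoothing preserves brightness
```

Because the weights sum to one, a constant image stays constant; unlike the mean filter, the Gaussian weights nearby pixels more strongly, which is why it smooths better.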
Original image
Median Filter
Salt-and-pepper noise (5%)
Eliminates outliers (up to 50%)
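The median's robustness to salt-and-pepper outliers is easy to demonstrate. A minimal sketch (border pixels are simply copied here; the function name is ours):

```python
def median_filter(img, size=3):
    """Replace each interior pixel by the median of its size x size
    neighbourhood; outliers never reach the median position."""
    r = size // 2
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = sorted(img[y + j][x + i]
                            for j in range(-r, r + 1) for i in range(-r, r + 1))
            out[y][x] = window[len(window) // 2]
    return out

noisy = [[50] * 5 for _ in range(5)]
noisy[2][2] = 255  # a single "salt" pixel
print(median_filter(noisy)[2][2])  # 50: the outlier is removed
```

A mean filter would instead smear the 255 into the neighbourhood; the median discards it entirely, which is why it copes with up to about half the window being corrupted.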
Edge Detection 1D
Signal 1D = input
First derivative = gradient
Second derivative (Laplace operator)
Perfect edge
Edge Detection 2D
Fast: Roberts cross
Strength: G = √(Gx² + Gy²), |G| ≈ |Gx| + |Gy|
Angle: θ = arctan(Gy / Gx) − 3π/4
Gx = ∂I/∂x
Gy = ∂I/∂y
Edge strength
Angle: θ = arctan(Gy / Gx)
Prewitt:
Fig.: Edge strength. Fig.: Original image and after Sobel.
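The Sobel gradient computation can be sketched in a few lines. The masks are the standard Sobel masks; the per-pixel function and the toy image are ours:

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel(img, y, x):
    """Gradient strength and angle at interior pixel (y, x)."""
    gx = sum(SOBEL_X[j][i] * img[y - 1 + j][x - 1 + i]
             for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * img[y - 1 + j][x - 1 + i]
             for j in range(3) for i in range(3))
    return math.hypot(gx, gy), math.atan2(gy, gx)

# Vertical step edge: dark left half, bright right half.
img = [[0, 0, 100, 100] for _ in range(4)]
strength, angle = sobel(img, 1, 1)
print(strength, angle)  # 400.0, 0.0: strong response, gradient points right
```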
Canny
Optimised, standard method
Good compromise
Thin, 1 pixel wide edge (ridge)
Smoothing can eliminate detail
Fig.: σ=1, t1=255, t2=1
65
Canny - Examples
Properties
Y-Effect: 3 edges meeting in a point are not connected Adaptive: detail and edge elements, but image dependent
σ=1, t1=255, t2=220
σ=2, t1=255, t2=1
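The two thresholds t1 and t2 in these examples implement Canny's hysteresis: strong responses are kept, weak ones only if they connect to a strong one. A minimal 1D sketch of that idea (the 2D version tracks connectivity over 8-neighbourhoods; names and the sample array are ours):

```python
def hysteresis(strengths, t_high, t_low):
    """Keep responses >= t_high, plus responses >= t_low that are
    connected (here: adjacent in 1D) to an already-kept response."""
    keep = [s >= t_high for s in strengths]
    changed = True
    while changed:  # propagate acceptance along chains of weak responses
        changed = False
        for i, s in enumerate(strengths):
            if not keep[i] and s >= t_low:
                if (i > 0 and keep[i - 1]) or (i + 1 < len(strengths) and keep[i + 1]):
                    keep[i] = changed = True
    return keep

print(hysteresis([10, 120, 90, 80, 10, 95], t_high=100, t_low=60))
# [False, True, True, True, False, False]: the isolated 95 is dropped
```

This is why lowering t2 (as in the σ=2 example) keeps long weak continuations of strong edges without admitting isolated weak responses.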
LoG(x, y) = −1/(πσ⁴) · (1 − (x² + y²)/(2σ²)) · exp(−(x² + y²)/(2σ²))
LoG Example
Edges are zero crossings Ideal case: closed curves
Zero crossings at σ=3
σ=1, strong zero crossings (difference to neighbours > 40)
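The zero-crossing idea is easiest to see in 1D, where the Laplacian reduces to the second difference. A minimal sketch (function names and the ramp signal are ours):

```python
def second_derivative(signal):
    """Discrete second derivative in 1D: f(x-1) - 2 f(x) + f(x+1)."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def zero_crossings(d2):
    """Indices where the second derivative changes sign: edge locations."""
    return [i for i in range(len(d2) - 1) if d2[i] * d2[i + 1] < 0]

ramp_edge = [0, 0, 0, 30, 100, 100, 100]
print(zero_crossings(second_derivative(ramp_edge)))  # [2]: one edge, mid-ramp
```

The second derivative is positive at the foot of the ramp and negative at its shoulder, so the sign change pins the edge to a single location, which is what gives LoG edges their (ideally) closed, one-pixel-thin contours.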
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Region detection
  Grouping
Histogram
Fig.: Histograms (number of pixels vs. intensity) with thresholds t1 and t2
t1=150
t1=130, t2=150
Regions (blobs)
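Histogram-based thresholding as in the examples above fits in a few lines. A minimal sketch using the same two-threshold convention (function names and the toy image are ours):

```python
def histogram(img, bins=256):
    """Count how many pixels fall into each intensity value."""
    h = [0] * bins
    for row in img:
        for v in row:
            h[v] += 1
    return h

def threshold(img, t1, t2=255):
    """Binary segmentation: a pixel belongs to a region if t1 <= value <= t2."""
    return [[1 if t1 <= v <= t2 else 0 for v in row] for row in img]

img = [[20, 20, 140], [20, 145, 145], [200, 200, 200]]
print(threshold(img, t1=130, t2=150))
# [[0, 0, 1], [0, 1, 1], [0, 0, 0]]: only the mid-grey band survives
```

The thresholds t1 and t2 are typically chosen at valleys of the histogram; the resulting binary regions (blobs) are then labelled by connected components.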
Thresholds - Examples
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Region detection
  Grouping
Appearance based
Interest points or entire object
Gestalt Principles
Regard structure in data and features Perceptual grouping
Perceptual Grouping
Idea: exploit all the known structure in the data
Learn what objects are (what they mean for the overall system), rather than specific objects
Uses:
  Gestalt principles
  built-in knowledge
  levels of abstraction
Model-based Recognition
Fig.: Model-based recognition: image → edges → groupings → model → reprojections [Lowe 87]
Non-Accidentalness
Helmholtz principle: non-accidental groupings of features are perceived as Gestalts
The less likely a grouping arises from randomness, the more likely an underlying cause
Low likelihood of a random grouping = high significance
Poisson Process
Possible outcomes of a trial are 0 and 1
Trials are independent
Expected number of occurrences in a given interval = λ
Probability that there are exactly k occurrences:
P(k) = λᵏ · e^(−λ) / k!
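The Poisson probability mass function above can be evaluated directly; this is a minimal sketch (the function name and the λ=2 example are ours):

```python
import math

def poisson_pmf(k, lam):
    """P(exactly k occurrences) = lam**k * exp(-lam) / k!"""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# With an expected count of 2 points in a region, observing many points
# there is unlikely, which is what makes a dense cluster "non-accidental".
print(poisson_pmf(2, 2.0))   # most likely counts sit near lambda
print(poisson_pmf(10, 2.0))  # far above lambda: very small probability
```

In the non-accidentalness argument, this small tail probability is the significance measure: k points landing in a small circle under a uniform Poisson background is so unlikely that an underlying cause (a real grouping) is the better explanation.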
Poisson Process in 2D
Randomly distributed points
Poisson Process in 2D
Adding some points
Poisson Process in 2D
How non-accidental are k points in the circle?
Ellipse Detection
Joint work of
Michael Zillich and Jiri Matas, CMP, Prague
Exploit local information where possible Take any edge segmented image, e.g. Canny Minimal number of parameters
Approach
1. Connected edgels 2. Local shape: fit circular arcs 3. Geometric constraints: convexity 4. Fit Ellipses
Canny Edges
Arcs
Comparing three methods: SPLIT, GROW, RANSAC
SPLIT
GROW
RANSAC
Convex Arcs
convex
non-convex
Convexity
Inner side of arc
Voted Arcs
Depends strongly on length of tangent/normal
Limited length
Unlimited length
Growing Groups
Extend groups of arcs to form ellipse hypotheses
Exhaustive search
Greedy search with heuristics:
  Co-curvilinearity
  Ellipse centres
  Fit quality
17 groups
1 group
Yuen et al., 1989: 3 points on the ellipse (P, Q, R) + tangents → centre (C)
Ellipse fit, relative support
Runtime
Approach
1. Connected edgels 2. Local shape: fit circular arcs 3. Geometric constraints: convexity 4. Fit Ellipses
- B2AC [Fitzgibbon, Pilu, Fisher 1999], OpenCV
All Ellipses
Good Ellipses
More Structure
Lines, arcs, ellipses, parallelities, continuations, junctions, ... → higher-level shapes, objects, symbolic?
Vocabulary (Prolog); perceptual grouping + symbolic reasoning → sentences
Rectangle
line(A,B) :- left(A,X), right(B,X).
rect(A,B,C,D) :- line(A,B), line(B,C), line(C,D), line(D,A).
Flap
Book Scene
Original image
3847 edges
First 2 rectangles
All rectangles
Office Scene
Shelves scene
Cylinder
cyl(E,A,B) :- tangent_l(E,A), tangent_r(E,B), parallel(A,B).
Cup Scene
Original image
Edges
First cylinder
Kitchen Scene
Original image
First 3 cylinders
Problem
VS2 Example
Vision System 2 tool for grouping
Finding Closures
Closed Polygon
Any size
Real-World Example
Segmentation
Scenario - Exit
Test on a door in an office: 378 segments, 86 closures, 35 stacks
Example of two columns: original image, window, arch closures, best arch
Improved Tracking
Interest point based Lucas-Kanade algorithm Delaunay representation Local affine assumption
Spatial Reasoning
Occlusion Detection Occlusion Reasoning
Occlusion Handling
Only closures; T-junctions indicate occlusion
Occlusion indicates depth ordering of closures
More occlusion → more closures ordered
Future work:
  Exploitation of motion
  Estimate plane normals
Conclusion
Same method to detect different shapes
Exploit local information (smoothness, ...)
Avoid local decisions and early pruning of hypotheses (thresholds)
Only threshold: good ellipses
Use ranking
Perceptual grouping = well-defined pruning of hypotheses → reduced search space
Image space indexing: O(n²) → O(n)
Further Work
Difficult to find a good local quality measure
Global measure: combinatorial search problem
Interaction with scene, mobile observer: false positives vs. stable percepts
Used to initialise tracking
Edges
Arcs
Detection
Tracking
Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation
  Edge detection
  Region detection
  Grouping
Fig.: An architecture for learning by experimentation: observe; match observations and predictions; educated guess (time series analysis, reasoning by analogy); knowledge base (a priori/existing knowledge, innate knowledge); intelligent data collection / experimental feedback (may not involve the generation of new hypotheses). Partners: BRS, ULJ, AUP, TUW, UVR.
Showcase
Robot needs to know:
  Where am I? Where are relevant entities (objects)?
Robot pose:
  Egomotion relative to the environment
  Odometry, vision
Object pose:
  Basic features to describe objects
  Blobs: centre, size
  Shape: cube, ...
Occlusion handling
Tracking and depth ordering
Ontology to label
Fig.: Detecting proto-objects from an edge image: (b) closures, (c) proto-objects.
Needs abstraction techniques: arbitrary parts can take a role in a concept (e.g., being the support of another part)
Segmentation in itself is impossible; it depends on the task. Cameras are cheap, but processing is difficult.
Avoid thresholds: robustness means being parameter-free. A lot can be done with a camera and a PC.
EU Project robots@home
Milestones:
2007: start
2008: learn one room, touch screen
2009: learn the entire apartment
2010: navigate in 4 apartments
Classify objects according to function (table, sofa, ...) Using stereo and Time-of-flight cameras
Recognising Furniture
Using vanishing points and 3D; grouping surfaces using Gestalt principles
Thesis Topics
Ontology for furniture
Any-door detector, tried in many homes
Shape and shade (Make3D)
Poke an object and predict what will happen
Thank you