
Machine Vision

Markus Vincze
Automation and Control Institute, Technische Universität Wien, vincze@acin.tuwien.ac.at

Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, grouping

Application: XPERO robot learning

Industrial Projects
FESTO Checkbox (inspection and recognition of parts)
AVL Engine Videoscope (temperature measurement in the engine during operation)
IAEA/FAO Separation of male and female tsetse flies to fight sleeping sickness
Rauscher Inspection of tampons, quality control

Prototypes
Holzer Training-Optimisation-System (TOS)

Research Projects
CARAK Measuring the 3D shape of the retina
FlexPaint Spray painting arbitrary industrial parts
FibreScope Automatic finding of bore holes
REDUX Following the seam of carbon mats (EADS)

Research Projects
RobVision Navigation in ships
ActIPret Interpretation of a human handling objects
MOVEMENT Reliable vision indoors
Kognitives Sehen Understanding to see, hide and seek
XPERO Learning by experimentation

RobVision EU Project 1998 - 2001

3D (6 DoF) Object Tracking


Model-based: lines, ellipses
Real time: object pose estimated at 10 Hz
Robustness: integration of model cues and image cues

RobVision

RobVision: Vision Based Navigation

RobVision

Video: Navigation in Mock-up

Vision for Robotics (V4R Tracking Tool)

Robot Navigation in Office


Object Recognition and Tracking


Interest points and local description of the object's appearance
Codebook as compact storage of many objects
Recognition: relation of interest points to the object centre

Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, grouping

Application: XPERO robot learning

Computer Vision
Computer Vision is a subfield of AI concerned with processing of images from the real world. Purpose: program a computer to "understand" a scene or features Methods: detection, segmentation, tracking, pose estimation, mapping to 3D model, recognition of objects (e.g., human faces) Achieved by means of pattern recognition, statistical learning, projective geometry, image processing, graph theory and other fields.
[1] Dana H. Ballard, Christopher M. Brown (1982) Computer Vision, Prentice Hall.

Pattern Recognition
"the act of taking in raw data and taking an action based on the category of the data" [1] Methods: statistics, machine learning, ... Problem: how to determine category of data

[1] Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern classification (2nd edition), Wiley, New York.


Image Processing (Bildverarbeitung)


Image-to-image processing [1]
Subarea of computer / machine vision

Figs.: Recognising number plates; the 7 spectral bands of NASA LANDSAT

[1] Rafael C. Gonzalez, Richard E. Woods (2002) Digital Image Processing (2nd edition), Prentice Hall.


Machine Vision
= application of computer vision to factory automation
An MV system is a computer that makes decisions based on the analysis of digital images

Figure: Components of a machine vision system: light, lighting, media, object, reflection, sensor, image processing, data, results; control


Machine Vision
Problem 1: narrow applications
Problem 2: understanding?
Lesson learned: more options to control
Lesson learned: consider the complete system

Figure: Components of a machine vision system (as above)


Information searched for in Images


Determination of
Geometry: form, size
Position and orientation (pose)
Material properties: colour, texture, irregularities (defects)

Detect and recognise objects


Faces, persons, activities, ... Number plates, tables, mugs, industrial parts, ...


Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, grouping, stereo images

Application: MOVEMENT robot navigation

From Light Source to Image

Many different influencing factors
Complete modelling is not yet possible
Solutions only under very constrained conditions

Electromagnetic Spectrum

[Encarta]


Radiometry Photometry
Radiometry is the science of measuring illumination (light, electromagnetic radiation)
Photometry additionally considers how light appears to the human observer
Photometric units are radiometric units matched to the human eye
The average human eye was standardised in trials by the CIE (Commission Internationale de l'Éclairage)

Standard Observer

Fig.: Sensitivity of human colour vision [Padgham 1975]

Fig.: CIE Standard Observer curve [B1.19]



Day and Night Sight

Fig.: Spectral efficiency function of the human eye as defined by the CIE for night vision (scotopic, 1951) and daylight vision [B1.17]

Radiometric and Photometric Units

Units:
lm  Lumen
W   Watt = J/s
cd  Candela
sr  Steradian = m²/m²
lx  Lux

Radiant Flux Luminous Flux


Radiant flux (Strahlungsfluss) Fe: amount of radiation (photons) per second, in Watt [W]
Luminous flux (Lichtstrom) Fv: amount of radiation as perceived by the standard observer, in Lumen [lm]

Fig.: Spectral sensitivity of the human eye defined by the CIE for daylight, 1931, 1964

Radiant Flux Luminous Flux


Fe is the integral over the entire spectrum:

Fe = ∫ (dFe/dλ) dλ, integrated from 0 to ∞

Fv weighs the radiant flux with the spectral sensitivity V(λ) of the standard observer:

Fv = 683 · ∫ V(λ) (dFe/dλ) dλ, integrated from 380 to 750 nm

Example: a source with a radiant flux of 1 W at a wavelength of 555 nm emits 683 lumen.
Question: how many lumens does an infrared 0.7 mW LED emit?
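To make the weighting concrete, here is a small numeric sketch; the Gaussian stand-in for the standard-observer curve V(λ) is an illustrative assumption, not the tabulated CIE data:

import numpy as np

# Illustrative stand-in for the photopic curve V(lambda): a Gaussian
# centred at 555 nm (assumption; the real CIE curve is tabulated)
def V(lam_nm):
    return np.exp(-0.5 * ((lam_nm - 555.0) / 42.0) ** 2)

def luminous_flux_lm(lam_nm, radiant_flux_w):
    # Fv = 683 * V(lambda) * Fe for a monochromatic source
    return 683.0 * V(lam_nm) * radiant_flux_w

print(luminous_flux_lm(555.0, 1.0))     # ~683 lm, as on the slide
print(luminous_flux_lm(850.0, 0.7e-3))  # IR LED: practically 0 lm, V ~ 0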

Radiant Intensity Luminous Intensity

Radiant intensity (Strahlstärke): radiated power per unit solid angle [W/sr]
Luminous intensity (Lichtstärke): measured in the SI base unit Candela [cd]
1 [sr] = A / r² [m²/m²]; solid angle (Raumwinkel) of the full unit sphere: 4π

Fig.: Definition of the solid angle in steradians [sr] [B1.8]

Irradiance Illuminance

Irradiance (Bestrahlungsstärke) Ee: amount of radiation falling onto the inner surface of a unit sphere [W/m²]
Illuminance (Beleuchtungsstärke) Ev: visible illumination; unit: Lux [lx]

Fig.: Definition of the unit Lux [B1.21]

Examples for Illuminance


Lux       Description
100,000   Sunlight, noon
32,000    Cloudy, noon
2,000     Cloudy, 1 hour after sunset
600       Illumination in a supermarket
450       Average office illumination
175       Street light, at night
10        Candle at 20 cm distance, pocket lamp
0.3       Bright moonlight, clear sky

Radiance Luminance
Radiance (Strahldichte) Le: emitted intensity per unit area (e.g. from an object) [W/m²/sr]
Luminance (Leuchtdichte) Lv: response of the human eye or a sensor to radiance [cd/m²]

Fig.: Definition of luminance [B1.24]

Light Source
Point source

Fig.: Intensity distribution of point source in one plane [B1.9]


Fore-shortening

Fig.: Reduction of intensity [lux] with distance and angle [B1.22].

Fig.: Effect of fore-shortening is handled using the cosine of tilt angle [H10.3].


Characteristics of Surfaces
Lambertian surface
Appears equally bright from all viewing directions All incident light is reflected

Specular surface
Mirror, all incident light is reflected

Albedo
Coefficient between 0 and 1 indicating how much light a surface reflects relative to an ideal surface with no absorption
Ideal Lambertian surface: albedo = 1

Dichromatic Reflection Model


Combines the Lambertian and the specular model

Fig.: Body and surface reflection of the incident irradiance Ee(λ).

Reflection Models

Fig.: Summary of reflection models [Bajcsy]

Dichromatic Reflection Model


In general, radiance: Le(λ) = Ee(λ) · S(λ)
Dichromatic model: S(λ) = Ss(λ) · Gs(θe, φe) + Sb(λ) · Gb

The Lambertian geometry factor Gb is independent of the viewing angle

Fig.: Actual, typical reflectivity pattern [B3.7]

Geometry of a Scene

Fig.: Angles relative to object normal [H10.7].

Fig.: Definition of polar and azimuth angles [H10.6].



Image Formation
Ee = Le · (π/4) · (d/f)² · cos⁴α

Assumption: α is small ⇒ Ee ≈ const · Le

Le ... radiance; d ... aperture diameter; f ... focal length

Fig.: Relation between image irradiance Ee and radiance Le of the surface, taking into account the size of the region in the image corresponding to the patch on the object surface [H10.4].

Application: shape from shading ⇒ shape of the surface

Orthographic Projection
Normal- or parallel projection: x = X, y = Y Simplification when object is far away


Perspective Projection
Pin-hole model (Lochmodell) of a camera:

u = f · x / z
v = f · y / z

Fig.: Pinhole model.
Fig.: Basic model of the imaging process.
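A minimal sketch of the pinhole projection (NumPy; the focal length and points are made-up values):

import numpy as np

def project_pinhole(points_xyz, f):
    # u = f*x/z, v = f*y/z for 3D points given in camera coordinates
    p = np.asarray(points_xyz, dtype=float)
    return f * p[:, :2] / p[:, 2:3]

# Two points, 1 m and 2 m in front of a camera with f = 800 (pixel units):
# the farther point projects closer to the image centre
print(project_pinhole([[0.1, 0.05, 1.0], [0.1, 0.05, 2.0]], f=800.0))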

Pose of Camera in 3D Space


Homogeneous matrix:

Ao = | R  p |
     | 0  1 |

pc = R · pi + p

Fig.: Camera and object coordinate systems, related by the pose Ao.

Pose estimation: possible from 3 (or more) points or lines of known object size
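A small sketch of building and applying the homogeneous pose above (the rotation and translation are made-up values):

import numpy as np

def make_pose(R, p):
    # Build the 4x4 homogeneous matrix Ao = [[R, p], [0, 1]]
    A = np.eye(4)
    A[:3, :3] = R
    A[:3, 3] = p
    return A

# 90 degree rotation about z, translation (1, 0, 0)
R = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
A = make_pose(R, np.array([1.0, 0.0, 0.0]))

p_i = np.array([1.0, 2.0, 3.0, 1.0])   # object point, homogeneous coords
p_c = A @ p_i                          # same point in camera coordinates
print(p_c[:3])                         # equals R @ p_i[:3] + p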

Lenses
Gaussian lens equations:  1/f = 1/b + 1/g  and  B/G = b/g

Focal length:  f = g·B / (G + B)

Fig.: The lens law [DBS].

Field of View, FOV


Small focal length ⇒ wide FOV:  α = 2 · atan( Bmax / (2f) )

Depth of field is obtained by:
Small aperture
Short focal length
Large object distance

Fig.: Field-of-view angle [DBS].
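A sketch combining the lens law and the FOV formula (the distances and sensor width are made-up values):

import math

def focal_from_lens_law(g, b):
    # 1/f = 1/g + 1/b (object distance g, image distance b)
    return 1.0 / (1.0 / g + 1.0 / b)

def fov_deg(sensor_width, f):
    # alpha = 2 * atan(B_max / (2 f))
    return math.degrees(2.0 * math.atan(sensor_width / (2.0 * f)))

f = focal_from_lens_law(g=0.5, b=0.0101)    # ~10 mm lens focused at 0.5 m
print(f)                                    # ~0.0099 m
print(fov_deg(sensor_width=0.0064, f=f))    # FOV with a 6.4 mm wide sensor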

Image Blur
Blur circle (Unschärfekreis): diameter u = d · Δb / b, with d the aperture opening and Δb the deviation from the focused image distance

Power of a lens = 1/f [dioptres]

Fig.: Diameter u of the blur circle depending on the aperture d [DBS].

Illumination
Dark field
Bright field, diffuse: dome ("bell") or ring light
Through light (backlight): for transparent objects
Cast shadow

Figs.: Lighting options [DBS]

CCD Area Sensor (Flächensensor)

Colour Cameras (Farbkameras)
3-chip camera [DCO]
1-chip camera with Bayer colour mask [DCO]

Aliasing
Artefacts at colour transitions: the colour edge and the intensity edge appear at different locations [DCO]

Image Acquisition under Motion

Interlaced camera: 2 fields, each at 25 (30) Hz

Fig.: Original image; odd field; even field; interlaced image of a moving object [DCO]

Remedy: full-frame camera
Entire image at 25/50 Hz

Images
Information in images: Intensity, Colour, multispectral images, depth/distance/disparity, ...

Fig.: Intensity image (left) and range/depth image extracted from three intensity images [PointGrey]

Resolution, Image Pyramid

Fig.: Image at different resolutions (120x120, 60x60, 30x30)

Fig.: Image pyramid: the images above, shown at equal size [G. Sandini]

Space-variant Images

Fig.: Images where resolution and field of view change so that each image contains the same number of pixels

Fig.: Superimposing the above images [G. Sandini]


Log-polar Images, Fovea

[IBIDEM retina]

Fig.: Log-polar pixel tessellation [G. Sandini]

Advantage: similar to the characteristics of the human eye
Efficient coding, e.g. for video conferencing (64 kPixel)

Fig.: Transformation of a log-polar image into a uniform, rectangular pixel tessellation [G. Sandini]

Log-polar Images: Applications


Fig.: Concentric circles in a log-polar image project to a vertical line, spokes to (nearly) horizontal lines

Fig.: When focusing on the vanishing point the radial lines (floor and ceiling to walls) project to horizontal lines [Peters, Bishay, 1996].

Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, grouping

Application: XPERO robot learning

Segmentation
Partitioning the image into its constituent parts
Constituent parts depend on the task
Detect object, object class, foreground/background
Stop when the object of interest is found

Gaussian Convolution
Better smoothing than the mean filter

G(x, y) = 1/(2πσ²) · exp( −(x² + y²) / (2σ²) )

Fig.: Original image; σ=1, 5x5; σ=2, 9x9; σ=4, 15x15
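A sketch reproducing the smoothing series above with OpenCV (the input file name is a placeholder):

import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # placeholder file

# Gaussian smoothing with the sigmas / kernel sizes from the slide
for sigma, ksize in [(1, 5), (2, 9), (4, 15)]:
    smoothed = cv2.GaussianBlur(img, (ksize, ksize), sigma)
    cv2.imwrite(f"gauss_s{sigma}.png", smoothed)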

Salt and Pepper Noise


1% of pixels are either black or white
Smoothing: outliers are spread but not eliminated

Fig.: Salt-and-pepper noise; Gauss filter σ=8, 5x5; original image

Median Filter
Salt-and-pepper noise (5%)
The median eliminates outliers (up to 50%)

Figs.: Median filter 3x3; median filter 7x7; median filter 3x3 applied 3 times
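The corresponding median-filter comparison as a sketch (OpenCV; the file name is a placeholder):

import cv2

img = cv2.imread("salt_pepper.png", cv2.IMREAD_GRAYSCALE)  # placeholder

med3 = cv2.medianBlur(img, 3)     # 3x3 median
med7 = cv2.medianBlur(img, 7)     # 7x7 median

# 3x iterated 3x3 median, as on the slide
med3x3x3 = img
for _ in range(3):
    med3x3x3 = cv2.medianBlur(med3x3x3, 3)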

Edge Detection 1D
1D signal = input
First derivative = gradient
Second derivative (Laplace operator)

Figs.: Perfect edge; perfect spread edge; edge with noise; edge with noise after Gaussian filtering, σ=1

Edge Detection 2D
Fast: Roberts cross
Gradient: Gx = ∂I/∂x, Gy = ∂I/∂y
Strength: G = √(Gx² + Gy²), or |G| ≈ |Gx| + |Gy|
Angle: θ = arctan(Gy / Gx) − 3π/4

Fig.: Original image; edge strength; binary image, threshold: 80

Edge Detection Sobel

Angle: θ = arctan(Gy / Gx)
Masks: Sobel, Prewitt

Fig.: Original image and edge strength after Sobel
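A sketch computing Sobel gradient strength and angle (OpenCV + NumPy; the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)   # Gx = dI/dx
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)   # Gy = dI/dy

strength = np.hypot(gx, gy)      # G = sqrt(Gx^2 + Gy^2)
angle = np.arctan2(gy, gx)       # edge orientation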

Edge Detection Canny


Processing steps:
1. Gaussian smoothing
2. Edge detection (e.g. Sobel)
3. Ridge following with hysteresis (t1 > t2)

Optimised, standard method; good compromise
Thin, 1-pixel edges (ridges)
Smoothing can eliminate detail

Fig.: σ = 1, t1=255, t2=1
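The same thresholds with OpenCV's Canny, as a sketch (file name is a placeholder; note that OpenCV lists the lower threshold first):

import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Hysteresis thresholds t1 > t2, as on the slide
edges_detail = cv2.Canny(img, 1, 255)     # t2=1,   t1=255
edges_strict = cv2.Canny(img, 220, 255)   # t2=220, t1=255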

Canny - Examples
Properties
Y-effect: 3 edges meeting in a point are not connected
Adaptive: detail and edge elements, but image dependent

Figs.: σ = 1, t1=255, t2=1; σ = 1, t1=255, t2=220; σ = 2, t1=255, t2=1

Laplacian of Gaussian - LoG


Laplacian:  L(x, y) = ∂²I/∂x² + ∂²I/∂y²

LoG(x, y) = −1/(πσ⁴) · ( 1 − (x² + y²)/(2σ²) ) · exp( −(x² + y²)/(2σ²) )

Fig.: Example of masks
Fig.: LoG or "Mexican hat"; example filter 9x9, σ=1.4

LoG Example
Edges are the zero crossings
Ideal case: closed curves

Fig.: LoG with σ = 1; all zero crossings

LoG Edge Detection


Derivative (differencing) amplifies noise!
Better results if scaled correctly

Figs.: Zero crossings at σ=2; zero crossings at σ=3; σ = 1, strong zero crossings only (difference to neighbours > 40)
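A sketch of LoG edge detection via zero crossings (SciPy; the synthetic square image and the strength threshold of 40 mirror the slide):

import numpy as np
from scipy import ndimage

def log_zero_crossings(img, sigma, min_strength=0.0):
    # Laplacian-of-Gaussian response
    log = ndimage.gaussian_laplace(img.astype(float), sigma)
    # a zero crossing: the 3x3 neighbourhood contains both signs
    mn = ndimage.minimum_filter(log, size=3)
    mx = ndimage.maximum_filter(log, size=3)
    # keep only "strong" crossings (difference to neighbours)
    return (mn < 0) & (mx > 0) & (mx - mn > min_strength)

img = np.zeros((100, 100))
img[30:70, 30:70] = 255.0                     # synthetic bright square
edges = log_zero_crossings(img, sigma=1.0, min_strength=40)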

Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, region detection, grouping

Application: XPERO robot learning

Histogram
Statistics over many pixels
Robust, fast

Figs.: Histograms (number of pixels over intensity) with a single threshold t1, and with a band between t1 and t2
Fig.: Original image; t1=150; t1=130, t2=150; regions (blobs)
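A sketch of global thresholding with one threshold and with a band (NumPy; the random image is a stand-in):

import numpy as np

def threshold(img, t1, t2=None):
    # binary mask: img >= t1, or t1 <= img <= t2 if t2 is given
    if t2 is None:
        return img >= t1
    return (img >= t1) & (img <= t2)

img = np.random.randint(0, 256, (120, 120))   # stand-in for a real image
mask_a = threshold(img, 150)        # single threshold, as on the slide
mask_b = threshold(img, 130, 150)   # intensity band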

Thresholds - Examples

Fig.: Original image

Histogram

Binary image for t1=120

Fig.: Original image

Histogram

Binary image for t1=80 and t1=120



Finding Local Thresholds


Statistics in a neighbourhood (e.g. 7x7) of a pixel:
t = μ (mean value), or t = med (median)
t = (μ + med) / 2, or t = μ − C (C ... constant)

Fig.: Original image; window 7x7, C=4; window 140x140, C=8
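A sketch of the local threshold t = μ − C with OpenCV's adaptive mean thresholding (file name is a placeholder):

import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Each pixel is compared with the mean of its 7x7 window minus C=4
binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 7, 4)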

Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, region detection, grouping

Application: XPERO robot learning

Object Recognition - Approaches


Model based
CAD model of objects
Geometric features
Finding features and their relationships

Appearance based
Interest points or the entire object

Gestalt principles
Regard structure in data and features
Perceptual grouping


Relation to the State of the Art


Object Recognition = re-cognising a known object
Point features and signatures (e.g., SIFT [Lowe04])

Object categorisation = recognising objects belonging to a category (bottle, animal)


Codebooks of features (blob size, colour, circle, e.g. [Pinz07, Skocaj07, Poggio07])

Object detection = find entities relevant to the task


Attention, grouping, figure-ground segmentation: task dependent methods (e.g., [Itti06] [Zillich07])

Perceived affordances = features related to actions


Learned feature clusters, 2D laser data (e.g., [Kuipers06, 07])

Perceptual Grouping
Idea: exploit all the known structure in the data
Learn what objects are (what they mean for the overall system), rather than specific objects
Uses: Gestalt principles, built-in knowledge, levels of abstraction

Perceptual Grouping State of the Art


Recognition by components: a theory of human image understanding [Biederman 1987; Dickinson, Bergevin, Biederman 1997]
Function depends on part relationships; 36 geons

3D object recognition from single 2D images [Lowe 1987]
Integration of regions and contours for object recognition [Schlüter 2000]

A Computational Structure for Preattentive Perceptual Organization [Sarkar 1994]

Principles of Perceptual Grouping


Perceptual organisation to form groupings and structures in the image:
Proximity
Parallelism
Collinearity

Probabilistic ranking, 13 principles [Wilson]

[Lowe 87]

Model-based Recognition

Fig.: Model-based recognition: image → edges → groupings, matched to the model → reprojections [Lowe 87]

Grouping, Objects and Semantics


Fig.: Classical object recognition takes the shortcut pixels → features → object (vision + model); a cognitive system goes pixels → features → Gestalts → proto-objects → object (vision + physical knowledge)

Detection from Grouping


Abstraction from features to Gestalts
Hierarchical grouping
Learning is possible from all features
Problem: scalability of the feature search
Approach: incremental indexing [Zillich 2007]

Non-Accidentalness
Helmholtz principle: non-accidental groupings of features are perceived as Gestalts
The less likely randomness is, the more likely an underlying cause
Low likelihood of a grouping = high significance

Poisson Process
Possible outcomes of a trial are 0 and 1
Trials are independent
Expected number of occurrences in a given interval: λ
Probability of exactly k occurrences (probability mass function, discrete):

P(k) = λ^k · e^(−λ) / k!

Poisson Process in 2D
Randomly distributed points


Poisson Process in 2D
Adding some points


Poisson Process in 2D
How non-accidental are k points in the circle?
Interval: two-dimensional, i.e. an area
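A sketch of the significance computation: with point density ρ, the expected count in a circle of area A is λ = ρ·A, and the Poisson tail probability P(X ≥ k) measures how accidental k points are (SciPy; the counts are made-up values):

from scipy.stats import poisson

def significance(k, density, area):
    # P(at least k points in the region) under a 2D Poisson process;
    # a small probability = non-accidental = high significance
    lam = density * area
    return poisson.sf(k - 1, lam)   # P(X >= k)

# 500 random points on a 640x480 image; 8 of them inside a circle r=20
rho = 500 / (640 * 480)
print(significance(8, rho, 3.14159 * 20**2))   # ~0.001: unlikely by chance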

Ellipse Detection
Joint work of Michael Zillich and Jiri Matas, CMP, Prague

Exploit local information where possible
Take any edge-segmented image, e.g. Canny
Minimal number of parameters

Approach
1. Connected edgels
2. Local shape: fit circular arcs
3. Geometric constraints: convexity
4. Fit ellipses

Approach
1. Connected edgels
2. Local shape: fit circular arcs
3. Geometric constraints: convexity
4. Fit ellipses

Canny Edges


Canny Edges


Approach
1. Connected edgels
2. Local shape: fit circular arcs
3. Geometric constraints: convexity
4. Fit ellipses

Arcs
Comparing three methods:
SPLIT
GROW
RANSAC

SPLIT (Rosin, West 1995)


SPLIT

Problem: When to stop splitting?



GROW

1. Locally fit a circle
2. Grow while the radius stays similar

GROW

Problem: Sensitive to end points



RANSAC

1. Grow with random seeds
2. Take the MDL optimum

Approach
1. Connected edgels
2. Local shape: fit circular arcs
3. Geometric constraints: convexity
4. Fit ellipses

Convex Arcs

convex

non-convex


Convexity
Inner side of arc

Beyond endpoint of arc



Image Space Voting


Vote = Intersection of arc tangents/normals


Voted Arcs
Depends strongly on length of tangent/normal

Limited length

Unlimited length


Run Time Complexity

Testing all pairs: O(n²)

Image space voting: O(n)

Growing Groups
Extend a group of arcs to form an ellipse hypothesis
Exhaustive search
Greedy search with heuristics:
Co-curvilinearity
Ellipse centres
Fit quality

Exhaustive vs. Greedy

17 groups

1 group


Greedy Search Heuristics


Co-curvilinearity
Yuen et al., 1989: 3 points on the ellipse (P, Q, R) + tangents → centre (C)
Ellipse fit, relative support

True Positive Rate (TPR)


Runtime


Approach
1. Connected edgels
2. Local shape: fit circular arcs
3. Geometric constraints: convexity
4. Fit ellipses: B2AC [Fitzgibbon, Pilu, Fisher 1999], OpenCV
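A sketch of the final fitting step; cv2.fitEllipseDirect (available in newer OpenCV versions) implements the direct least-squares fit of Fitzgibbon et al., and the synthetic points below are made-up:

import cv2
import numpy as np

# Synthetic noisy edgels on an ellipse (stand-in for a grouped arc set)
t = np.linspace(0, 2 * np.pi, 100)
pts = np.stack([200 + 80 * np.cos(t), 150 + 40 * np.sin(t)], axis=1)
pts = (pts + np.random.normal(0, 1.0, pts.shape)).astype(np.float32)

(cx, cy), (w, h), angle = cv2.fitEllipseDirect(pts)
print(cx, cy, w, h, angle)   # centre, axis lengths, orientation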

All Ellipses


Good Ellipses



TPR vs. Runtime

Office scene at different resolutions: 320x240 to 1280x960

More Structure
Lines, arcs, ellipses, parallelisms, continuations, junctions, ... → higher-level shapes, objects, symbols?
Vocabulary (Prolog) → perceptual grouping + symbolic reasoning → "sentences"

Hierarchical Perceptual Grouping


1. Connected edgels
2. Local shape: fit lines or circular arcs
3. Hierarchical grouping:
L- and T-junctions
Collinear lines, parallel lines
Ellipses
Rectangles, flaps; cylinders

Detection from Grouping


Abstraction from features to Gestalts
Hierarchical grouping
Learning is possible from all features
Problem: scalability of the feature search
Approach: incremental indexing [Zillich 2007]

Rectangle
line(A,B) :- left(A,X), right(B,X).
% a rectangle: a closed chain of four lines
rect(A,B,C,D) :- line(A,B), line(B,C), line(C,D), line(D,A).
% or two U-shapes sharing their end points
rect(A,B,C,D) :- u(A,B,C), u(C,D,A).

Flap


Book Scene

Fig.: Original image; 3847 edges; first 2 rectangles; all rectangles

Office Scene


Shelves scene


Cylinder
% a cylinder: an ellipse E with parallel left and right tangents A, B
cyl(E,A,B) :- tangent_l(E,A), tangent_r(E,B), parallel(A,B).

Cup Scene

Fig.: Original image; edges; first cylinder

Kitchen Scene

Fig.: Original image; first 3 cylinders

Illusion Kanizsa Square


Problem

Fig.: Fast detection (left); difficult, slower detection (right)

Grouping and Incremental Indexing


Incremental growing of search lines within a given processing time
Image as index space → reduction of the search space to the image
Incremental detection of Gestalts (e.g., closures, proto-objects)

Figs.: Search lines after 100 ms; search lines after 500 ms

Examples for Closures and Cubes


Results of incremental indexing
Specify only the processing time, no thresholds
Closures detected after 160, 240 and 420 ms
Cubes detected from Eddy

VS2 Example
Vision System 2 tool for grouping


Finding Closures
Closed polygon, any size
Smaller ones are better: better ratio of edgel support to perimeter

Ontology: Adjacency and Rule Base

Adjacency map from segments/closures → polygon relationships, e.g.:
2 (blue) is_StablySupported_by 1 (violet) [barycentre within x-limits]
3 (green) is_InstablySupported_by 2 (blue) [barycentre too far to the right]
4 (brown) is_right_of 1 (violet), ...

Relationships are input to a reasoning engine that finds concepts according to rules:
2 rules for stack, one for arch, one for transitivity
Processing time typically less than a second (Java)

Real-World Example
Segmentation

Some arches found:
Best from a functional point of view
Correct, yet irrelevant from a functional point of view

Scenario - Exit
Test on a door in the office: 378 segments, 86 closures, 35 stacks
Example of two columns: original image, window arch, closures, best arch

Exit in XPERO Setup


Ontology operates on segments and closures; results are comparable

Exit → arch. Closing the loop:
Arch as the relevant entity
Learn object concepts from interaction

Occlusion Handling and Depth Ordering


Perceptual grouping: clustering based on Gestalt principles

Improved tracking: interest-point-based Lucas-Kanade algorithm, Delaunay representation, local affine assumption

Spatial reasoning: occlusion detection, occlusion reasoning

Occlusion Handling
Only closures
T-junctions indicate occlusion
Occlusion indicates the depth ordering of closures
More occlusion → more closures ordered
Future work: exploitation of motion, estimation of plane normals

Conclusion
Same method to detect different shapes
Exploit local information (smoothness, ...)
Avoid local decisions and early pruning of hypotheses (thresholds)
Only one threshold: "good" ellipses; use ranking
Perceptual grouping = well-defined pruning of hypotheses → reduced search space
Image space indexing: O(n²) → O(n)

Further Work
Difficult to find a good local quality measure
Global measure: combinatorial search problem
Interaction with the scene, mobile observer: false positives vs. stable percepts
Used to initialise tracking

Detection and Tracking

Edges

Detection and Tracking

Arcs

Detection and Tracking

Detection

Detection and Tracking

Tracking

Detection and Tracking


Contents
Motivation: many projects
Terminology: from computer to machine vision
Components: from light to images
Machine Vision
Segmentation: edge detection, region detection, grouping

Application: XPERO robot learning

Fig.: An architecture for learning by experimentation: observe; match observations and predictions; predict; hypothesis formation by analogy ("educated guess": time-series analysis, reasoning by analogy); knowledge base with a priori/existing and innate knowledge; design, planning and execution of experiments; observation and evaluation of experiments; feedback loops for intelligent data collection (report location of robot and objects) and on the actual execution of an experiment. (Partner labels: AUP, BRS, TUW, ULJ, UVR.)

Showcase
Robot needs to know: Where am I? Where are the relevant entities (objects)?

Robot pose
Egomotion relative to the environment
Odometry, vision

Object pose
Basic features to describe objects
Blobs: centre, size
Shape: cube, ...

Goal: observe objects and agents

Robot observes experiments
Start: overhead camera → goal: robot view (3rd → 1st person perspective)
Blobs → shapes

Object Pose: Proto-object Detection


Objective: detect entities relevant to the experiment
Goal: abstract features to groups or objects
Attributes such as size, location, shape

Detection from grouping (VS3)
Incremental indexing
Closures; cube, cylinder, cone

Occlusion handling
Tracking and depth ordering

Ontology to label objects

Tracking and Grouping


Hypothesis generation from tracked Gestalts


Tracking and Grouping


Combining information over time
Tracking through degenerate views

Detecting Proto-Objects Cube, Cylinder and Cone


Rule-based grouping of Gestalts to obtain hypotheses
Masking of proto-objects using geometric features from the hypothesis

Fig.: Detecting proto-objects from an edge image: (a) edge image, (b) closures, (c) proto-objects

XPERO - Ontology to Label Objects


Tackling the last gap between features (proto-objects) and semantically meaningful objects (e.g., meaningful in the sense of function)
Affords thoughts on the defining substrate of object concepts: the spatial configuration of parts
Needs abstraction techniques: arbitrary parts can take a role in a concept (e.g., being the support of another part)

Conclusion Machine Vision


System approach to solve a task
(Passive) scene understanding is too hard; solving a particular task in a scenario is much easier
Understanding features is under way, e.g., structure, grouping, interest points, patches
Segmentation in itself is impossible: it depends on the task
Cameras are cheap, but processing is difficult

Machine Vision Lessons Learned


Design the vision system for the task(s)
50% of the solution is good illumination
Move out into the real world: databases and videos are nice, but reality is richer
Avoid thresholds: robustness is parameter-free
A lot can be done with a camera and a PC

EU Project robots@home
Milestones:
2007: Start
2008: Learn one room, touch screen
2009: Learn the entire apartment
2010: Navigate in 4 apartments

Classify objects according to function (table, sofa, ...)
Using stereo and time-of-flight cameras

Recognising Furniture
Using vanishing points and 3D surfaces
Grouping using Gestalt principles

Some Detected Furniture


Thesis Topics
Ontology for furniture
Any-door detector, tried in many homes
Shape and shade (Make3D)
Poke an object and predict what will happen

Thank you

