
Azhar University

Faculty of Engineering


DISTRIBUTED GEOGRAPHICAL INFORMATION
SYSTEMS ON THE WEB

A THESIS SUBMITTED TO THE DEPARTMENT OF COMPUTERS AND SYSTEMS ENGINEERING IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
(COMPUTER SCIENCE)
JUNE 2007

By
Eng. Osama Mohammed Moustafa Hosam Eldeen


Supervisors








Assoc. Prof. Dr. Reda Aboalez
Computers & Systems Engineering Department,
Faculty of Engineering, Azhar University

Prof. Dr. Mohamed Farid Zaglool
Computers & Systems Engineering Department,
Faculty of Engineering, Azhar University

Assoc. Prof. Dr. Hamdy Kelash
Computer Science & Engineering Department,
Faculty of Engineering, Monofia University







DISTRIBUTED GEOGRAPHICAL INFORMATION
SYSTEMS ON THE WEB
















Azhar University, Nasr City, Cairo, Egypt. June 2007.
Acknowledgments

All praise belongs to Allah for everything, and may peace and blessings be upon the Prophet
Mohamed and all his family and companions.
During my postgraduate journey I have been humbly blessed with the support of many
people; without their encouragement and advice, this dissertation could not have become a
reality. I am thankful to Allah for placing them in my life and would like to take this
opportunity to express my gratitude to them.

I would like to express my great gratitude to my advisors, Prof. Dr. Farid Zaglol,
Assoc. Prof. Dr. Reda Abo-Alez, and Assoc. Prof. Dr. Hamdy Kelash, for their valuable
advice and supervision. I also extend my thanks to all professors and doctors of the
Computers and Systems Engineering Department, Faculty of Engineering, Azhar University,
for their valuable help and guidance.
I am most grateful to Dr. Walaa Shita, Dean of the Informatics Research Institute (IRI),
Mubarak City for Scientific Research and Technology Applications, for his guidance and help.
Thanks must also be extended to Prof. Dr. Hytham Elmessiry, Informatics Research
Institute (IRI), Mubarak City for Scientific Research and Technology Applications, for
introducing me to the field of computer vision and image processing.

I would like to express my thanks to my colleagues and friends at work; special thanks
to Mousta A. Alkhalik, Mohamed Talat, and Emad Abbas. I also thank Shymaa Youssef,
Shymaa Elleithy, and Olfat Ibrahim for their help in our graphics lab.

Most importantly, I would like to express my deepest gratitude to my parents, my
brother Moustafa and his wife Jihan and their kids Mohammed and Hemmat, and my
relatives Mrs. Marwa and her uncle Hanafy, for their unending support and sacrificial love
throughout my life.
Abstract
Distributed GIS on the web has become so important that almost all fields have adopted
the capabilities of GIS to enable people to connect, to distribute geographical data, and to
convey knowledge. Traditional distributed GIS systems use geographical data in 2D, using
images from satellites or aircraft. Such systems have limited use in distributed GIS because
the need for 3D geographical data arises in many fields, for example astronomy. If it is
required to take images of the surface of the moon, it is very important to have these images
with their third dimension, to distinguish, for example, between the hills and valleys found
on the surface of the moon. To add the ability of extracting the third dimension to the
satellite we need a special system; this system can be used to convert image data from 2D
to 3D. Geographical data also needs to be distributed in 3D in virtual reality applications
and in remote robot navigation and remote surgery applications.
Any distributed GIS system contains two important parts: the GIS server and the GIS
terminal. The geographical and spatial data are saved on the GIS server; on the GIS server
we also find the system which manages this data. It is required to make the geographical
data available on the GIS terminal in 2D and 3D formats.
To make the geographical data available in 3D, the existing 2D images are used to obtain
the 3D model by a process called reconstruction; reconstruction is the process of recovering
the third dimension from 2D images and creating a 3D model from them.
Approaches for recovering the third dimension of an image are shape from shading, shape
from texture, shape from motion, and shape from stereo. The most advanced method
used in GIS applications is shape from stereo.
The objectives of this thesis are to make the geographical data of distributed GIS available
in 2D or 3D using the enhanced shape from stereo technique, with a database store of the
images (kept in pairs to be used in the enhanced shape from stereo technique).
We are going to distribute geographical data with a minimum amount of processing time
(the time of creating 3D maps from 2D images). The creation of 3D maps using the enhanced
shape from stereo depends entirely on a process called correspondence, or matching.
We are going to use a new approach to solve the matching or correspondence
problem, called the adaptive window approach. This approach reduces the
search time by searching for the matched objects in a specified window whose size is
calculated from the image data. The word adaptive means that the size of the window
changes according to the image data. This is a great step in reducing the waiting time
(the time the user needs to wait until his request is processed). The waiting time is the sum
of the time needed for the data to be transferred over the network plus the processing time.
The errors of creating 3D maps from 2D images have been reduced, i.e. we create 3D maps
with high accuracy. When the enhanced shape from stereo is used, the correspondence
process may include errors such as the mismatching error and the one-to-many assignment
error. One-to-many assignment occurs when we assign a single feature in one image to
multiple features in the other image; this happens when the feature has many similar
candidate features to match with, which confuses the system. We are going to eliminate
such errors by using the adaptive window approach.
Contents
Acknowledgments ..... I
Abstract..... II
Contents ....... IV
List of Figures .. VII
List of Tables . XI
List of Abbreviations... XII

CHAPTER 1: INTRODUCTION
1.1. Distributed Geographical Information Systems on the Web ...... 1
1.2. The Problem Statement ....4
1.3. Related Works ........ 7
1.4. Objectives Of The Thesis ..... 10
1.5. The Thesis Outline.11

CHAPTER 2: DISTRIBUTED GEOGRAPHICAL INFORMATION SYSTEMS
ON THE WEB
2.1 - Geographical Information Systems .... 14
2.1.1 GIS data Models................................... 16
2.1.1.1 Raster representation of data...17
2.1.1.2 Vector representation of data..19
2.2. Network Geospatial Information Systems.23
2.2.1 Real-Time GIS on Computer Networks. .23
2.2.2. Bridging the Gap between GIS and the WWW...24
2.3. Virtual Reality and GIS .25
2.3.1 Distributed Computing and Interoperability .. 26
2.3.2 Distributed Virtual Reality27
2.4. Distributed GIS for Decision Support ......... 28
2.4.1 Land Management and Crop Yield Forecasting.. 28
2.4.2 Defense Organizations .... 28
2.4.3 Emergency Services..... 29
2.5. Summary .. 30

CHAPTER 3: IMAGE PROCESSING
3.1. Extracting image features by Image Segmentation ..31
3.1.1 Image Segmentation by Thresholding .31
3.1.2 Histogram Segmentation ........ 32
3.1.3 Edge/Line Detection ... 34
3.1.4 Identifying Objects of the image by labeling. ..35
3.2. Depth perception of 2D images .... 36
3.2.1. The Four Physiological Cues .... 36
3.2.2. The Six Psychological Cues ... 39
3.3. Shape from Stereo images.. 42
3.3.1 Intensity-based or Area-based stereo matching.... 43
3.3.2 Feature-based stereo matching.. 44
3.3.2.1 Types of features .....44
3.4. Shape from Shading ...46
3.5. Shape from Texture 47
3.6. Summary ... 49

CHAPTER 4: COMPUTER GRAPHICS
4.1. Computer Graphics Representation ........ 50
4.1.1 Anti-aliasing strategies 50
4.1.1.1 Pre-filtering. .. 51
4.1.1.2 Super-sampling. 52
4.1.2 Vector vs. Raster graphics....52
4.1.2.1 Raster Graphics ..53
4.1.2.2 Vector Graphics..53
4.2. The Geometric Model ..... 54
4.3. Rendering ..... 56
4.3.1. Projection and Rasterization ..... 56
4.3.2. Visibility .... 56
4.3.3. Shading and Materials ...... 58
4.3.4 Shadows and lighting simulation. 59
4.4. Summary ... 60

CHAPTER 5: SHAPE FROM STEREO SYSTEM IN DISTRIBUTED GIS
5.1. The Enhanced Shape from Stereo System ... 61
5.1.1 Stage 1 Feature-based Matching ..... 68
5.1.2. Stage 2: The Search Range ....... 78
5.1.3 Stage 3: The Adaptive Window size ... 81
5.1.4. Stage 4: Area-Based Matching... 85
5.1.5. Stage 5: The 3D Model .. 86
5.2. Experimental results . 87
5.2.1 Image Acquisition Techniques. 87
5.2.1.1 Stereo Images using The Object Registration device 87
5.2.1.2 Stereo Images Using an Aircraft: . 88
5.2.2 Results of the Shape from Stereo System 88
5.2.2.1 Feature-based Correspondence 90
5.2.2.2 Area-based Correspondence 94
5.2.3 Results of the enhanced shape from stereo system 97
5.2.3.1 Feature-based Correspondence 97
5.2.3.2 Area-based Correspondence 99
5.3. Summary 104

CHAPTER 6: CONCLUSIONS AND FUTURE WORK
6.1. Conclusions..... 105
6.1.1 Distributing Geographical Data in 2D and 3D . 105
6.1.2 Enhanced Shape from stereo system 105
6.2. Future Work.....106

References .... 108

List of figures

1.1 Distributed GIS......1
1.2 Web based GIS ......2
1.3 GIS on Computer Networks ..3
1.4 Stereo Images ....4
1.5 The motion parallax...............................................................................................5
1.6 A simple stereo image system ...6
1.7 Shape from 2D image methods..............................................................8
1.8 The thesis outline.12

2.1 GIS data layers or themes....16
2.2 Layers representation ..........17
2.3 Exhaustive representation ...18
2.4 Run-length encoding ...19
2.5 Vector Representation......20
2.6 List of coordinates ...21
2.7 Dual independent map encoding .....22
2.8 VR, Internet, GIS and their integration ... 27

3.1 Image segmentation by thresholding .....32
3.2 Image Histogram ....33
3.3 Image Histogram and Histogram segmentation .............................................................33
3.4 Image segmented using the histogram segmentation.....33
3.5 Sobel edge detector ....34
3.6 The neighborhood of the pixels 35
3.7 Image labeling ...36
3.8 Accommodation . ..37
3.9 Convergence ..... 37
3.10 Binocular disparity ..38
3.11 Retinal image size 39
3.12 Linear Perspective ....39
3.13 Aerial Perspective.40
3.14 Overlapping .40
3.15 Shade and shadows ..41
3.16 Texture gradient .. 41
3.17 Shape from stereo 42
3.18 Shape from shading 47
3.19 Inferring the third dimension using textures ... 48
3.20 Shape from texture .. 48

4.1 The anti-aliasing 50
4.2 The anti-aliasing using the pre-filtering strategy ...51
4.3 Raster graphics ...53
4.4 Vector graphics...54
4.5 The geometric model..55
4.6 The projection and rasterization .56
4.7 The visibility ..57
4.8 The ray tracing ..57
4.9 The light reflection model .58
4.10 The material models.....59
4.11 Shadow computation using the ray tracing .....59

5.1 The structure of Distributed GIS system . 62
5.2 The Data Flow Diagram of the enhanced shape from stereo system......63
5.3 The disparity ...64
5.4 The search range ....65
5.5 Stage1 of the enhanced shape from stereo algorithm .66
5.6 Stages 2-5 of the enhanced shape from stereo algorithm ... 67
5.7 The pair of stereo images that will be used to explain our algorithm ...68
5.8 The histogram that used for segmenting the stereo image pair ...68
5.9 Dividing each image into 3 groups of regions ....69
5.10 Creating the multiresolution images for the left image .70
5.11 Creating the multiresolution images for the right image ..70
5.12 Applying morphology filter on the left image ..71
5.13 Applying morphology filter on the right image 71
5.14 Restoring the left image to its original size ..72
5.15 Restoring the right image to its original size ... 72
5.16 The optical flow ... 73
5.17 Feature-based matching ... 74
5.18 Using window in feature-based matching .... 75
5.19 The disparity . 76
5.20 Disparity map obtained from feature-based matching . 77
5.21 3D Model of the feature-based disparity map .. 78
5.22 A sample of the column index matrix .. 78
5.23 Obtaining the interval of search range . 79
5.24 A sample of the interval of search range values .. 80
5.25 Results from applying 3-levels wavelet transform ... 81
5.26 Applying the edge detector on each level of the wavelet transform results . 82
5.27 Combining the edges for each level of the wavelet transform results ..... 83
5.28 A sample of the calculated window size .. 84
5.29 The final disparity map 86
5.30 The 3D model obtained from the final disparity .. 86
5.31 The Object Registration device. 87
5.32 Aircraft Stereo images . 88
5.33 Stereo Image pairs .... 89
5.34 The problem of occlusion in stereo images... 91
5.35 The relation between the occluded regions and the accuracy .. 93
5.36 The accuracy of the feature based matching procedure ... 93
5.37 3D models after applying the feature-based stereo matching .. 94
5.38 The accuracy of the feature-based matching followed by area-based matching... 95
5.39 3D models after applying the area-based stereo matching . 96
5.40 The Enhancements in the accuracy in feature based matching .... 98
5.41 3D models after feature-based matching with the adaptive window ... 98
5.42 The relation between the search range size and the matching accuracy .... 100
5.43 The relation between the window size and the matching accuracy .. 100
5.44 The automatic calculation of search range and window size . 102
5.45 The Enhancements in the accuracy in area based matching .. 103
5.46 The final enhanced 3D models... 104





List of Tables

Table 2.1: Advantages and disadvantages of raster and vector data models .. 22
Table 5.1 : The stereo images and their properties .. 90
Table 5.2 The multiresolution images ..... 90
Table 5.3: Results of feature based matching .. 92
Table 5.4: Results of area-based matching with fixed window sizes .. 95
Table 5.5 : Results of feature based matching with adaptive window .... 97
Table 5.6: Results of applying multiple size search range and fixed window size ..99
Table 5.7: Results of applying multiple window sizes with fixed search range.. 99
Table 5.8: The accuracy of the area-based matching with adaptive window ... 103


List of Abbreviations

BADGER : Bay Area Digital Geo Resource
CAD : Computer Aided Design
DBMS : Database Management System
DGIS : Distributed Geographical Information System
DIME: Dual Independent Map Encoding
3D : 3 Dimensions
2D : 2 Dimensions
FGDC : Federal Geographic Data Committee
FTP : File Transfer Protocol
GIS : Geographical Information System
GMS : Geostationary Meteorological Satellite
GPS : Global Positioning System
HTML : Hyper Text Markup Language
HTTP : Hyper Text Transfer Protocol
LAN : Local Area Network
NOAA : National Oceanic and Atmospheric Administration
RS : Remote Sensing
SR : Search Range
TNRIS : Texas Natural Resources Information System
VGIS : Virtual Geographical Information System
VR : Virtual Reality
VRGIS : Virtual Reality Geographical Information Systems
WAIS: Wide Area Information Servers
WWW : World Wide Web

CHAPTER 1
INTRODUCTION
1.1. Distributed Geographical Information Systems on the Web
Network Geospatial Information Systems are a specific type of Geographical Information
System (GIS); they are also called Distributed GIS or Internet GIS. Distributed GIS on the
web has become so important that almost all fields have adopted its capabilities to collect
data, to publish information, and to convey and communicate knowledge [1]. Distributed GIS
has flourished as web-based GIS, independent GIS, wireless GIS, and mobile GIS.
The fast growth and wide use of Distributed GIS have brought a revolution to GIS and to
the way people communicate. Its impact is so deep that almost any information project
will include a GIS component, which relies on Distributed GIS.
Figure 1.1: Distributed geographical information systems. Distributed GIS divides into
distributed geographical information (distributed 2D maps, e.g. images from satellites, and
distributed 3D maps) and distributed information systems.
As shown in Figure 1.1, Distributed GIS can be divided into distributed geographical
information and distributed information systems. The geographical information is concerned
with the land image data or maps, also called the spatial information. The information
systems part deals with the databases, the DBMS, spatial databases, and how to store and
retrieve data. It is also concerned with how to query data.
The geographical maps may be 2D or 3D. The 2D maps can be created using CAD software,
scanned using a digital scanner, or used as image photos. 3D maps can be created using
computer modeling software or reconstructed from 2D images, and the latter is what we are
concerned with. Our aim in this thesis is to make the geographical data available as 2D
image maps or 3D maps using the enhanced shape from stereo technique. Shape from stereo
will be explained in detail in section 3.3.
One of the most important applications of distributed GIS is the Internet GIS, shown in
Figure 1.2. Because of the limited bandwidth of the Internet and the data traffic on the web,
we need an efficient system with high processing speed and high accuracy to avoid long
waiting times for Internet users. The user of Distributed GIS needs to be kept up to date
with the most recent updates of the geographical data of the GIS system.
Figure 1.2: Web based GIS or Internet GIS
Transferring a 3D map in Distributed GIS takes a long time [2, 3]. The time needed to
process the required 3D map (converting 2D images into 3D maps) is also very important,
since it increases the delay time on the network. Geographical data can be transferred
as image maps from the image database directly to the user, as shown in Figure 1.3, or it can
be displayed as 3D maps. Our objective is to make the geographical data available to the
user in 2D or 3D.
Figure 1.3 shows a diagram of a simple network GIS system. The system contains a GIS
server, which hosts the system for converting 2D images into 3D, and the image database,
which contains the stereo image pairs [1]. The system for converting 2D images into 3D
maps using the shape from stereo technique must have a stereo pair of the map that is
requested in 3D [2].
To display geographical data in 3D, there are two possible solutions:
1 - The 2D images can be processed on the GIS server and the result (a 3D map) transferred
to the GIS user; the result is then cached for further requests for the same map [1].
2 - The GIS user can retrieve the geographical data from the GIS server in 2D; the system
for converting maps from 2D to 3D is installed on the user workstation, and the processing
is done there [3].
Figure 1.3: GIS on computer networks
In both cases it is critical for the GIS user to see the map in 3D with minimum waiting time.
The waiting time is the time the user needs to wait until his request for a 3D map is
processed and the result is transferred back. Our contribution is to minimize the time of the
conversion process by using new techniques for extracting 3D data from 2D images. We
will use a method called the enhanced shape from stereo. There are many methods for this
conversion (shape from shading [4], shape from texture [5], shape from motion [6], etc.),
but they are each suited to specific-purpose applications. The shape from stereo method can
be used in all applications with high accuracy and high processing speed.
1.2. The Problem Statement
There is great development in the field of distributed GIS, especially in applications which
distribute image data from satellites in 3D [2, 3]. For example, if it is required to take images
of the surface of the moon, it is very important to have these images with their third
dimension, to distinguish, for example, between the hills and valleys found on
the surface of the moon. To add the ability of extracting the third dimension to the satellite
we need a special system; this system can be used to convert image data from 2D to 3D.
Approaches for detecting the third dimension of an image are shape from shading [4], shape
from texture [5], shape from motion [6], and shape from stereo [7, 8, 9, 10]. The most
advanced method used in GIS applications is shape from stereo.
Our work will be focused on getting the 3D model from 2D stereo image pairs (shape
from stereo) [7, 11, 12]. Stereo images are a pair of images taken of the same view from
different positions; this resembles the human visual system, which sees in stereo, as shown
in Figure 1.4.






Figure 1.4: The distance between repeated patterns is interpreted as depth

Stereo images resemble human vision, since the pair of eyes takes a pair of images of the
same view. Using a stereo image pair we can recover the depth.
The idea behind getting a 3D model from stereo images is the concept of motion parallax.
Assume you are looking out of your car through the window: you will notice that objects
near the car appear to move faster than objects far away. This shift is called motion parallax.
If we apply the same concept to an aircraft that takes a pair of stereo images, we find that
the higher objects shift more than the lower objects. In this case the shift is called the
x-parallax of the satellite image.
Figure 1.5 shows the concept of x-parallax (shift). Notice in part (A) of the figure that the
object is low; comparing its projection on both camera planes, the projection of its top on
the image plane is shifted, but the shift is not obvious. In part (B) of the figure, the
projection of the top of the object on the image planes is shifted more than the shift in
part (A); this is because the image on the right plane has the projected segment in the
middle of the image plane, but the image on the left plane has the segment shifted to the
right margin of the image plane [11].
We conclude that higher objects shift in the image plane more than lower objects.








Figure 1.5: The higher objects make a larger shift than the lower objects in the image plane
From the previous discussion we see that stereo images are a pair of images of the same
view. The objects are not found at the same position in the two images; instead, we find the
objects shifted. This shift is called the disparity, and to find the disparity we have to perform
a correspondence, or matching, process [13, 14, 15].
Assume we have the simple stereo system shown in Figure 1.6, where the point P is the
physical point, pl and pr are its projections on the left and right image planes respectively,
Ol is the left camera and Or is the right camera, f is the focal length of each camera, T is
the baseline of the camera pair, and Z is the depth of the physical point P. In this system we
have to solve two problems:
For a point pl in the left image plane, determine which point in the right plane it
corresponds to. The term correspond means that they are images of the same
physical point P. This is commonly known as the correspondence problem;
the speed of the proposed matching or correspondence algorithm will enhance time,
and reducing the errors of matching will enhance accuracy [16, 17, 18, 19].
Given two corresponding points pl and pr, compute the 3D coordinates of P relative
to some global reference frame. This is known as the reconstruction problem [20, 19].
Figure 1.6: A simple stereo image system
In Figure 1.6 the shift of the point pl is xl, and the shift of pr is xr. The disparity d is defined
as the difference between xl and xr [12]. From the similar triangles (pl, P, pr) and
(Ol, P, Or) we have

    (T + xl - xr) / (Z - f) = T / Z                              (1.1)

Solving 1.1 for Z we have

    Z = f T / d,    where d = xr - xl                            (1.2)

From equation 1.2 we find that the depth is inversely proportional to the disparity; we
conclude that distant objects seem to move (shift) more slowly than close ones.
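As a concrete illustration of equation 1.2, the following minimal sketch converts disparities into depths; the focal length and baseline values are illustrative assumptions, not parameters taken from this thesis.

```python
import numpy as np

def depth_from_disparity(disparity, f=700.0, T=0.12):
    """Recover depth Z from disparity d via Z = f * T / d (equation 1.2).

    disparity : disparities d = xr - xl, in pixels
    f         : focal length in pixels (illustrative value)
    T         : stereo baseline in metres (illustrative value)
    """
    d = np.asarray(disparity, dtype=float)
    Z = np.full(d.shape, np.inf)        # zero disparity -> point at infinity
    mask = d > 0
    Z[mask] = f * T / d[mask]           # depth is inversely proportional to disparity
    return Z

# A larger disparity (a closer object) yields a smaller depth:
print(depth_from_disparity([10.0, 40.0]))   # -> [8.4  2.1] metres
```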

1.3. Related Works
Making geographical data available on computer networks in 2D and 3D is a new area of
research. Methods for calculating the third dimension of an image depend on the available
image data. In the case of a single image we can use shape from shading [4] or shape from
texture [5]. In the case of a pair of images or more we can use shape from stereo [8] or
shape from motion [21, 11].

The methods for deducing the third dimension from 2D images in computer vision
applications are summarized in Figure 1.7.
a - Shape from Shading:
Shape from shading [4] uses the pattern of shading in a single image to infer the shape of
the surface in view. A typical example of shape from shading is astronomy, where the
technique is used to reconstruct the surface of a planet from photographs acquired by a
spacecraft.
The reason that shape can be reconstructed from shading is the link between image intensity
and surface slope. The radiance at an image point can be calculated from the surface normal,
the direction of the illumination (pointing towards the light source), and the albedo of the
surface, which is characteristic of the surface's material. After calculating the radiance for
each point we get the reflectance map of the image.
The parameters of the reflectance map might be unknown; in this case we have to estimate
the albedo and illuminant direction, which can be computed with the help of the averages
of the image brightness and its derivatives.

Figure 1.7: Shape from 2D image methods
From the reflectance map, and by assuming local surface smoothness, we can estimate local
surface normals, which can be integrated to give the local surface shape. Shape from shading
will be explained in detail in section 3.4.
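The link between intensity and surface orientation can be illustrated with the Lambertian reflectance map. The sketch below computes the forward model only (predicting brightness from normals, light direction, and albedo); shape from shading inverts this relation. All values are illustrative.

```python
import numpy as np

def lambertian_intensity(normals, light_dir, albedo=1.0):
    """Brightness predicted by the Lambertian reflectance map,
    I = albedo * max(0, n . L); shape from shading inverts this relation
    to recover the normals n from the observed intensities I.

    normals   : (H, W, 3) array of unit surface normals
    light_dir : 3-vector pointing towards the light source
    """
    L = np.asarray(light_dir, dtype=float)
    L /= np.linalg.norm(L)                        # unit illumination direction
    I = albedo * np.tensordot(normals, L, axes=([2], [0]))
    return np.clip(I, 0.0, None)                  # back-facing points receive no light

# A patch facing an overhead light is brightest; a tilted patch is darker.
up     = np.tile([0.0, 0.0, 1.0], (4, 4, 1))
tilted = np.tile([np.sin(np.pi / 4), 0.0, np.cos(np.pi / 4)], (4, 4, 1))
print(lambertian_intensity(up, [0, 0, 1])[0, 0])      # 1.0
print(lambertian_intensity(tilted, [0, 0, 1])[0, 0])  # ~0.707
```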
b - Shape from Texture:
The basic principle behind shape from texture [5] is the distortion of the individual texels
(the portions of a texture). Their variation across the image allows estimating the shape of the

observed surface. The shape reconstruction exploits perspective distortion, which makes
objects far from the camera appear smaller, and foreshortening distortion, which makes
objects not parallel to the image plane shorter. The amount of both distortions can be
measured (shape distortion and distortion gradient) from an image.
Calculating the surface curvature at any point is far from trivial. Therefore, the surface
shape is reconstructed by calculating the surface orientation (surface normal). A map of
surface normals specifies the surface's orientation only at the points where the normals are
computed. But, assuming that the normals are dense enough and the surface is smooth, the
map can be used to reconstruct the surface shape.
A computer can guess the shape of a building using information from its brick texture, and
we can guess the shape of a pear using the dots on its surface; textures provide a powerful
shape cue to humans. The ability of computers to simulate this behavior would be useful in
many applications such as autonomous navigation, object recognition, and movie-making.
For details about shape from texture refer to section 3.5.
c - Shape from Stereo:
Stereo vision refers to the ability to acquire information on the 3D structure and distance of a scene
from two or more intensity images taken from different viewpoints [7, 8].
The stereo system determines which point in one image corresponds to which point in another
image (Correspondence Problem) [9, 10, 12]. A problem is that some parts of the scene are visible
in a subset of the images only. Therefore, a stereo system must also be able to decide which
image parts should not be matched.
After the stereo system has found pairs of corresponding image points it can start to do the
reconstruction of the scene [20, 12]. The way in which stereo determines the position in space of a
pair of image points is triangulation, that is, by measuring the difference in retinal position between
the corresponding points in the two images, known as disparity. This method requires knowledge
of the parameters of the stereo system. There are other methods that can be used if only some
or none of the system's parameters are available. Details about shape from stereo and the methods
for solving the correspondence problem will be explained in section 3.3.
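The correspondence search for a rectified stereo pair is often illustrated with window-based (area-based) matching. The sketch below is such an illustration with a fixed window and search range; it is not the adaptive algorithm developed in chapter 5.

```python
import numpy as np

def match_pixel(left, right, row, col, half_win=3, max_disp=32):
    """Area-based correspondence for a single pixel of a rectified stereo
    pair: slide a square window along the same row of the right image and
    return the disparity minimising the sum of squared differences (SSD).
    Window size and search range are fixed here for simplicity; border
    pixels (closer than half_win to an image edge) are not handled.
    """
    h = half_win
    patch = left[row - h:row + h + 1, col - h:col + h + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        c = col - d                     # candidate column in the right image
        if c - h < 0:                   # ran off the left edge: stop searching
            break
        cand = right[row - h:row + h + 1, c - h:c + h + 1].astype(float)
        cost = np.sum((patch - cand) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```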
d - Shape from Motion:
In shape from motion [6] we are interested in extracting the shape of a scene from the spatial and
temporal changes occurring in an image sequence. This technique exploits the relative motion
between camera and scene. As in the stereo technique, the process can be divided into two
sub-processes: finding correspondences between consecutive frames and reconstructing the scene.
There are, however, some important differences. The differences between consecutive frames are,
on average, much smaller than those of typical stereo pairs, because image sequences are sampled
at high rates. Unlike stereo, in motion the relative 3D displacement between the viewing camera
and the scene is not necessarily caused by a single 3D transformation.
Regarding correspondence, the fact that motion sequences provide many closely sampled frames for
analysis is an advantage. Firstly, tracking techniques, which exploit the past history of the motion to
predict disparities in the next frame, can be used. Secondly, the correspondence problem can also be
cast as the problem of estimating the apparent motion of the image brightness pattern (optical flow).
Two kinds of methods are commonly used to compute the correspondence. Differential methods use
estimates of time derivatives and therefore require closely sampled image sequences; they are
computed at each image pixel and lead to dense measurements. Matching methods use Kalman
filtering to match and track sparse image features efficiently over time; they are computed
only at a subset of image points and produce sparse measurements.
Unlike correspondence, reconstruction is more difficult in motion than in stereo. Frame-by-frame
recovery of motion and structure turns out to be more sensitive to noise. The reason is that the
baseline between consecutive frames is very small. For reconstruction we can use the motion field
of the image sequence. The motion field is the projection of the 3D velocity field on the image
plane. One way to acquire the 3D data is to determine the direction of translation through
approximate motion parallax. Afterwards, we can determine a least-squares approximation of the
rotational component of the optical flow and use it in the motion field equations to compute depth.
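As an illustration of the differential methods just described, the sketch below estimates the optical flow (u, v) of a single window from image derivatives, in the style of the Lucas-Kanade least-squares formulation; the thesis does not prescribe this particular estimator.

```python
import numpy as np

def flow_in_window(Ix, Iy, It):
    """Differential optical-flow estimate for one image window: a
    least-squares solution of the brightness-constancy constraint
    Ix*u + Iy*v + It = 0 over all pixels in the window.

    Ix, Iy : spatial image derivatives inside the window
    It     : temporal derivative between two consecutive frames
    """
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)    # one row per pixel
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)    # solve A [u v]^T = b
    return u, v
```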

1.4. Objectives of the Thesis
In Distributed GIS we are concerned with the geographical data in both 2D and 3D. The key
factors in distributed geographical data (2D and 3D) are time and accuracy, so the objectives
of our thesis are:
1 - Make the geographical data of distributed GIS available on the web in 2D or 3D, with a
database store of the images (kept in pairs to be used in the enhanced shape from stereo
technique). We are going to use the enhanced shape from stereo technique to achieve this
goal.
2 - Distribute geographical data on the web with a minimum amount of processing time
(the time of creating 3D maps from 2D images). The creation of 3D maps using the enhanced
shape from stereo depends entirely on a process called correspondence, or matching.
We are going to use a new approach to solve the matching or correspondence
problem, called the adaptive window approach. This approach reduces the
search time by searching for the matched objects in a specified window whose size is
calculated from the image data. The word adaptive means that the size of the window
changes according to the image data. This is a great step in reducing the waiting time
(the time the user needs to wait until his request is processed). The waiting time is the
sum of the time needed for the data to be transferred over the network plus the
processing time.
3 - Reduce the errors of creating 3D maps from 2D images, i.e. create 3D maps with high
accuracy. When the enhanced shape from stereo is used, the correspondence process may
include errors such as the mismatching error and the one-to-many assignment error.
One-to-many assignment occurs when we assign a single feature in one image to multiple
features in the other image; this happens when the feature has many similar candidate
features to match with, which confuses the system. We are going to eliminate such errors
by using the adaptive window approach.
4 - In the traditional shape from stereo, the correspondence or matching problem is solved
by searching for the matched feature in the entire image. This increases time and reduces
accuracy. Instead, we use the enhanced shape from stereo technique, in which the
correspondence problem is solved using the adaptive window technique: we search for the
required feature within a specified window. This reduces time and errors. The window is
adaptive, in other words its size changes according to the information in the image; a
simplified sketch of this idea is given below.
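The following minimal sketch makes the adaptive idea concrete: it picks a window size per pixel from the local image content. Local intensity variance is used only as a simple stand-in for the wavelet-edge measure developed in chapter 5, and the candidate sizes and threshold are illustrative assumptions.

```python
import numpy as np

def adaptive_window_size(image, row, col, sizes=(5, 9, 15), var_threshold=200.0):
    """Pick the smallest window around (row, col) that carries enough
    information to disambiguate a match; flat regions fall back to a
    larger window. Variance stands in for the wavelet-edge measure of
    chapter 5; the sizes and threshold are illustrative.
    """
    for s in sizes:
        h = s // 2
        patch = image[max(row - h, 0):row + h + 1,
                      max(col - h, 0):col + h + 1].astype(float)
        if patch.var() >= var_threshold:    # textured enough: small window suffices
            return s
    return sizes[-1]                        # flat region: use the largest window
```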
1.5. The Thesis Outline
In distributed GIS systems, we need to convert 2D images into 3D models or maps. The
background theories of distributed GIS, image processing, and computer graphics must
first be provided. The outline of our thesis is shown in Figure 1.8.
Chapter 2: Introduces the background theories related to Distributed GIS on the web. A
brief introduction to GIS will be provided, followed by the raster and vector representations
of GIS data models. We will also show how vector and raster data are represented in
computer graphics. The methods and algorithms for making GIS data available on
computer networks will be introduced. We will show how the Virtual Reality, Internet, and
GIS fields combine to make a highly functional Distributed Virtual Reality application. At the
end of this chapter, we will introduce the applications that use Distributed GIS systems for
decision support.
Chapter 3: This chapter explores the techniques for extracting image features. The
image segmentation techniques will be introduced: segmentation by thresholding,
histogram segmentation, and edge/line segmentation. We will explain how we can identify
the objects of an image. The cues for extracting the third dimension from a 2D image will
be explained; these cues define how the human eye extracts the third dimension of a view.
Stereo images will be introduced, showing the methods for solving the correspondence
problem. Shape from shading will be explained, showing how to extract the third
dimension using the light patterns in the image. Finally, we will introduce how to extract
the third dimension using the texture of the object.
Figure 1.8: The thesis outline. Distributed GIS (chapter 2) is supported by 2D GIS image
processing (chapter 3) and 3D GIS computer graphics (chapter 4), leading to our proposed
system for converting 2D to 3D (chapter 5).
Chapter 4: This chapter introduces the background theories of computer graphics. The anti-
aliasing strategies will be introduced, showing how the effect of aliasing can be removed
using the pre-filtering and the super-sampling methods. A comparative study between
vector and raster graphics will be introduced. The geometric model will be explained.
Finally, the rendering techniques will be introduced: the projection and rasterization
methods, how to determine the visibility of an object, and the effects of lights, shadows,
and materials on rendering the 3D scene.
Chapter 5: This chapter presents our proposed enhanced shape from stereo system. First we
introduce a summary of the stages of our system. Then we will show the results of applying
our system to a pair of stereo images; the results of every stage will be shown in detail.
We will introduce the results of applying our system to different stereo image pairs, and we
will make a comparative study between the traditional shape from stereo and our proposed
enhanced shape from stereo system.
Chapter 6: In this chapter we will introduce our conclusions and future work.
CHAPTER 2
DISTRIBUTED GEOGRAPHICAL INFORMATION
SYSTEMS ON THE WEB
This chapter explores the background theories and new techniques related to Distributed
GIS on the web. In section 2.1 we give a brief introduction to GIS. The old and the
new definitions of GIS will be explored. Then we will give a comparative study between
the vector and raster representations of GIS data layers and the methods for representing
them in computer graphics. The advantages and disadvantages of both vector and raster
representations of GIS data layers will be introduced. In section 2.2 we will explore the
methods and algorithms for making GIS data available on computer networks and the
web. In section 2.3 we will introduce how Virtual Reality, the Internet, and GIS
interconnect to make a high-performance Distributed Virtual Reality application. In section
2.4, a brief description of some of the specific applications that use Distributed GIS for
decision support will be introduced. Section 2.5 gives a summary of this chapter.
2.1. Geographical Information Systems

Geographic information is information about places on the earth's surface. Many
technologies deal with this information, such as the Global Positioning System (GPS),
which is mainly used to determine the position of an object on the earth; Remote Sensing
(RS), which is used to collect remote information about the earth; and finally Geographic
Information Systems [22].
People define the "S" in GIS in many different ways.
In the 1980s, GIS was defined as Geographic Information Systems (GISy): technology for
the acquisition and management of spatial information, and software for professional users,
e.g. cartographers.
In the 1990s, GIS was defined as Geographic Information Science (GISc): the science (or
theory and concepts) behind the development, use, and application of geographic
information systems (GISy).
Also in the 1990s, GIS was defined as Geographic Information Studies: understanding the
social, legal, and ethical issues associated with the application of GISy and GISc.
In the 2000s, GIS is defined as Geographic Information Services: web sites and service
centers for casual users, e.g. travelers using services (such as GPS or MapQuest) for route
planning [22].
Older definitions of GIS
The common ground between information processing and the many fields using
spatial analysis techniques.
A powerful set of tools for collecting, storing, retrieving, transforming, and
displaying spatial data from the real world.
A computerized database management system for the capture, storage, retrieval,
analysis and display of spatial (defined by location) data.
A decision support system involving the integration of spatially referenced data in a
problem solving environment.
We can now define GIS as "a system of hardware, software, and procedures designed to
support the capture, management, manipulation, analysis, modeling, and display of
spatially referenced data (located on the earth's surface) for solving complex planning and
management problems".
Examples of GIS Applications
Urban Planning, Management (Land acquisition, Economic development, Housing
renovation programs, Emergency response, Crime analysis).
Environmental Sciences (Monitoring environmental risk, Modeling storm water
runoff, Management of watersheds, floodplains, wetlands, forests, Environmental
Impact Analysis).
Political Science (Analysis of election results, Predictive modeling)
Civil Engineering/Utility (Locating underground facilities, Coordination of
infrastructure maintenance)
Business (Demographic Analysis, Market Penetration/ Share Analysis)
2.1.1 GIS Data models
The simple definition of GIS is "a map with data behind it". The purpose is to allow
geographic features in real-world locations to be digitally represented and stored in a
database, so that they can be abstractly presented in map form and can also be worked with
and manipulated to address some problem. The GIS data model is based on data layers, or
themes, Figure 2.1 [22, 23].
Figure 2.1: GIS data layers or themes
Data is organized by layers, or themes, with each theme representing a common feature.
Layers are integrated using explicit locations on the earth's surface; thus geographic location
is the organizing principle. Examples of layers or themes are roads, hydrology (water), and
topography (land elevation).
Layers comprise two data types:
1 - Spatial data, which describes location (where); it can be stored in a shape file in
ArcView, for example.
2 - Attribute data, specifying what, how much, and when; it can be stored in a database table.

GIS systems traditionally maintain spatial and attribute data separately, then join them for
display or analysis (for example, in ArcView). The spatial data of a layer can be represented
in two ways:
1 - in raster (image) format, as pixels;
2 - in vector format, as points, lines, and areas (the PLA model). See Figure 2.2.
Figure 2.2: Layers representation
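The join of the two data types can be pictured with a minimal sketch; the feature identifier, geometry, and attribute values below are illustrative, not taken from any particular GIS.

```python
# Spatial data (where) and attribute data (what) are kept in separate stores
# and joined on a shared feature identifier for display or analysis.
# All identifiers and values here are illustrative.
spatial = {
    101: ("polygon", [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]),
}
attributes = {
    101: {"landuse": "residential", "area_ha": 0.8},
}

feature_id = 101
kind, geometry = spatial[feature_id]        # the spatial side of the join
info = attributes[feature_id]               # the attribute side of the join
print(feature_id, kind, info["landuse"])    # 101 polygon residential
```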
2.1.1.1 - Raster representation of data
Raster is a method for the storage, processing and display of spatial data [22, 23]. Each area
is divided into rows and columns, which forms a regular grid structure. Each cell must be
rectangular in shape, but not necessarily square. Each cell within this matrix contains
location co-ordinates as well as an attribute value. The spatial location of each cell is
implicitly contained within the ordering of the matrix, unlike a vector structure which stores
topology explicitly. Areas containing the same attribute value are recognized as such;
however, raster structures cannot identify the boundaries of such areas as polygons.
Raster data is an abstraction of the real world where spatial data is expressed as a matrix of
cells or pixels (see Figure 2.3), with spatial position implicit in the ordering of the pixels.
With the raster data model, spatial data is not continuous but divided into discrete units.
This makes raster data particularly suitable for certain types of spatial operation, for
example overlays or area calculations. Raster structures may lead to increased storage in
certain situations, since they store each cell in the matrix regardless of whether it is a feature
or simply 'empty' space. The word pixel is a contraction of "picture element" and is
commonly used in remote sensing to describe each unit in an image; in raster GIS the pixel
equivalent is usually referred to as a cell element or grid cell. Pixel/cell refers to the smallest unit of
information available in an image or raster map. This is the smallest element of a display
device that can be independently assigned attributes such as color. Raster data structures can
be classified into exhaustive representation and run-length encoding.
a - Exhaustive representation:
In this data structure every pixel is given a single value (Figure 2.3); hence there is no
compression when many similar values are encountered [23].
Figure 2.3: Exhaustive representation
b - Run-length encoding

It is a raster image compression technique [23] (Figure 2.4). If a raster contains groups of
cells with identical values, run-length encoding can compress storage. Instead of storing
each cell, each run stores a value and a count of cells with that value. If a run contains only
one cell the storage doubles, but for three or more cells there is a reduction. The longer and
more frequent the runs of consecutive identical values are, the greater the compression
achieved. This technique is particularly useful for encoding monochrome or binary images.
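A minimal sketch of the encoding for a single raster row (pure illustration; real raster formats add headers and per-row indexing):

```python
def run_length_encode(cells):
    """Run-length encode one raster row: store (value, count) pairs
    instead of every cell."""
    runs = []
    for value in cells:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1                # extend the current run
        else:
            runs.append([value, 1])         # start a new run
    return [tuple(r) for r in runs]

# A row of 8 cells compresses to 3 (value, count) pairs:
print(run_length_encode([0, 0, 0, 1, 1, 0, 0, 0]))   # [(0, 3), (1, 2), (0, 3)]
```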
Figure 2.4: Run-length encoding
2.1.1.2 - Vector representation of data
Vector is a data structure used to store spatial data [22, 23]. Vector data comprises
lines or arcs, defined by beginning and end points, which meet at nodes. The locations of
these nodes and the topological structure are usually stored explicitly.
Features are defined by their boundaries only, and curved lines are represented as a series of
connecting arcs. Vector storage involves the storage of explicit topology, which raises
overheads; however, it only stores the points which define a feature, and all space outside
these features is 'non-existent'.
A vector-based GIS is defined by the vector representation of its geographic data. According
to the characteristics of this data model, geographic objects are explicitly represented and
the thematic aspects are associated with the spatial characteristics.
There are different ways of organizing this double database (spatial and thematic). Usually,
vector-based systems are composed of two components: one that manages spatial data
and one that manages thematic data.
This is the so-called hybrid organization system, as it links a relational database for the
attributes with a topological one for the spatial data. A key element in these kinds of
systems is the identifier of every object; this identifier is unique for each object and allows
the system to connect both databases.
Figure 2.5: Vector representation
In the vector-based model (Figure 2.5), geospatial data is represented in the form of
co-ordinates. In vector data, the basic units of spatial information are points, lines, arcs, and
polygons. Each of these units is composed simply as a series of one or more co-ordinate
points; for example, a line is a collection of related points, and a polygon is a collection of
related lines.
Co-ordinates
The coordinates are pairs of numbers expressing horizontal distances along orthogonal
axes, or triplets of numbers measuring horizontal and vertical distances, or n-numbers
along n-axes expressing a precise location in n-dimensional space. Co-ordinates
generally represent locations on the earth's surface relative to other locations.
Point
A point is a zero-dimensional abstraction of an object represented by a single X,Y co-
ordinate. A point normally represents a geographic feature too small to be displayed as a
line or area; for example, the location of a building location on a small-scale map, or the
location of a service cover on a medium scale map.
Line
A line is a set of ordered co-ordinates that represent the shape of geographic features too
narrow to be displayed as an area at the given scale (contours, street centerlines, or
streams), or linear features with no area (county boundary lines). A line is synonymous
with an arc.

Arc
The Arc is an ARC/INFO term that is used synonymously with line.
Polygon
The polygon is a feature used to represent areas. A polygon is defined by the lines that
make up its boundary and a point inside its boundary for identification. Polygons have
attributes that describe the geographic feature they represent.
There are different models to store and manage vector information. Each of them has
different advantages and disadvantages.
a - List of coordinates "spaghetti"
It is a simple and easy-to-manage vector model [23] (Figure 2.6). It involves a lot of
duplication, hence a need for large storage space, and it has no topology. It is very often
used in CAC (computer-assisted cartography).
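A minimal sketch of the spaghetti model with illustrative coordinates; note the duplicated shared edge and the absence of any stored topology:

```python
# Each polygon is stored as its own closed list of (x, y) co-ordinates.
# The edge shared by parcels A and B is stored twice, and nothing records
# that the two parcels are adjacent (no topology).
parcels = {
    "A": [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)],
    "B": [(4, 0), (8, 0), (8, 3), (4, 3), (4, 0)],   # duplicates edge (4,0)-(4,3)
}
```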
Figure 2.6: List of coordinates "spaghetti"
b- Dual Independent Map Encoding (DIME)
It is a model developed by the US Bureau of the Census [23] (Figure 2.7). It contains nodes
(intersections of lines) which are identified with codes, and it assigns a directional code in
the form of a "from node" and a "to node".
Figure 2.7: Dual Independent Map Encoding (DIME) format
Table 2.1 lists the advantages and disadvantages of the raster and vector data models.

                                 raster    vector
  precision in graphics            -         +
  traditional cartography          -         +
  data volume                      -         +
  topology                         -         +
  computation                      +         -
  update                           -         +
  continuous space                 +         -
  integration                      +         -
  discontinuous space              -         +

Table 2.1: Advantages and disadvantages of raster and vector data models ("+" marks the
stronger model for each criterion)

2.2. Network Geospatial Information Systems


The amount of digital geospatial data available is rapidly growing. In particular, there is a
vast amount of data from earth observation satellites, and next-generation satellites are
expected to produce terabytes of data per day. This presents a challenge for the development
of computer systems that enable the storage and management of these huge data sets in online
data archives or digital libraries. Ideally, such a system would provide efficient, on-demand
remote access to these data sets over the Internet (or an intranet), so that authorized users
could easily access and utilize the data for a variety of Geographic Information Systems
(GIS) applications, including decision support, research and other analysis [1].
2.2.1 Real-Time GIS on Computer Networks.
For a number of GIS applications, such as those requiring real-time or interactive analysis
of large data products such as satellite imagery, the processing requirements are large
enough that high-performance computer servers are required. This leads to the concept of an
"active" digital library, where the server provides not only services for querying and
downloading of data from the library, but also services for processing the data before
downloading [1].
This approach is particularly useful if the amount of data to be processed is very large, for
example multiple channels of a satellite image, but the final result is relatively small, for
example a processed satellite image for a localized area, or perhaps just a few numbers such
as average sea temperature or percentage cloud cover or some correlation coefficients. If the
data is obtained from the server using a wide-area, relatively low-bandwidth network, it will
be more efficient if the user only has to download the final results rather than download the
large input data set and process it locally. Many decision support applications that
manipulate spatial data involve operations on very large data sets, but carry out data
reduction operations to provide summarized information to the end user. Some of the data
sets may be remotely accessed from different servers, possibly over wide-area networks,
and the processing may be done on yet another machine, possibly a high-performance
computer or supercomputer. We have investigated the consequences of connecting together
resources for fast mass storage and high-performance computing with broadband networks,
whereas the user's client computer may only have modest network capabilities, such as a
modem link via the World Wide Web.
2.2.2. Bridging the Gap between GIS and the WWW
Geographic Information Systems (GIS) such as Arc/Info and GRASS can contain rich and
diverse information such as census data, business data, highway networks, and geologic
data. Information is stored in formats including tables, images, and text. Typical GIS
software maintains, manipulates, and displays this information in dynamic and graphical
ways. The traditional mode of operation is to have the data accessible from within a specific
GIS application on a single platform [24, 1].
With the recent widespread acceptance and availability of World Wide Web (WWW)
technology, it is natural to look toward building linkages between the two distinct worlds of
GIS and the WWW in order to make the wealth of information available in the GIS world
publicly available through the WWW.
Internet-based information systems have seen a tremendous growth in the past few years,
e.g., FTP (File Transfer Protocol), Gopher, WAIS (Wide Area Information Servers), and
WWW (World Wide Web). The WWW is gaining wide acceptance due in part to its flexible
capabilities for transferring multi-media information and its integration with many existing,
successful network tools. WWW employs a client-server computing paradigm in which the
client requests the information and the server provides the data in multimedia and hypermedia
formats such as text, audio, video, and links to other information sites. Today, almost all
information on the Internet is accessible from the WWW.
Due to the very nature of GIS, which requires specialized operations on stored data,
information in a GIS is not readily available to the WWW community. WWW servers and
browsers today use the hyper-text transfer protocol (HTTP) and the hyper-text markup
language (HTML) which operate on text and a relatively small set of pre-defined types of
images. HTML in general does not support the features required by a GIS, nor is it
appropriate for them, given that HTML is a hypertext language.

A number of projects have focused on making GIS information readily available for WWW
users in order to close the gap between the two communities. BADGER (Bay Area Digital
GeoResource) is a project that provides geographic information in the San Francisco Bay
area in digital format accessible through the WWW. BADGER generates a sequence of
digital maps of different details, and users can view or download the images as needed. The
U.S. government has made efforts to standardize catalog repositories of datasets so that
users can browse, evaluate, and order them efficiently. As part of this effort, the Federal
Geographic Data Committee (FGDC) has published its geospatial metadata standard. At the
state level, similar projects are underway. For example, the Texas Natural Resources
Information System (TNRIS) has made available on the WWW a library of remotely sensed
data, a set of topographic maps, and other information describing land use, vegetation, and
hydrology. The Texas General Land Office lists land and minerals information, as well as
environmental and economic information on the U.S./Mexico border.
2.3. Virtual Reality and GIS
The concept of merging Geographic Information Systems and Virtual Reality into Virtual
Geographic Information Systems (VGIS) has enjoyed increasing attention in the recent past
[25]. Virtual GIS has been defined as a highly integrated, efficient real time 3D GIS for
visualizing geographic data. In the journey from a 2D map to a more interactive 3D, GIS
has no doubt served the user community well, but there is an increasing demand for better
data handling and visualization using recent developments such as Virtual Reality. On
the other hand, Virtual Reality developers are also looking for potential applications.
The 3D GIS, which has come into play with decidedly more interactive rates for high
resolution display than a 2D GIS, has large amounts of data that can be expected to grow by
a factor of 100 (3D textures, photo textures, etc.). This makes visualization of data more
difficult; the rendering algorithms have to be optimized to load only data that is actually
visible. Due to the performance demands, the increased complexity and hierarchical
organization of data, relational databases are no longer suitable to manage a 3D GIS
database. VGIS manages its huge, complex terrain and GIS data sets at real time rates in an
efficient manner by using hierarchical spatial data structures. Virtual Reality adds an
important freedom for the user to visualize and interpret spatial data more effectively. A real
time visual simulation in VGIS supports the accurate depiction of terrain elevation and
imagery, in addition to features such as ground cover and trees, buildings, and static objects,
roads, and atmospheric effects, thus adding new dimensions to the concept of simulation of
real life situations. It adds a new dimension in the visualization of abstract variables (e.g.,
environmental variables such as pollution level) by reducing the level of abstraction. Virtual
Reality improves the communication of ideas and concepts in a collaborative process. In the
GIS realm, the goal is to support users who are overwhelmingly map illiterate. Here,
VRGIS acts as a mediator and transmitter of ideas between participants. Thus, the marriage
of the two technologies promises much-needed relief for the demanding user. Researchers
around the world, in both academia and industry, are therefore working to integrate the two
technologies.
2.3.1 Distributed Computing and Interoperability
The principle of distributed computing is to make more than one machine work on the same
problem simultaneously [25], thereby reaching the desired result faster and more reliably
than if relying on a single machine to produce the result. Some of the prerequisites that must
be met are:
- The individual computers must be interconnected, preferably with high-speed networks.
- There must exist an infrastructure to handle distribution and communication.
- It must not be overly complicated for programmers to use the distributed system.
Interoperability is the idea that different developers, working almost entirely independently,
can contribute software components to a common, quality-assured collection (e.g.
repositories), and those components can be easily obtained from this collection and easily
combined into larger assemblies using a variety of interconnection mechanisms. The costs,
barriers, and risks of interoperability must take into account:
- Large and multi-source data sets
- Performance
- Remote collaboration
What these technologies promise in a GIS context is that they will allow GIS users to build
applications that integrate software components from different developers and from
different places. The Open GIS Consortium, a consortium of GIS vendors, agencies and
academic institutions, has emerged as a major force in the trend toward openness. In GIS
interoperability, the common theme has always been simplicity, transparency and similarity.
2.3.2 Distributed Virtual Reality
The idea behind distributed VR is very simple; a simulated world runs not on one computer
system, but on several. The computers are connected over a network (possibly the global
Internet), and people using those computers are able to interact in real time, sharing the
same virtual world. Figure 2.8 shows the integration between GIS, Internet and Virtual
Reality [25, 3].
In theory, people can be sitting at home in London, Paris, and New York, all interacting in
a meaningful way in VR. Each user in a persistent collaborative virtual environment is
represented by an avatar, so that users at other sites in the same virtual environment know
where they are and what they are looking at. A collaborative VR application sends tracker
information of the user in the VR across the network; at the same time, it receives tracker
information of other users in the same VR and displays the avatars at the right translation
and rotation in the environment.
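The tracker-state exchange described above can be pictured as a small, regularly sent
message. The following Python sketch is only an illustration; the field names and the use of
JSON are our own assumptions, not a standard protocol:

import json, time

def tracker_message(user_id, position, rotation):
    # One hypothetical tracker update: where a user's avatar is and
    # which way it faces; field names are illustrative, not a standard.
    return json.dumps({
        "user": user_id,
        "t": time.time(),
        "position": position,   # (x, y, z) translation in the virtual world
        "rotation": rotation,   # (yaw, pitch, roll) in degrees
    })

# Example: one user's avatar state, broadcast to the other sites.
msg = tracker_message("user-london-01", (12.0, 0.0, -3.5), (90.0, 0.0, 0.0))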













Figure 2.8: VR, Internet, GIS and their integration
2.4. Distributed GIS for Decision Support
Here we present a brief description of some of the specific applications that use Distributed
GIS systems for Decision Support [1].
2.4.1 Land Management and Crop Yield Forecasting
As well as providing localized crop yield forecasting for an individual farm, there is a
demand for crop yield forecasting on a regional and national level. This requires the use of
large amounts of disparate data that may be distributed over many sources. If this data is
made available online, it can be integrated by a distributed GIS, which can also provide
support for compute servers and for interfacing to the software that performs the crop yield
prediction.
A distributed GIS has obvious applications to large-scale land care studies and
environmental monitoring, which also require large amounts of disparate geospatial data,
including satellite data. For example, a researcher may want to combine vegetation index
data obtained from NOAA satellite images with rainfall data obtained from weather
organization data and GMS-5 meteorological satellite images, in order to correlate rainfall
with vegetation growth, to provide better understanding of the environment and better
models for predicting crop yields, or providing localized predictions of rainfall and frost.
2.4.2 Defense Organizations
Many intelligence products originate from satellite or aerial reconnaissance flights and are
archived using various technologies including digital data systems. Various organizations
within a government's defense forces may collect and archive their own data and may make
it available to each other. Sources may vary in quality and type considerably. Some
organizations may work entirely within an in-house customized system, but more frequently
economic and interoperability reasons require various value-adding relationships to be set up
across the whole defense force to share data, to derive intelligence products from multiple
distributed sources, and to 'on-sell' derived products for decision support. Human analysts may
be working as end-users or as value-adders in the system itself, combining data products
be working as end-users or as value-adders in the system itself, combining data products
into decision support material for those processes that are not yet automatable. The defense
community has some additional constraints but overall the model is very similar to that for
commercial value-adding of land resource data.
2.4.3 Emergency Services
There are two basic scenarios for the use of GIS by emergency services. The first is for
emergency planning purposes, to investigate measures for responding to emergencies such
as fire or flood. These systems allow the user to simulate possible situations and their
outcomes. For example, firefighters might run simulations of what would happen if a fire
was started in specified areas under certain conditions, in order to identify danger areas
where fuel loads should be reduced with cool burns, and under what conditions these
managed burns would be safe. Emergency services might run simulations of evacuation
procedures in the event of a flood or fire.
The second scenario is decision support during an actual emergency, which provides a much
greater challenge to a distributed GIS. As with emergency planning, the GIS will draw on a
wide variety of existing geospatial data, such as the positions of populated areas and houses,
road networks, fire stations and water hoses, etc. However in this case, it will also require
access to real-time data that may be rapidly changing. For example, a GIS for decision
support during a bushfire would ideally have real-time data feeds coming from many
different sources, including the positions of firefighters, police and emergency services
personnel in the field, information about the position of the firefronts, which can be
transmitted from helicopters with GPS units, and current and predicted weather information
such as temperature and wind speed.
A Web-based distributed GIS could be accessed in the emergency response headquarters, as
well as by crews in the field using a laptop connected by a cellular modem. The GIS could
provide up-to-the-minute information on the situation status, and provide decision support
information such as the optimal routing of fire trucks or evacuations, and predicting the path
of the fire using simulation.
2.5. Summary
GIS data models are based on data layers or themes. Each layer can be represented in two
ways, raster and vector representations. The raster represents the layer as a collection of
pixels, while the vector represents each layer as a collection of points, lines, arcs, and
polygons. There are two common methods of raster representation, the exhaustive
representation and the run-length encoding. In both representations we find that the raster
representation needs more storage space than vector representation. The vector
representation can be done using two methods, the list of coordinates and the dual-
independent map encoding. A GIS on the network needs storage for the data, a server, and
an archive for caching requested data. The most important factors in distributed GIS
systems are time and accuracy. GIS, the Internet, and VR can be interconnected to make a
high-performance VRGIS application. There are many applications for DGIS in decision
support systems, for example land management and crop yield forecasting, defense
organizations, and emergency services.
CHAPTER 3
IMAGE PROCESSING
This chapter explores the image processing methods and techniques for extracting the image
features. In section 3.1 the image segmentation techniques will be introduced. The
techniques which will be explored are image segmentation by thresholding, histogram
segmentation, and edge/line segmentation. Then we will
introduce how to define the objects of the image. Section 3.2 introduces the cues of
extracting the third dimension of 2D images. These cues define how the human eye extracts
the third dimension of the view. Section 3.3 introduces the shape from stereo images and
the main problems of the stereo images; they are the correspondence and the reconstruction
problems. Then we will introduce the methods for solving the correspondence problem. The
image features will be explored at the end of this section. Section 3.4 introduces the shape
from shading showing how we can use the light patterns of 2D images to extract the third
dimension. Section 3.5 introduces the shape from texture showing how the texture of an
object can define its shape. Section 3.6 gives a summary of this chapter.
3.1. Extracting Image Features by Image Segmentation
In the analysis of the objects in images it is essential that we can distinguish, for example
between the objects of interest and the background. The techniques that are used to find the
objects of interest are usually referred to as segmentation techniques - segmenting the
foreground from background. We will introduce segmentation using thresholding, then
histogram segmentation, and finally how to obtain line and edge segments of the image
using an edge detector [12, 21, 26, 27].
3.1.1 Image Segmentation by Thresholding
Thresholding essentially involves turning a color or greyscale image into a 1-bit binary
image [26, 27]. This is done by allocating every pixel in the image either black or white,
depending on its value, Figure 3.1. The pivotal value that is used to decide whether any
given pixel is to be black or white is the threshold.




So, why do this? The principal reason is to segment the image. As the name implies,
segmentation tries to split a given image up into segments. Thresholding is the simplest
form of segmentation.
Often, we want to threshold images to gain a better understanding of the shape or objects
within a scene. Many machine vision techniques require a binary image as input.
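As a minimal sketch of this operation (assuming an 8-bit greyscale image stored as a NumPy
array; the threshold value 128 is an arbitrary illustration):

import numpy as np

def threshold(image, t=128):
    # Allocate every pixel black (0) or white (1) depending on its value;
    # t is the pivotal threshold value described above.
    return (image >= t).astype(np.uint8)

# Illustrative usage on a synthetic greyscale image.
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
binary = threshold(img)   # a 1-bit binary image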
3.1.2 Histogram Segmentation
A histogram is one of the simplest methods of analyzing an image. An image histogram
maintains a count of the frequency for a given color level [26, 27].
When graphed, a histogram can provide a good representation of the color spread of the
image. Histograms can also be used to equalize the image as well as providing a large
number of statistics about it. Figure 3.2 shows an example histogram for the grayscale
version of an aircraft image.
Note how the majority of the colors seem to lie between about 80 and 120, which
corresponds to the dark grey background of the image.
The histogram segmentation divides the histogram of the image into a specified number of
groups; each group contains the pixels with relatively similar intensity values.
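A minimal sketch of this grouping (assuming the group boundaries are given as explicit
threshold values; the values below are illustrative):

import numpy as np

def histogram_segmentation(image, thresholds):
    # Label each pixel with the index of the intensity interval it
    # falls in; len(thresholds)+1 groups result.
    return np.digitize(image, bins=np.asarray(thresholds))

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
groups = histogram_segmentation(img, thresholds=(64, 128, 192))  # 4 groups, labels 0..3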

Figure 3.1: An image of a coin and the same image thresholded
Figure 3.2: Su47 Aircraft image and its histogram
Figure 3.3: Su47 Aircraft image histogram after applying histogram segmentation
Figure 3.4: The four segmented region images produced by the histogram segmentation







We are going to show the histogram segmentation by applying a 4-group segmentation to
the aircraft image shown in Figure 3.2; the result is shown in Figure 3.3.







The four groups represent four segmented images; the segments that appear here are
regions, Figure 3.4.



3.1.3 Edge/Line Detection
This section will introduce the edge detection by looking at the simple (but effective) Sobel
edge detector [26, 27, 25]. Obviously, the first thing we need to ask is "What exactly is an
edge?" If we look at the sample picture shown in Figure 3.5 (left), it is quite easy to figure
out. In a picture, an edge is normally defined as an abrupt change in color intensity. Of
course, humans use a much more complicated method to find edges (so complex, we don't
know how it works yet). This is because we have two eyes (therefore stereoscopic vision
and depth perception) as well as our incredible inference skills (we can "see" the grey
square, Figure 3.5 (left), despite it being obscured by the circle). Despite this, most
computer vision systems must make do with one (normally grayscale) camera, so change in
color intensity is the next best thing. So, first let us look at the Sobel Edge
Detector. The Sobel Edge Detector uses a simple convolution mask to create a series of
gradient magnitudes; it uses two convolution masks, one to detect changes in vertical
contrast (hx) and another to detect horizontal contrast (hy).
h_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}, \qquad
h_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \qquad (3.1)

Figure 3.5 shows what the Sobel masks do to a simple illustrative picture








This data can now be represented as a vector (the gradient vector). The two gradients
computed using h_x and h_y can be regarded as the x and y components of the vector.
Therefore we have a gradient magnitude and direction:
Figure 3.5: Left, the original image. Right, the image after applying the Sobel edge
detector
G_v = \begin{pmatrix} g_x \\ g_y \end{pmatrix}, \qquad
G_m = \sqrt{g_x^2 + g_y^2}, \qquad
\theta = \tan^{-1}\left( \frac{g_y}{g_x} \right) \qquad (3.2)
Where G_v is the gradient vector, G_m is the gradient magnitude and θ is the gradient
direction. Keen programmers will notice that it is often more efficient to approximate the
magnitude by adding the absolute values of the two gradients; this is indeed what many
implementations of the Sobel detector do. We can see that when g_x and g_y are large (big
changes in the vertical and horizontal orientation respectively), the gradient magnitude will
also be large. It is the gradient magnitudes that are finally plotted. Notice how the intensity
of the edge between the circle and the square is less than the intensity between the circle
and the background.
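A short sketch of the detector using the masks of equation (3.1), written with NumPy and
SciPy's convolution (the sign convention of the responses does not affect the magnitude):

import numpy as np
from scipy.ndimage import convolve

# The Sobel masks of equation (3.1).
HX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
HY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def sobel(image):
    # Gradient responses, then magnitude and direction as in (3.2).
    img = image.astype(float)
    gx = convolve(img, HX)
    gy = convolve(img, HY)
    gm = np.sqrt(gx**2 + gy**2)     # gradient magnitude (what is plotted)
    theta = np.arctan2(gy, gx)      # gradient direction
    return gm, theta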

3.1.4 Identifying Objects of the Image by Labeling
Connected-component labeling is a method for identifying each object in a binary image. A
connected component is a set of pixels in which each pixel is connected to all other pixels.
The connectivity depends on the neighborhood of the pixels of the component; there are two
types of pixel neighborhood, the 4-connected neighborhood and the 8-connected
neighborhood. Figure 3.6 shows both types [26, 27].




A component labeling procedure finds all connected components in an image and assigns a
unique label to all points in the same component. This process returns a matrix, called a
label matrix, shown in Figure 3.7. A label matrix is an image with the same size as the input
image, in which the objects in the input image are distinguished by different integer values.
Figure 3.6: 4-connected and 8-connected neighborhoods





After getting the label matrix the objects can be processed; i.e., we can get all the properties
that can help us in our application, such as the area, centroid, perimeter, aspect ratio and
orientation.
The area of an object can be measured by the number of pixels in the object. The centroid is
the position of the middle pixel of the object. The perimeter is the length of the curve that
surrounds the object. The aspect ratio is the ratio between the height and the width of an
object. The orientation is the slope of the axis that divides the object most symmetrically.
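A minimal sketch of labeling and two of these measurements, using SciPy's
connected-component labeling with an 8-connected neighborhood (the toy image is
illustrative):

import numpy as np
from scipy import ndimage

binary = np.zeros((8, 8), dtype=np.uint8)   # a toy binary image
binary[1:3, 1:4] = 1                        # object 1
binary[5:7, 5:7] = 1                        # object 2

# ones((3, 3)) selects the 8-connected neighborhood of Figure 3.6.
labels, count = ndimage.label(binary, structure=np.ones((3, 3)))

for obj in range(1, count + 1):
    area = int((labels == obj).sum())                  # pixel count
    centroid = ndimage.center_of_mass(binary, labels, obj)
    print(obj, area, centroid)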
3.2. Depth Perception of 2D Images
Depth perception is based on 10 cues. These cues contain information which, when added to
the 2D image projected onto the retina, allow us to relate the objects of the image to 3D
space, so we can easily define which object is near and which is far away.
There are four physiological and six psychological cues [28].

3.2.1. The Four Physiological Cues
1. Accommodation, the adjustment of the focal length of the lens; this happens as a distant
object moves nearer to us, Figure 3.8.

Figure 3.7: (Left) Original image with objects in black. (Right) the labeling matrix










2. Convergence, the angle made by the two viewing axes of a pair of eyes, Figure 3.9;










Accommodation and convergence are associated with the eye muscles, and interact with each
other in depth perception. Accommodation is considered a monocular depth cue since it is
available even when we see with a single eye. This cue is effective only when combined
with other binocular cues, and for a viewing distance of less than two meters (Okoshi,
1976). Accommodation and convergence are considered to be minor cues in depth
perception.
Figure 3.8: Accommodation, the adjustment of focal length
Figure 3.9: Convergence, the angle made by the two viewing axes of a pair of eyes
3. Binocular Disparity. Binocular disparity is considered the most important depth
perception cue over medium viewing distances. It is the difference between the images of
the same object projected onto each retina, Figure 3.10. Computer systems simulate this cue
using stereo images. Stereo images raise an important problem called the correspondence
problem; after solving it we can obtain the disparity. More information about stereo images
will be given in section 3.3.









The degree of disparity between the two images depends on the parallactic (convergence)
angle. This is the angle formed by the optical axes of each eye converging on an object. The
parallactic angle is related to the distance of an object from the eyes.
At great distances the parallactic angle decreases and depth perception becomes
increasingly difficult. The smallest parallactic angle the average person is able to discern is
three arc seconds.
4. Motion parallax, the result of changing positions of an object in space due to either the
motion of the object, or of the viewer's head. Visual motion parallax is a function of the rate
at which the image of an object moves across the retina. Distant objects will appear slow in
comparison with close objects even when the two are moving at the same speed. Motion
parallax can also be caused by the movement of the viewer's head. Objects closest to the
observer will appear to move faster than those far away.
For example, when you travel on a bus and look through the window, near objects (the
bricks of the pavement) appear to move faster than objects far from the road, such as distant
trees or towers. This is an important cue to those
who only have the use of one eye. This cue can be used in computer systems to infer the
third dimension. Refer to section 1.3 for more details about shape from motion.
Figure 3.10: Binocular disparity, the difference between two images of the same object
3.2.2. The Six Psychological Cues
1. Retinal Image Size, the larger an object's image, the closer it appears. This cue applies
only to objects of the same physical size; if objects of different sizes are placed at different
distances from the eye, the observer may be confused, Figure 3.11.








2. Linear Perspective: the gradual reduction of image size as distance from the object
increases, Figure 3.12;







Figure 3.11 Retinal Image Size, the larger an object image the closer it appears
Figure 3.12 Linear Perspective: the gradual reduction of image size
3. Aerial Perspective, the haziness of distant objects, Figure 3.13;








4. Overlapping, the effect where continuous outlines appear closer to the observer. Ancient
Egyptians (Pharaohs) painted their history in flat images containing overlapped objects to
indicate the third dimension, Figure 3.14.








Figure 3.13 Aerial Perspective, the haziness of distant objects
Figure 3.14 Overlapping, the effect where continuous outlines appear closer to the observer
5. Shade and Shadows, the impression of convexity or concavity based on the fact that
most illumination is from above; the eye can often discriminate an object using only its
shade and shadows, Figure 3.15. In computer systems, light patterns and shadows are used
to infer the third dimension. More information about shape from shading will be given in
section 3.4.






6. Texture Gradient, a kind of linear perspective describing levels of roughness of a
uniform material as it recedes into the distance, Figure 3.16. Computer systems use the
texture of an object to infer its shape. More information about shape from texture will be
given in section 3.5.







Figure 3.15 Shade and Shadows, the impression of convexity or concavity
Figure 3.16 Texture Gradient, a kind of linear perspective describing levels of roughness
Psychological cues are learned cues; therefore, they are assisted by experience. When
combined, these cues enhance depth perception greatly. Stereo viewing of images usually
combines binocular disparity and shade and shadow cues for effective depth perception.
3.3. Shape from Stereo Images
Stereo images are a pair of images taken of the same view from different positions, Figure
3.17. This resembles the human visual system, in which the pair of eyes takes a pair of
images of the same view; using a stereo image pair we can determine the depth [29, 30, 19].
The objects are not found at the same position in the two images; instead we find the objects
shifted, and the shift expresses the depth. To find the shift of the objects we have to establish
a correspondence [13, 14, 15].
Given two images formed in the retinal planes R and R', two problems should be solved in
shape from stereo:
1. For a point m in R, determine which point m' in plane R' it corresponds to. The term
correspond means that they are the images of the same physical point M. This is what is
commonly known as the correspondence problem [16, 17, 18, 19].
2. Given two corresponding points m and m', compute the 3-D coordinates of M relative to
some global reference frame. This is known as the reconstruction problem [20, 12].












Figure 3.17 Shape from stereo
Approaches to the correspondence problem can be broadly classified into two categories:
the intensity-based matching and the feature-based matching techniques. In the first
category, the matching process is applied directly to the intensity profiles of the two images,
while in the second, features are first extracted from the images and the matching process is
applied to the features [19].
3.3.1 Intensity-based or Area-based stereo matching
In the parallel camera case, the epipolar lines coincide with the horizontal scanlines; the
corresponding points in both images must therefore lie on the same horizontal scanline.
Such stereo configurations reduce the search for correspondences from two dimensions (the
entire image) to one dimension [8, 14, 16, 19, 31].
The advantage of this intensity profile matching is that a dense disparity map, and
consequently a dense depth (or range) map, is output. Unfortunately, as with all constrained
optimization problems, whether the system converges to the global minimum is still an
open problem, although the multiresolution scheme, to a certain extent, helps speed up
convergence and avoid local minima.
An alternative approach in intensity-based stereo matching, commonly known as the
window-based method [14, 17], is to only match those regions in the images that are
"interesting", for instance, regions that contain high variation of intensity values in the
horizontal, vertical, and diagonal directions. Moravec's simple interest operator detects
such regions (corresponding to regions that have grey-level corners) from the image pair, and it
has been widely used in many stereo matching systems (e.g. the SRI STEREOSYS system).
After the interesting regions are detected, a simple correlation scheme is applied in the
matching process; a match is assigned to regions that are highly correlated in the two
images.
The problem associated with this window-based approach is that the size of the correlation
windows must be carefully chosen. If the correlation windows are too small, the intensity
variation in the windows will not be distinctive enough, and many false matches may result.
If they are too large, resolution is lost, since neighboring image regions with different
disparities will be combined in the measurement. Worse, the two windows may not
correlate unless the disparity within the windows is constant, which suggests that the
multiresolution scheme is again appropriate.
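A minimal sketch of the window-based correlation search for a single pixel, under the
parallel-camera assumption above (interior pixel, illustrative window size and disparity
range; the left image is assumed shifted to the right relative to the right image):

import numpy as np

def match_pixel(right, left, r, c, win=5, max_disp=32):
    # Correlate a (win x win) window around right-image pixel (r, c)
    # against candidate windows on the same scanline of the left image;
    # return the disparity with the best normalized correlation.
    h = win // 2
    patch = right[r-h:r+h+1, c-h:c+h+1].astype(float)
    patch -= patch.mean()
    best, best_d = -np.inf, 0
    for d in range(max_disp + 1):
        cc = c + d
        if cc + h >= left.shape[1]:
            break
        cand = left[r-h:r+h+1, cc-h:cc+h+1].astype(float)
        cand -= cand.mean()
        denom = np.sqrt((patch**2).sum() * (cand**2).sum())
        score = (patch * cand).sum() / denom if denom > 0 else -np.inf
        if score > best:
            best, best_d = score, d
    return best_d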
3.3.2 Feature-based stereo matching
In the feature-based approach, the image pair is first preprocessed by an operator so as to
extract the features that are stable under the change of viewpoint [14, 17, 19, 32], the
matching process is then applied to the attributes associated with the detected features. The
obvious question here is what type of features one should use. Edge elements, corners,
line segments, and curve segments are features that are robust against the change of
perspective, and they have been widely used in much stereo vision work. Edge elements and
corners are easy to detect, but may suffer from occlusion; line and curve segments require
extra computation time, but are more robust against occlusion (they are longer and so are
less likely to be completely occluded). Higher level image features such as circles, ellipses,
and polygonal regions have also been used as features for stereo matching, these features
are, however, restricted to images of indoor scenes.
Most feature-based stereo matching systems are not restricted to using only a specific type
of features; instead, a collection of feature types is incorporated. For instance, the system
proposed by Weng in 1988 combines intensity, edges, and corners to form multiple
attributes for matching; Lim and Binford (1987), on the other hand, used a hierarchy of
features varying from edges, curves, to surfaces and bodies (2-D regions) for high-level
attribute matching.
3.3.2.1 Types of features
a. edge elements: There exist many edge operators for finding edge elements from an
image. For example, the \nabla^2 G (Laplacian of Gaussian) operator followed by a
detection of zero-crossings, or the Canny edge detector [33].
The attributes of edge elements used for matching can be: coordinates (location in the
image), local orientations (or directions), local intensity profile on either side of the edge
elements (e.g. from dark to light, or from light to dark).
b. corners: The earliest corner detector is probably Beaudet's (1978) rotationally
invariant operator called DET; corner detectors reported in the 80s include Dreschler and
Nagel (1982), Kitchen and Rosenfeld (1982), Zuniga and Haralick (1983), etc. The
Harris corner detector (1988) is one of the popular corner detectors that are widely used
today, e.g. at Oxford and INRIA.
Attributes of corners that can be used for matching: coordinates of corners, type of
junctions that the corners correspond to (e.g. Y-junction, L-junction, A-junction, etc.).
c. line segments: To extract line segments from an image, an edge operator must first be
applied. Line segments are then formed by a linking and merging operation on the
detected edge elements based on some criteria such as distance, similarity, and
collinearity measures [20].
It should be noted that, due to image noise, the end-points (and thus also the mid-points)
of line segments are normally not reliably detected; stereo matching processes that rely
on the coordinates of these points do not produce good reconstructions of 3-D
coordinates. In fact, for a pair of matching line segments, any point on the first line
segment can correspond to any point on the second line segment, and this
ambiguity can only be resolved if the end-points of the two line segments are known
exactly.
d. curve segments: The matching of curve segments has not been widely attempted; the
reason is probably the ambiguity involved -- every point on a curve is likely to be
matchable with every other point on another curve. Deriche and Faugeras' work (1990)
is one of the very few that have been reported. They proposed to match the turning points
of curves.
e. circles, ellipses: These features are present mainly in indoor scenes and are applicable
to the detection of defects on industrial parts.
Attributes that can be used for matching: areas in pixel units, coordinates of the centre of
the geometric figures.

f. regions: Regions can be either defined as blobs (e.g. detected by a region growing
algorithm) in the image or defined as polygonal regions bounded by line segments.
Regions in the form of blobs have irregular boundary and may not match perfectly with
regions from another image [6].
For polygonal regions, attributes that can be used for matching include: areas of regions,
bounding line segments of regions, locations of regions' centroids.
Polygonal regions are very high-level features and could be costly to extract.
3.4. Shape from Shading
Shape from shading uses the pattern of shading in a single image to infer the shape of the
surface in view [28]. A typical example of shape from shading is astronomy, where the
technique is used to reconstruct the surface of a planet from photographs acquired by a
spacecraft.
The reason that shape can be reconstructed from shading is the link between image intensity
and surface slope. The radiance at an image point can be calculated with the surface normal,
the direction of the illumination (pointing towards the light source) and the albedo of the
surface, which is characteristic of the surface's material. After calculating the radiance for
each point we get the reflectance map of the image.
The parameters of the reflectance map might be unknown. In this case we have to estimate
the albedo and the illuminant direction. The albedo and illuminant can be computed, by
assuming we are looking at a Lambertian surface, with the help of the averages of the image
brightness and its derivatives. From the reflectance map and by assuming local surface
smoothness, we can estimate local surface normals, which can be integrated to give the
local surface shape.
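For a Lambertian surface this link between intensity and slope can be written explicitly (a
standard statement in our own notation, not copied from a particular source): the image
irradiance is

I(x, y) = \rho \, \mathbf{n}(x, y) \cdot \mathbf{s},

where \rho is the albedo, \mathbf{n} the unit surface normal and \mathbf{s} the unit vector
towards the light source. In gradient space, with surface slopes p = \partial z / \partial x and
q = \partial z / \partial y and illuminant slopes (p_s, q_s), the corresponding reflectance map is

R(p, q) = \rho \, \frac{1 + p\,p_s + q\,q_s}{\sqrt{1 + p^2 + q^2}\,\sqrt{1 + p_s^2 + q_s^2}}.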
The image in Figure 3.18 (right) is a portion of frame LUE50467 acquired by the
Clementine spacecraft in its 218th orbit around the moon. The image has been rotated so
that the sun is to the left. The middle image is the elevation surface computed by the shape-
from-shading algorithm.



















One method of validating this result is to compute a shaded rendition from the elevation
surface under the same illumination and viewing conditions as the original image. The
shaded rendition (right) compares favorably with the original image (left). Once the
elevations are known one can generate simulated perspective views of the scene.
We conclude that when stereo imagery does not exist, shape-from-shading provides an
alternative method for extracting terrain data. Although its use has been limited primarily to
constant-albedo planetary mapping applications (i.e., where the surface is covered more-or-
less by the same material), new algorithms currently under development will extend it to the
general case of terrestrial imaging applications where the albedo is not constant.

3.5. Shape from Texture
Shape from texture is a computer vision technique where a 3D object is reconstructed from
a 2D image [29]. Although human perception is capable of recognizing patterns, estimating
depth and recognizing objects in an image by using texture as a cue, the creation of a system
able to mimic that behavior is far from trivial.
Figure 3.18 a portion of frame LUE50467 acquired by the
Clementine spacecraft in its 218-th orbit around the moon
Although texture is difficult to define, in our case we mean the repetition of an element or
the appearance of a specific template over a surface. Such an element is called a texel
(TEXture ELement). Various textures can be seen in Figure 3.19.











Consider two textured patches of a surface in the scene. Even if the patches have the same
texture pattern, in an image they will appear slightly different because of the slightly
different orientation that they have with respect to the observer's eye or camera.
















Figure 3.19 perception of the third dimension of 2D image using
information from texture.
Figure 3.20 An image perceived by most people as a hollow
cylinder covered in checkered pattern
Although the image in Figure 3.20 is nothing more than a two-dimensional image, it is
perceived by most people as a three-dimensional cylinder. Computer algorithms can be
created to allow a computer to arrive at the same conclusion. This can be done by
interpreting the apparent visual distortion of the texture.
For example, the checkered pattern in Figure 3.20 appears to compress towards the left and
the right edges of the cylinder, giving the appearance that the surface is curving away from
the observer. By analyzing the mathematical relationship between an object's shape and this
form of distortion (as well as other forms of distortion), one may find shape-from-texture
solutions.

3.6 Summary
There are multiple techniques for extracting the image features. These techniques are called
the image segmentation techniques; they are segmentation by thresholding, histogram
segmentation, and edge/line segmentation. The method for identifying the objects of the
image is called image labeling, in which each object is assigned an identifier. Humans
extract the third dimension of a view using ten cues. There are four physiological cues:
accommodation, convergence, binocular disparity, and motion parallax. There are also six
psychological cues: retinal image size, linear perspective, aerial perspective, overlapping,
shade and shadows, and texture gradient. Computer systems try to simulate human behavior
to extract the third dimension from 2D images. Systems for extracting the third dimension
from 2D images are shape from stereo, shape from shading, and shape from texture. In
shape from stereo we have to solve two main problems, the correspondence problem and
the reconstruction problem. Methods for solving the correspondence problem can be
broadly classified into feature-based correspondence and area-based correspondence. In
feature-based correspondence the features are extracted from the pair of stereo images, then
matching is made between the features of the right image and the features of the left image.
In area-based correspondence the matching is made pixel by pixel. The latter method gives
denser disparity maps.
CHAPTER 4
COMPUTER GRAPHICS
This chapter explores the background theories and techniques of computer graphics.
Section 4.1 introduces the computer graphics representation techniques. We will introduce
the anti-aliasing strategies; the most common are pre-filtering and super-sampling. We will
also give a comparative study between vector and raster graphics. Section 4.2 presents the
geometric model, which shows how 3D models are represented in computer memory.
Section 4.3 presents the rendering techniques for the 3D view. We will define projection
and rasterization, introduce what is meant by visibility in computer graphics, and show the
effect of lights, shadows and materials on the rendering process. Section 4.4 gives a
summary of this chapter.

4.1. Computer Graphics Representation.
4.1.1 Anti-aliasing strategies
Anti-aliasing techniques are required in low resolution images to remove the aliasing effect
[24, 34]. In the case of a line, anti-aliasing consists of using intermediate gray levels to
smooth the appearance of the line, Figure 4.1. Another form of aliasing can be observed on
television when people wear shirts with a fine striped texture. A flickering pattern is
observed because the size of the pattern is on the same order of magnitude as the pixel size.






There are three main classes of anti-aliasing algorithms.
Figure 4.1: A line without and with anti-aliasing
- As the aliasing problem is due to low resolution, one easy solution is to increase the
resolution, causing sample points to occur more frequently. This increases the cost of image
production.
- The image is created at high resolution and then digitally filtered. This method is called
super-sampling or post-filtering and eliminates high frequencies which are the source of
aliases.
- The image can be calculated by considering the intensities over a particular region. This is
called pre-filtering.
4.1.1.1 Pre-filtering.
Pre-filtering methods treat a pixel as an area [34], and compute pixel color based on the
overlap of the scene's objects with a pixel's area. These techniques compute the shades of
gray based on how much of a pixel's area is covered by an object, Figure 4.2.
For example, a modification to Bresenham's algorithm was developed by Pitteway and
Watkinson. In this algorithm, each pixel is given an intensity depending on the area of
overlap of the pixel and the line. Due to the blurring effect along the line edges, the aliasing
effect is much less prominent, although it still exists.
Pre-filtering thus amounts to sampling the shape of the object very densely within a pixel
region. For shapes other than polygons, this can be very computationally intensive.








Figure 4.2: (left): Without anti-aliasing, the jaggies are harshly evident, (right): pre-filtered
image, Along the border, the colors are a mixture of the foreground and background colors.
4.1.1.2 Super-sampling.
Super-sampling or post-filtering is the process by which aliasing effects in graphics are
reduced by increasing the frequency of the sampling grid and then averaging the results
down [34]. This process means calculating a virtual image at a higher spatial resolution than
the frame store resolution and then averaging down to the final resolution. It is called post-
filtering as the filtering is carried out after sampling.
There are two drawbacks to this method:
1 - There is a technical and economic limit to increasing the resolution of the virtual image.
2 - Since the frequency content of images can extend to infinity, super-sampling only
reduces aliasing by raising the Nyquist limit, shifting the effect along the frequency
spectrum.
Super-sampling is basically a three stage process:
1- A continuous image I(x,y) is sampled at n times the final resolution. The image is
calculated at n times the frame resolution. This is a virtual image.
2 - The virtual image is then low-pass filtered.
3 - The filtered image is then resampled at the final frame resolution.
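A compact sketch of these three stages with a box filter as the low-pass step (the `render`
callback and the factor n are our own illustrative assumptions):

import numpy as np

def supersample(render, width, height, n=4):
    # 1. Render a virtual image at n times the final resolution.
    hi = render(width * n, height * n).astype(float)
    # 2.-3. Low-pass filter and resample: average every n x n block
    # down to one output pixel (a simple box filter).
    return hi.reshape(height, n, width, n).mean(axis=(1, 3))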

4.1.2. Vector vs. Raster graphics
Graphic images that have been processed by a computer can usually be divided into two
distinct categories: such images are either bitmap files or vector graphics. We need a good
comprehension of the advantages and disadvantages of both types of data; this section tries
to explain these differences. As a general rule, scanned images are bitmap files while
drawings made in applications like Corel Draw or Illustrator are saved as vector graphics.
But we can convert images between these two data types, and it is even possible to mix
them in a file, which sometimes confuses people [23, 24, 27].


4.1.2.1 Raster Graphics


Bitmap images or raster images are exactly what their name says they are: a collection of
bits that form an image. The image consists of a matrix of individual dots (or pixels) that all
have their own colour (described using bits, the smallest possible units of information for a
computer). Let's take a look at a typical bitmap image to demonstrate the principle:





As shown in Figure 4.3, to the left we see an image and to the right a 250 percent
enlargement of the top of one of the mountains. As we can see, the image consists of
hundreds of rows and columns of small elements that all have their own color. One such
element is called a pixel (short for picture element). The human eye is not capable of seeing
each individual pixel, so we perceive a picture with smooth gradations. The number of
pixels we need to get a realistic looking image depends on the way the image will be used.
4.1.2.2 Vector Graphics
Vector graphics are images that are completely described using mathematical definitions.
The image in Figure 4.4 shows the principle: to the left we see the image itself and to the
right we see the actual lines that make up the drawing.
Each individual line is made up of either a vast collection of points with lines
interconnecting all of them, or just a few control points that are connected using so-called
Bézier curves. It is this latter method that generates the best results and is used by most
drawing programs.
Figure 4.3: Magnified part of a Raster image or Bitmap image
Figure 4.4 Vector Graphics shown on an image









Vector drawings are usually pretty small files because they only contain data about the
Bézier curves that form the drawing. The EPS file format that is often used to store vector
drawings includes a bitmap preview image along with the Bézier data; the file size of this
preview image is usually larger than the actual Bézier data themselves.
Vector drawings can usually be scaled without any loss in quality. This makes them ideal
for company logos, maps or other objects that have to be resized frequently. Please note
that not all vector drawings can be scaled as much as we like:
- Drawings containing trapping information can only be scaled up to 20 percent larger or
smaller.
- Thin lines may disappear if a vector drawing is reduced too much.
- Small errors in a drawing may become visible as soon as it is enlarged too much.
It is fairly easy to create a vector based drawing that is very difficult to output. In particular,
the use of tiles (small objects that are repeated dozens or hundreds of times) and Corel Draw
lens effects can lead to very complex files.
4.2. The Geometric Model

We introduce how the geometry of the scene is represented in the memory of the computer
[35].
Polygons, The most classical method for modeling 3D geometry is the use of polygons. An
object is approximated by a polygonal mesh, that is, a set of connected polygons (see Figure
4.5). Most of the time, triangles are used for simplicity and generality.

















Each polygon or triangle can be described by the 3D coordinates of its list of vertices (see
Figure 4.5). The obvious limitation of triangles is that they produce a flat and geometric
appearance. However, techniques called smoothing or interpolation can greatly improve
this.
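A tiny sketch of this representation as an indexed triangle mesh (the coordinates are those of
the triangle in Figure 4.5; sharing vertex indices between faces is what makes a connected
mesh compact):

import numpy as np

vertices = np.array([[2.0, 4.0, 2.0],    # 3-D coordinates, one row per vertex
                     [3.0, 1.0, 0.0],
                     [1.0, 1.0, 2.0]])
faces = np.array([[0, 1, 2]])            # each face lists 3 vertex indices

# Each extra triangle costs only 3 small indices, not 9 coordinates.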

Primitives, The most classical geometric entities can be directly used as primitives, e.g.
cubes, cylinders, spheres and cones. A sphere for example can be simply described by the
coordinates of its center and its radius.

Smooth patches, more complex mathematical entities permit the representation of complex
smooth objects. Spline patches and NURBS are the most popular. They are however harder
to manipulate since one does not directly control the surface but so-called control points that
are only indirectly related to the final shape.
Moreover, obtaining smooth junctions between different patches can be problematic.
However, the recently popular subdivision surfaces overcome this limitation.
Figure 4.5 Left: A cow modeled as a mesh of triangles.
Right: This triangle can be stored using the coordinates of its vertices as [(2,4,2), (3,1,0), (1,1,2)].

They offer the best of both worlds and provide the simplicity of polygons and the
smoothness of patches.

4.3. Rendering

4.3.1 Projection and Rasterization
The image projection of the 3D objects is computed using linear perspective. Given the
position of the viewpoint and some camera parameters (e.g. field of view), it is very easy to
compute the projection of a 3D point onto the 2D image. For mathematics enthusiasts, this
can be simply expressed using a 4×4 matrix [23, 35].
In most methods, the geometric entities are then rasterized. It consists in drawing all the
pixels covered by the entity.
In the example below, the projections of the 3D points have been computed using linear
perspective, and the triangle has then been rasterized by filling the pixels in black, see
Figure 4.6. For richer rendering, the color of each rasterized pixel must take into account the
optical properties of the object.
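A minimal sketch of this projection for a single point, assuming a pinhole camera at the
origin looking along +z with focal length f (the 4×4 matrix reduces to a 3×4 one when the
final row is dropped):

import numpy as np

def project(point, f=1.0):
    # Homogeneous projection matrix followed by the perspective divide.
    P = np.array([[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
    x, y, z = point
    u, v, w = P @ np.array([x, y, z, 1.0])
    return u / w, v / w                   # 2-D image coordinates

print(project((1.0, 2.0, 4.0)))           # -> (0.25, 0.5)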











4.3.2 Visibility
If the scene contains more than one object, occlusions may occur. That is, some objects may
be hidden by others. Only visible objects should be represented. Visibility techniques deal
with this issue [24, 35].
Figure 4.6 Projection (left) and rasterization (right) of a triangle.


One classical algorithm that solves the visibility problem is the so-called painter's
algorithm. It consists in sorting the objects or polygons from back to front and rasterizing
them in this order. This way, frontmost polygons cover the more distant polygons that they
hide, see Figure 4.7.











The ray-tracing algorithm, Figure 4.8, does not use a rasterization phase. It sends one ray
from the eye and through each pixel of the image. The intersection between this ray and the
objects of the scene is computed, and only the closest intersection is considered.














Figure 4.7 The Painter's algorithm. Triangle 1 is drawn first because it is more distant.
Triangle 2 is drawn next and covers Triangle 1, which yields correct occlusion.

Figure 4.8 Ray-tracing. A ray is sent from the eye and through the pixel. Since the
intersection with 2 is closer than the intersection with 1, the pixel is black.

The z-buffer method is the most common nowadays (e.g. for computer graphics cards). It
stores the depth (z) of each pixel.
When a new polygon is rasterized, for each pixel, the algorithm compares the depth of the
current polygon and the depth of the pixel. If the new polygon has a closer depth, the color
and depth of the pixel are updated. Otherwise, it means that for this pixel, a formerly drawn
polygon hides the current polygon.
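A per-pixel sketch of the depth test just described (buffer sizes are illustrative):

import numpy as np

def zbuffer_write(color, depth, r, c, z, rgb):
    # Keep the fragment only if it is closer than what the pixel holds,
    # then update both the colour and the stored depth.
    if z < depth[r, c]:
        depth[r, c] = z
        color[r, c] = rgb

h, w = 4, 4
depth = np.full((h, w), np.inf)               # start "infinitely far"
color = np.zeros((h, w, 3), dtype=np.uint8)
zbuffer_write(color, depth, 1, 2, 0.7, (255, 0, 0))
zbuffer_write(color, depth, 1, 2, 0.9, (0, 255, 0))  # rejected: farther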

4.3.3 Shading and materials
Augmenting the scene with light sources allows for better rendering. The objects can be
shaded according to their interaction with light. Various shading models have been
proposed in the literature [24, 35].

They describe how light is reflected by an object, depending on the relative orientation of
the surface, light source and viewpoint, Figure 4.9.














Texture mapping uses 2D images that are mapped on the 3D models to improve their
appearance, Figure 4.10.
Figure 4.9 Light reflection model. The ratio of light bouncing off the surface in the
direction of the eye depends on the two angles.













4.3.4 Shadows and lighting simulation
Shading and material models only take into account the local interaction of surfaces and
light [28, 35]. They don't simulate shadows, which are harder to handle because they imply
long-range interactions. A shadow is caused by the occlusion of light by one object.
Ray-tracing, for example, can handle shadows, but requires a shadow computation for each
pixel and each light source, Figure 4.11. A shadow ray is sent from the visible point to the
light source. If the ray intersects an object, then the visible point is in shadow.














Figure 4.10 Sphere rendered using various material models. Note in particular the
different highlights. Image from the Columbia-Utrecht Reflectance and Texture
Database.

Figure 4.11 Shadow computation using ray-tracing, The visible point is in shadow
because the black triangle occludes the light source.

More complex lighting interactions can then be simulated. In particular, objects that are
illuminated by a primary light source reflect light and produce indirect lighting. This is
particularly important for indoor scenes. Global lighting methods take into account all light
inter-reflections within the scene.
4.4. Summary
In this chapter we explored the computer graphics representation techniques. The anti-
aliasing strategies are methods of removing the aliasing effect from computer graphics; the
most common are the pre-filtering and super-sampling strategies. There are two types of
computer graphics representation, vector graphics and raster graphics. Vector graphics are
images that are completely described using mathematical definitions, while raster graphics
represent the image as a collection of pixels. The geometric model is the way of
representing the scene in computer memory. The rendering of the 3D scene is done using
multiple operations: projection and rasterization, determining the visibility of the objects in
the scene, and defining the effect of lights, shadows and materials on the rendering of the
scene.

CHAPTER 5

ENHANCED SHAPE FROM STEREO SYSTEM
IN DISTRIBUTED GIS
This chapter presents the main theme of the thesis, namely the proposed enhanced shape
from stereo system. This system is used to convert 2D images to 3D models. It showed
superiority over other available systems used for the same purpose. Section 5.1 shows the
proposed enhanced shape from stereo system, which converts 2D images to 3D models.
First we give a summary of the stages of our system; then we apply the system to a pair of
stereo images, showing the output of every stage in detail. Section 5.2 shows the results of
our system. We first introduce the data acquisition techniques, and then a comparative study
is made between the results of the traditional shape from stereo and the results of our
enhanced shape from stereo system. Section 5.3 gives a summary of this chapter.
5.1. The Enhanced Shape from Stereo System.
The distributed GIS system that we are going to implement contains multiple servers on a
simple computer network (LAN), Figure 5.1; it can be widened to be used on the Internet. It
contains multiple servers for serving multiple types of users, and one of these servers is the
GIS server.
The GIS server contains the following:
1- Software for converting 2D images to 3D, since we are going to make the geographical
data available in 2D and 3D. This is the system which we are going to implement, i.e. we
are going to implement software for converting the 2D images to 3D models using the
enhanced shape from stereo system.
2- The GIS database, indexed as image pairs to be used by the enhanced shape from stereo
technique, which converts 2D images into 3D models.
Now the GIS user can use the geographical data in two forms (2D by using only the images
which are taken for the specified area), or (3D by converting the 2D images into 3D models
by using the Enhanced shape from stereo system).
















Our objective is to support the GIS system by making the geographical data available in
both 2D and 3D forms. This is done by extracting depth from a pair of stereo images. The
enhanced shape from stereo system consists of five stages: first we extract features using
histogram segmentation, then we match these features and get a disparity map that is used
in the further steps, and finally we obtain a dense disparity map. The dataflow diagram is
shown in Figure 5.2.
In stage 1 the features extracted from both the right and the left images are region
segments, obtained as follows:
a - Find the histogram for the right and the left images.
b - Choose three threshold values in the histogram that divide the image into four groups;
each group contains different gray level region segments.
c - For each group create a series of (low, medium and high) resolution images, called the
multiresolution images of the group. They are created by reduction in size by a factor of
two using a Gaussian convolution filter.
d - Create the binary map for each of the multiresolution images, which indicates for each
pixel whether it belongs to the group or not.
Figure 5.1: The structure of Distributed GIS system
e - Apply the morphological open operator [26] to the low resolution images of each group
to remove small region segments that may incur errors. (A sketch of steps c-e is given
below.)
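The following Python sketch illustrates steps c-e for one group, using SciPy's Gaussian
filter and morphological open (the `group_mask` input and the filter width sigma are
illustrative assumptions):

import numpy as np
from scipy import ndimage

def multiresolution(group_mask, levels=3, sigma=1.0):
    # Build high/medium/low resolution versions of one group's binary
    # map by Gaussian smoothing and halving each dimension (step c),
    # threshold each level back to a binary map (step d), and clean the
    # lowest resolution with a morphological open (step e).
    maps = [group_mask.astype(float)]
    for _ in range(levels - 1):
        smoothed = ndimage.gaussian_filter(maps[-1], sigma)
        maps.append(smoothed[::2, ::2])
    binary = [m > 0.5 for m in maps]
    binary[-1] = ndimage.binary_opening(binary[-1])
    return binary        # index 0: high resolution ... last: low resolution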
















Using the region segments as the feature elements we can get a disparity map (not yet the
final disparity map), and then use it to find the center of the search range, as follows (a
sketch of steps a-d is given after this list):
a - Using the low resolution images of each group, after resizing them to the original size,
we get the centroid of every region.
b - Match the regions using geometric properties such as position, size, eccentricity and
orientation.
c - Find for each pixel in the right image its disparity and the center of its search range, as
shown in Figure 5.3.
d - The disparity for each pixel in a region S is computed as the distance between the
centroids of the region S in the right and left images.
e - The center of the search range for a pixel in the right image is the pixel in the left image
which is apart from it by its disparity.
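The sketch below illustrates steps a-d, assuming the region matching itself (step b) has
already produced a `pairs` dictionary from right-image labels to matched left-image labels;
only the horizontal (column) component of the centroid distance is used:

import numpy as np
from scipy import ndimage

def region_disparities(right_labels, left_labels, pairs):
    # Disparity of a matched region pair = horizontal distance between
    # the centroids of the region in the left and right images.
    disp = {}
    for r_lab, l_lab in pairs.items():
        _, col_r = ndimage.center_of_mass(right_labels == r_lab)
        _, col_l = ndimage.center_of_mass(left_labels == l_lab)
        disp[r_lab] = col_l - col_r
    return disp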

Figure 5.2 The Data Flow Diagram of the Enhanced Shape from
stereo System.










In stage 2, we first get the interval of the search range using the multiresolution images
taken from stage 1 (feature extraction), taking into consideration that the interval of the
search range for pixels in a region segment increases as the pixel gets closer to the boundary
of the segment, to reflect the abrupt change at the boundary of the region segments.
We get the interval of the search range using the inclusion property as follows:
a. Remove the region segments which do not appear in the low resolution images; these
have insignificant influence but might incur errors.
b. Resize the multiresolution images to their original high resolution size.
c. We get three sub-regions SR1, SR2 and SR3 as shown in Figure 5.4.
d. Let Ii be the interval of the search range for a pixel p which lies in SRi. Then we
determine, for each pixel p, its interval of the search range according to the sub-region it
lies in, while satisfying I1 < I2 < I3.
In stage 3 we get the adaptive window size using the following steps:
a. Apply a 2D 3-level wavelet transform to the right image.
b. Determine whether each pixel is an edge pixel or not at each level of the 2D wavelet
transform.
c. The edge strength of a pixel is regarded as stronger if the pixel is detected as an edge
component in the lower levels:
c.1 If the pixel is detected as an edge component at all levels, the edge strength is largest
and the size of the search window is set to the minimum value in order to capture the
disparity edge more precisely;
Figure 5.3: Regions and their disparity
c.2 If the pixel is not detected as an edge component at the higher levels, the edge strength
at the pixel becomes smaller and the size of the search window is increased.
c.3 If an edge is not detected at any level, the size of the search window is set to the
maximum value.


















In stage 4 we perform area-based matching; after computing the search range and the
window size we obtain the final disparity map. Two steps are needed in this stage:
a. Match each pixel in the right image with the pixels residing in its search range in the
left image using the appropriate window size, and compute the disparity map by finding
the pixel in the left image whose corresponding window has the maximum correlation
with the window in the right image.
b. Refine the disparity map by applying a median filter in order to satisfy the continuity
constraint.
Figure 5.4: Calculating the search range
In stage 5 we obtain the 3D model: we now have the depth (the third dimension), so we can build a complete 3D model of the scene using the image pair and the depth (disparity) map. The stages of the system are shown in Figure 5.5 and Figure 5.6.

Figure 5.5: Stage 1 of the Enhanced Shape from Stereo System

Figure 5.6: Stages 2-5 of the Enhanced Shape from Stereo System.
5.1.1 Stage 1, Feature-based Matching
We take the Pentagon stereo pair (left and right images) as our implementation example. The images are grayscale with size 512 x 512 (Figure 5.7). They were taken from an aircraft with a camera on board at different times, so the pictures are captured from two different positions.

5.1.1.1. Feature Extraction:
a. The histogram of the left and right images is calculated, and the number of pixels at each gray level is displayed as a bar (Figure 5.8).

Figure 5.7: The pair of stereo images used to explain our algorithm (left and right).
Figure 5.8: The histograms used for segmenting the stereo image pair (left and right).

b. Two threshold values are selected in the histogram to divide the image into three groups, each containing region segments of a different gray-level range (in Figure 5.8 the thresholds lie between columns 1 and 2, and between columns 2 and 3, respectively). The resulting groups are displayed in Figure 5.9; a segmentation sketch follows.
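The following is a minimal sketch of this two-threshold segmentation, assuming NumPy is available; the threshold values in the usage line are hypothetical, standing in for the values read off the histogram.

    import numpy as np

    def split_into_groups(image, t1, t2):
        # Label each pixel 0, 1 or 2 according to the histogram band it falls in.
        labels = np.digitize(image, bins=[t1, t2])
        # Return one binary mask of region segments per group.
        return [(labels == g) for g in range(3)]

    # Hypothetical thresholds chosen from the histogram:
    # left_groups = split_into_groups(left_image, 85, 170)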


Figure 5.9: Dividing each image into 3 groups of regions (left and right, 512 x 512).
c. A series of low-, medium- and high-resolution images, called the multiresolution images, is created for each group by repeated size reduction by a factor of two using a Gaussian convolution filter (Figure 5.10 and Figure 5.11); a sketch of this construction follows.
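A sketch of the multiresolution construction with SciPy; the Gaussian sigma is an assumption, since the text does not give one.

    import numpy as np
    from scipy import ndimage

    def multiresolution(group, levels=3):
        # Returns [high, medium, low]; each level is half the size of the previous.
        images = [group.astype(float)]
        for _ in range(levels - 1):
            smoothed = ndimage.gaussian_filter(images[-1], sigma=1.0)  # assumed sigma
            images.append(smoothed[::2, ::2])  # reduce size by a factor of two
        return images  # e.g. 512 x 512, 256 x 256, 128 x 128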

Figure 5.10: Creating the multiresolution images for the left image (groups at 128 x 128, 256 x 256 and 512 x 512).
Figure 5.11: Creating the multiresolution images for the right image (groups at 128 x 128, 256 x 256 and 512 x 512).

5.1.1.2. Morphology Filter:
After creating binary images, the open operator of the morphology filter is applied to the low-resolution images of each group so that small region segments that may incur errors are removed (Figure 5.12 and Figure 5.13). A sketch of this step follows.
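A sketch using SciPy's binary opening; the 3 x 3 structuring element is an assumption.

    import numpy as np
    from scipy import ndimage

    def remove_small_segments(binary_low_res):
        # Opening = erosion followed by dilation: small segments are erased,
        # while larger region segments keep their overall shape.
        selem = np.ones((3, 3), dtype=bool)  # assumed structuring element
        return ndimage.binary_opening(binary_low_res, structure=selem)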

Figure 5.12: Applying the morphology filter (open operator) to the binary low-resolution groups of the left image.
Figure 5.13: Applying the morphology filter (open operator) to the binary low-resolution groups of the right image.
5.1.1.3. Image Resizing:
The binary images are resized to the original size (512 x 512) using bilinear interpolation (Figure 5.14 and Figure 5.15); a sketch follows.
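The resizing can be sketched as follows; in SciPy, order=1 selects bilinear interpolation, and the 0.5 re-binarization threshold is an assumption.

    from scipy import ndimage

    def restore_size(low_res, factor=4):
        # 128 x 128 -> 512 x 512 with bilinear interpolation, then re-binarize.
        return ndimage.zoom(low_res.astype(float), zoom=factor, order=1) > 0.5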

Figure 5.14: Restoring the left image to its original size (128 x 128 to 512 x 512, bilinear interpolation).
Figure 5.15: Restoring the right image to its original size (128 x 128 to 512 x 512, bilinear interpolation).


5.1.1.4. Search Window-size Calculation using Optical Flow
We applied the optical flow algorithm of Lucas and Kanade [36] to our images, and from it we obtained the window size used in matching. The idea comes from the fact that every object in the stereo pair changes its position from one image to the other; the optical flow of the nearest object gives the maximum change in position, since the nearer the object, the larger its apparent movement.
After applying the optical flow algorithm to the images (the flow is from right to left) we get the result shown in Figure 5.16.

Notice that the flow of intensity is represented by arrows; the length of each arrow indicates how far the pixel moved between the right image and the left one. We select the maximum arrow length and use it as the size of our matching window; a sketch follows the figure.

Figure 5.16: The optical flow.
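A sketch of deriving the matching-window size from the optical flow, here using OpenCV's pyramidal Lucas-Kanade tracker as a stand-in for the implementation; the corner-detection parameters are assumptions.

    import cv2
    import numpy as np

    def window_size_from_flow(right_img, left_img):
        # Track corner features from the right image to the left image.
        corners = cv2.goodFeaturesToTrack(right_img, maxCorners=200,
                                          qualityLevel=0.01, minDistance=7)
        moved, status, _ = cv2.calcOpticalFlowPyrLK(right_img, left_img,
                                                    corners, None)
        # Arrow lengths = displacement of each successfully tracked feature.
        lengths = np.linalg.norm((moved - corners).reshape(-1, 2), axis=1)
        lengths = lengths[status.ravel() == 1]
        return int(np.ceil(lengths.max()))  # the longest arrow is the window size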
5.1.1.5. Feature-based Matching:
After resizing the images (3 groups) to their original size, we use them in the matching process. Using the region segments as the feature elements, the first group of regions of the right image is matched with the first group of regions of the left image, and the same procedure is applied to the second and third groups of both images (Figure 5.17).

a. In the right image, find the following properties for each region: centroid, area, orientation, aspect ratio and perimeter, so we have five properties per region.
b. Select the first region in the right image and identify its centroid (the center of the region); this position is then transferred to the left image.
c. Around this position we draw a (hypothetical) window (Figure 5.18); the size of this window was previously calculated using the optical flow.
Figure 5.17: Matching each group from the right image with its corresponding group from the left image.

d. Using the properties of the regions inside the window only, together with the properties of the currently selected region in the right image, we perform a voting operation.

Assume we have a region Sr1 in the right image and regions (Sl1, Sl2, Sl3, Sl4, Sl5, Sl6) in the left image. The voters state their decision about which of the six regions is the nearest one to Sr1; the voters are centroid, area, orientation, aspect ratio and perimeter.
Centroid: comparing the centroids of the left regions with that of the selected right region, and taking the absolute value of the difference, we can identify which region is nearest to the right region. Say in our example that the nearest one to Sr1 is Sl3; the centroid voter then votes, saying "I vote for Sl3".
Area: assume the nearest one in area to Sr1 is Sl4, so the area voter votes "I vote for Sl4".
Orientation: assume the nearest one in orientation to Sr1 is Sl3, so the orientation voter votes "I vote for Sl3".
Aspect ratio: assume the aspect ratio of Sl3 is the nearest to that of Sr1, so the aspect ratio voter votes for Sl3.
Perimeter: assume the perimeter of Sl1 is the nearest to that of Sr1, so the perimeter voter votes for Sl1.
Finally, we build a voting vector from the above votes, [Sl3 Sl4 Sl3 Sl3 Sl1], and select the most elected region, i.e. the region with the maximum number of appearances in the voting vector. In our example it is Sl3, so this region is the elected region for Sr1. A sketch of this voting step follows step e below.
Figure 5.18: Using a window in feature-based matching.

e. The above procedure is repeated until all right-image regions are processed, i.e. until we have identified which region in the left image belongs to each region in the right image, with some exceptions:
e.1. regions in the right image that have no corresponding region in the left image (the window is empty) are marked as "occluded regions";
e.2. regions in the left image that are not selected by any region in the right image are also considered "occluded regions".
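A sketch of the voting step for one right-image region; the property dictionaries (with a "name" key) are hypothetical stand-ins for the measured region properties.

    import math
    from collections import Counter

    def nearness(a, b, prop):
        # Centroids are (row, col) points; the other properties are scalars.
        if prop == "centroid":
            return math.dist(a[prop], b[prop])
        return abs(a[prop] - b[prop])

    def vote(right_region, left_candidates):
        props = ["centroid", "area", "orientation", "aspect_ratio", "perimeter"]
        # Each property votes for the nearest candidate left region.
        votes = [min(left_candidates,
                     key=lambda c: nearness(c, right_region, p))["name"]
                 for p in props]
        # The most elected region wins, e.g. "Sl3".
        return Counter(votes).most_common(1)[0][0]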

5.1.1.6. Disparity Map (not yet the final one):
First group matching:
After assigning a region from the left image to a region in the right image, we get the disparity map as in the following example: if region Sl in the left image is assigned to region Sr in the right image, as shown in Figure 5.19, then the disparity of each pixel in region Sr is the difference between the centroids of the two regions; pixels that do not belong to a region take 0 as their disparity value.

Figure 5.19: The disparity for each region.

Further groups matching:
So far we have the disparity map of the first group only. We repeat the above algorithm on all three groups, so we now have three disparity maps; the combined disparity map is calculated as follows (a merging sketch is given below):
a. The disparity map of the first group is compared with that of the second group, pixel by pixel:
a.1 if both pixels have the value 0, the resulting pixel has 0 as its disparity value;
a.2 if one pixel has a value > 0 and the other has 0, the resulting pixel takes the (> 0) value as its disparity;
a.3 if both pixels have values > 0, the disparity of the resulting pixel is their average.
b. The resulting disparity map is then compared with the disparity map of the third group, and the same procedure is applied to get the whole disparity map.
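A vectorized sketch of the pixel-by-pixel merging rules a.1-a.3 with NumPy:

    import numpy as np

    def merge_disparities(d1, d2):
        both = (d1 > 0) & (d2 > 0)
        # a.3: average where both maps have a value; a.1 and a.2 reduce to a
        # plain sum, since 0 + 0 = 0 and x + 0 = x.
        return np.where(both, (d1 + d2) / 2.0, d1 + d2)

    # whole_map = merge_disparities(merge_disparities(group1, group2), group3)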
For the Pentagon stereo example, a 3D representation of the (not yet final) disparity map is shown in Figure 5.20 and Figure 5.21. Notice that higher points take a brighter color than lower points; the disparity expresses the depth value of each feature (region), so all pixels of a region share the same disparity, or depth.

Figure 5.20: Disparity map obtained from feature-based matching.

5.1.2. Stage 2, The Search Range:
5.1.2.1. Center of Search Range (CSR):
It is calculated by adding a column index matrix (a sample is shown in Figure 5.22) to the disparity matrix calculated in stage 1.

If we add the disparity of a left-image pixel to the index matrix, this gives the position of the corresponding pixel in the right image, which is used as the center of the search range for that pixel; a sketch follows.
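A sketch of the CSR computation with NumPy:

    import numpy as np

    def center_of_search_range(disparity):
        # CSR(r, c) = c + disparity(r, c): the column of the matching pixel
        # in the other image.
        rows, cols = disparity.shape
        col_index = np.tile(np.arange(cols), (rows, 1))  # the column index matrix
        return col_index + disparity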

Figure 5.22: A sample of the column index matrix.
Figure 5.21: 3D model of the feature-based disparity map.

5.1.2.2. Interval of Search Range (SR):
For the first group we use the images resized to the original image size, i.e. the (low, medium and high) resolution versions of the right image, as shown in Figure 5.23.

Logic operations: as explained earlier, by comparing the regions in the (low, medium and high) resolution images we can easily construct four different sub-regions:
1. SR1: the regions that appear in all three images (the three images ANDed);
2. SR2: the regions that appear in the low and medium images only (gray in the figure);
3. SR3: the regions that appear in the low-resolution image only (white in the figure);
4. the background pixels.
Figure 5.23: Obtaining the interval of search range by logic operations on the low, medium and high resolution images (512 x 512).
Interval of search range:
The search range of the first group is
SR(1) = Background * 4 + SR1 * 8 + SR2 * 16 + SR3 * 32
since the closer a pixel is to the edge of a region, the larger the search range it takes, to detect the abrupt change in intensity.
- The logic operations are applied to the 3 resolution images of the second group to get SR(2), and to the third group to get SR(3).
- The final interval of search range is the average of the search ranges of the three groups. Figure 5.24 shows a sample, and a sketch of the computation follows.
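A sketch of the logic operations and weighting for one group, assuming the three resized binary masks (low, med, high) are available:

    import numpy as np

    def interval_of_search_range(low, med, high):
        sr1 = low & med & high    # regions present in all three images (ANDed)
        sr2 = low & med & ~high   # present in the low and medium images only
        sr3 = low & ~med & ~high  # present in the low-resolution image only
        background = ~(sr1 | sr2 | sr3)
        return (background * 4 + sr1 * 8 + sr2 * 16 + sr3 * 32).astype(int)

    # The final interval is the average over the three groups:
    # SR = (interval_of_search_range(*g1) + interval_of_search_range(*g2)
    #       + interval_of_search_range(*g3)) / 3.0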

From the figure we conclude that the interval of the search range for the pixel in the first row and first column of the right image is 5; so when we search for the matched pixel in the left image, the search range is 5 pixels. Every pixel in the image is assigned a value for its interval of search range.
Figure 5.24: A sample of the interval of search range values (SR).

5.1.3. Stage 3, The Adaptive Window Size:
First: to get the adaptive window size we compute the detail coefficients of the wavelet transform of the right image (Figure 5.25).

Figure 5.25: Results of applying the 3-level wavelet transform (level 1: 256 x 256, level 2: 128 x 128, level 3: 64 x 64).
Second: after applying the 3-level wavelet transform, the edges of each level are obtained by applying a suitable threshold to each level and then the Sobel operator, see Figure 5.26; a sketch of this step follows.
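A sketch of the per-level edge maps; the wavelet family ("haar") and the relative threshold are assumptions, and PyWavelets is used as a stand-in for the transform.

    import numpy as np
    import pywt
    from scipy import ndimage

    def edges_per_level(image, levels=3, rel_threshold=0.2):
        # wavedec2 returns [approx, (cH, cV, cD) per level, coarsest first].
        coeffs = pywt.wavedec2(image.astype(float), wavelet="haar", level=levels)
        edge_maps = []
        for cH, cV, cD in coeffs[1:]:
            detail = np.abs(cH) + np.abs(cV) + np.abs(cD)
            # Sobel gradient magnitude of the combined detail band.
            grad = np.hypot(ndimage.sobel(detail, axis=0),
                            ndimage.sobel(detail, axis=1))
            edge_maps.append(grad > rel_threshold * grad.max())  # assumed threshold
        return edge_maps  # one binary edge image per level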

Figure 5.26: Applying the edge detector to each level of the wavelet transform results (level 1: 256 x 256, level 2: 128 x 128, level 3: 64 x 64).

Third: in each level we OR all the detail coefficients, then resize all images to the original image size, i.e. 512 x 512 (Figure 5.27).

Finally: since we want to identify, for each pixel, whether it appears as an edge pixel in each level, we take the pixels that appear as edge pixels in the original image and then detect whether each of them appears as an edge pixel in each level.
Figure 5.27: Combining the edges of each level of the wavelet transform results (all resized to 512 x 512).
The rule is: the more levels in which a pixel appears as an edge pixel, the smaller the window size it takes.
a. Background pixels get window size = 8, and edge pixels start with 4 as their window size.
b. If the pixel appears again in level 1 as an edge pixel, its strength as an edge increases and the window size is decreased by one, becoming 4 - 1 = 3.
c. If the pixel appears again in level 2 as an edge pixel, the window size is decreased by 1.
d. If the pixel appears again in level 3 as an edge pixel, the window size is not changed. Figure 5.28 shows a sample, and a sketch follows the figure.

In Figure 5.28 every pixel is assigned a window size that will be used in matching; we cannot match the image pixel by pixel, but rather window by window, and the window size determines how the search range is divided.
Figure 5.28: A sample of the calculated window sizes.
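A sketch of this rule, assuming binary edge maps (already resized to 512 x 512) are available for the original image and for wavelet levels 1 and 2:

    import numpy as np

    def adaptive_window_size(edges_orig, edges_l1, edges_l2):
        # a: edge pixels start at 4, background pixels at 8.
        w = np.where(edges_orig, 4, 8)
        # b: seen again in level 1 -> decrease by one (4 becomes 3).
        w = w - (edges_orig & edges_l1)
        # c: seen again in level 2 -> decrease by one more.
        w = w - (edges_orig & edges_l2)
        # d: level 3 leaves the window size unchanged.
        return w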

5.1.4. Stage 4, Area-based Matching:
5.1.4.1. The matching process:
So far we have three 512 x 512 matrices: the center of search range (CSR), the interval of search range (SR), and the adaptive window size (W), all extracted from the right image. The matching algorithm proceeds as follows:

For each pixel in the right image:
    Get the current pixel at (row, col).
    If CSR(row, col) > the number of columns of the image:
        Set CSR(row, col) = the number of columns of the image.
    Get the search-range window SRw from the left image, using
        CSR(row, col) as its center and the interval of search range
        SR(row, col) as its size.
    If W(row, col) > the search-range size:
        Set W(row, col) = SR(row, col) / 2.
    Get the right window wr: a window of size W(row, col) around
        the current pixel.
    If all pixels in the search range EQUAL the current pixel value:
        Keep the old disparity unchanged.
    Else if all pixels in the search range DIFFER from the current
        pixel value:
        Keep the old disparity unchanged.
    Else:
        Loop, cutting the matching window (left window wl) out of
        the search-range window:
            Match wl and wr:
                If the pixel in wl that corresponds to the current
                pixel position in the right window has the same value
                as the current pixel:
                    The two windows are primarily matched by one pixel
                    only, and the matching power = number of matched
                    pixels + similarity of the averages of wr and wl.
                Else:
                    The matching power = number of matched pixels
                    + similarity of averages + similarity of the
                    vectors of averages of wr and wl.
            Slide one pixel and get the next left window wl; back to Loop.
        The matched window is the one with the maximum matching power.
        The disparity of the current pixel is the difference between
        the current pixel position in wr and in wl.
    Slide to the next pixel in the right image; back to For each.
5.1.4.2. Applying the median filter:
We refine the disparity map by applying the median filter in order to satisfy the continuity constraint (Figure 5.29).

5.1.5. Stage 5, The 3D Model
We now have the depth, or third dimension, so we can build a complete 3D model of the scene using the image pair and the depth (disparity) map (Figure 5.30).

Figure 5.29: The final disparity map of our algorithm.
Figure 5.30: The 3D model obtained from the final disparity map.

5.2. Experimental Results


In this section we introduce the image acquisition techniques and devices used to collect our dataset, and then present the results of the enhanced shape from stereo system.
5.2.1 Image Acquisition Techniques
5.2.1.1. Stereo Images using the Object Registration Device
We used the Object Registration device to collect most of our dataset. The Object Registration device takes a stereo pair of an object so that it can be reconstructed into a 3D model. We can take more than one image of the same view; by identifying the zoom properties of each camera and specifying the distance between the pair of cameras, we can easily determine all the parameters needed to take a perfect stereo pair.

Figure 5.31: The Object Registration device (camera and object). Source: Informatics Research Institute (IRI), Mubarak City for Scientific Research and Technology Applications, Alexandria, Egypt.
Here we can use one camera instead of two to take the stereo pair. The idea is to take the first picture of the view, then move the camera using the moving parts of the Object Registration device (notice the arrows in Figure 5.31). The camera position has now changed, and we can easily take another image of the same view.
The device can interface with a personal computer through the serial port; dedicated software performs the interfacing and lets us control the movements of the different parts of the device. We can also connect the camera to the computer so that the images are transferred to the PC instantly.
The software works on Windows or Apple Macintosh systems. The device has a large rotating table (not indicated in Figure 5.31) on which multiple objects can be placed; the rotating table is carried on the axis that carries the object, as shown in Figure 5.31.
5.2.1.2. Stereo Images Using an Aircraft:
The images from an aircraft are pairs of aerial photos: a camera on board an aircraft takes pictures of the Earth at different times, and thus from different positions, Figure 5.32 and Figure 5.33(a).

5.2.2 Results of the Shape from Stereo System
To make the geographical data in distributed GIS available in both 2D and 3D formats, we use the traditional shape from stereo technique to reconstruct the available 2D images into a 3D model of the required scene or area. We use images in stereo pairs (Figure 5.33), taken from satellite and from the Object Registration device.
Figure 5.32: Aircraft stereo images.


Figure 5.33: Stereo image pairs: (a) Pentagon satellite image pair, (b) Tsukuba stereo pair, (c) synthetic stereo image pair of a house, (d) outdoor scene of a tree taken as a stereo pair.
Table 5.1 lists the stereo image pairs and their properties.

Image Name | Size      | Color | Type                     | Source
Pentagon   | 512 x 512 | gray  | Stereo satellite images  | Stereo image library, http://vasc.ri.cmu.edu/idb
Tsukuba    | 288 x 384 | gray  | Standard stereo scene    | Stereo image library, http://vasc.ri.cmu.edu/idb
House      | 250 x 250 | gray  | Synthetic stereo scene   | Stereo image library, http://vasc.ri.cmu.edu/idb
Tree       | 233 x 256 | gray  | Outdoor stereo scene     | Object Registration Device

Table 5.1: The stereo images and their properties
The matching process is divided into feature-based matching and area-based matching; the area-based stage depends entirely on the feature-based stage, because the output of feature-based matching is delivered to the area-based method to create a denser 3D model.
5.2.2.1. Feature-based Correspondence:
The feature-based method starts by extracting the features from both images; the features are regions. The matching algorithm finds the properties of each region (area, centroid, orientation, contour, etc.) and then performs the matching using these properties. We extracted regions using histogram segmentation, and multiresolution images were then created using the Gaussian convolution filter (Table 5.2); these are used to find the search range for each pixel in the area-based matching procedure.

Image Name | High Resolution | Medium Resolution | Low Resolution
Pentagon   | 512 x 512       | 256 x 256         | 128 x 128
Tsukuba    | 288 x 384       | 144 x 192         | 72 x 96
House      | 250 x 250       | 125 x 125         | 62 x 62
Tree       | 233 x 256       | 116 x 128         | 58 x 64

Table 5.2: The multiresolution images
We used the low-resolution images: the morphology operator is applied to remove small regions that may incur errors, and the images are then resized back to the high-resolution size.

Now the regions are ready for the matching procedure. A region in the right image is selected and matched against all the regions in the left image. Of course this increases time and reduces accuracy: for a single region we may find multiple regions with the same properties, so which region is the best one for the matching? This error is called the one-to-many correspondence error. The errors of the matching process come from the following:
1. The number of occluded regions:
The occluded regions are regions that appear in one image only. When we take stereo photos, the images are taken from different positions, so some objects may be hidden in one image and appear in the other, and vice versa (Figure 5.34). These regions contribute to the errors: when we try to match an occluded region, we find no match for it, because an occluded region appears in only one of the stereo image pair. Occluded regions may also arise from errors in the segmentation. The algorithm may match an occluded region erroneously, which of course increases the number of matching errors.

2. The one-to-many correspondence:
This arises when a single region has multiple candidate regions for matching. The algorithm is confused about which region to select; it may, for example, select the first one, which is unfortunately not the correct region. This error is called the mismatching error; it affects the smoothness of the created 3D model, since mismatching assigns regions in a random way that violates the continuity constraint of the created 3D model. Figure 5.37(c) shows mismatching that results in heights which do not exist in the real world.
Figure 5.34: The Tsukuba stereo image pair of Figure 5.33(b) after segmentation, showing the occluded regions.
Now we show how the number of occluded regions affects the accuracy. Table 5.3 shows the results of applying our algorithm to the stereo images in Figure 5.33.

                   | Pentagon | Tsukuba | House | Tree
Regions (left)     | 93       | 31      | 22    | 46
Regions (right)    | 110      | 33      | 20    | 60
Occluded regions   | 17       | 2       | 2     | 14
% Occluded regions | 15 %     | 6 %     | 9 %   | 23 %
Mismatches         | 21       | 4       | 3     | 12
% Mismatches       | 19 %     | 12 %    | 13 %  | 20 %
True matches       | 72       | 27      | 17    | 34
Accuracy           | 65 %     | 82 %    | 72 %  | 56 %

Table 5.3: Results of applying feature-based matching to the stereo images

- Occluded regions = | Regions(Left) - Regions(Right) |
- % Occluded regions = Occluded regions / Max(Regions(Left), Regions(Right)) * 100
- Mismatches = number of multiply-assigned regions (calculated within the algorithm)
- % Mismatches = Mismatches / Max(Regions(Left), Regions(Right)) * 100
- True matches = Max(Regions(Left), Regions(Right)) - Occluded regions - Mismatches
- Accuracy = True matches / Max(Regions(Left), Regions(Right)) * 100
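These definitions can be checked directly; a small sketch using the Pentagon numbers from Table 5.3:

    def matching_metrics(regions_left, regions_right, mismatches):
        occluded = abs(regions_left - regions_right)
        total = max(regions_left, regions_right)
        true_matches = total - occluded - mismatches
        return {"occluded %": 100 * occluded / total,
                "mismatch %": 100 * mismatches / total,
                "accuracy %": 100 * true_matches / total}

    # Pentagon: 93 regions on the left, 110 on the right, 21 mismatches
    # -> about 15 % occluded, 19 % mismatches, 65 % accuracy, as in Table 5.3.
    print(matching_metrics(93, 110, 21))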

From Table 5.3 and Figure 5.35 we find that the number of occluded regions directly affects the accuracy: when the number of occluded regions increases, the number of mismatches increases, and consequently the accuracy decreases; with fewer or no occluded regions the accuracy is maximized.

Figure 5.36 shows the accuracy of creating 3D models from the images in Figure 5.33 after applying the feature-based matching procedure.

Figure 5.36: The accuracy of the feature-based matching procedure when applied to the images in Figure 5.33.
Figure 5.35: The relation between the occluded regions and the accuracy.
Notice that the errors of feature matching affect the performance of area-based matching, since area-based matching depends on feature-based matching.

Figure 5.37 shows the 3D models of the images in Figure 5.33 resulting from the feature-based matching procedure. The figure shows a large number of errors that affect the continuity of the model. The density of the created model is also very low, since we matched features only; to create a denser model, the feature-based matching must be followed by area-based matching.

5.2.2.2. Area-based Correspondence:

Figure 5.37: 3D models for (a) Pentagon, (b) Tsukuba, (c) house, (d) tree after applying the feature-based stereo matching technique.

The area-based correspondence should follow the feature-based correspondence to create denser 3D models. The correspondence is now made pixel by pixel. A window
around each pixel is created, and the matching is done window by window. The results of the feature-based matching are used to get the search range, which focuses the search for the matched pixel or window. The search range is defined by identifying its center and its size. The results of applying area-based matching with a fixed search range and window size after feature-based matching are shown in Table 5.4, Figure 5.38 and Figure 5.39.

              | Pentagon | Tsukuba | House   | Tree
Search Range  | 12       | 12      | 12      | 12
Window Size   | 3        | 3       | 3       | 3
Total Matches | 1048576  | 1048576 | 1048576 | 1048576
True Matches  | 702545   | 880803  | 786432  | 597688
Mismatches    | 346030   | 167772  | 262135  | 450887
Accuracy      | 67 %     | 84 %    | 75 %    | 57 %

Table 5.4: Results of applying area-based matching to the stereo images with fixed window sizes

Figure 5.38: The accuracy of the feature-based matching procedure followed by the area-based matching procedure when applied to the images in Figure 5.33.

Comparing the results of Figure 5.36 and Figure 5.38, we find that the created model is denser with area-based matching than with feature-based matching alone. Note also that the continuity constraint is satisfied by the area-based matching, since there is no abrupt change in heights; in other words, the 3D model is smoother when created using area-based matching than with the feature-based models.
Figure 5.39: 3D models for (a) Pentagon, (b) Tsukuba, (c) house, (d) tree after applying the feature-based stereo matching followed by the area-based stereo matching.

5.2.3. Results of the Enhanced Shape from Stereo System
In order to increase the accuracy of the matching process, i.e. to reduce the mismatches, we used the adaptive window approach in both feature-based and area-based matching. This lets us avoid searching the entire image for the matched pixel or region; instead the search is focused on the features or pixels of interest. The size of the window is derived from the image data. This affects the performance of the algorithm in two ways:
1. the one-to-many correspondence is reduced, and the number of mismatches with it, because we search in a specified window instead of the entire image;
2. the matching time is decreased, since the search time is reduced, which reduces the overall matching time.
So we expect a 3D model with high accuracy and a short matching time.
5.2.3.1. Feature-based Correspondence:
We used the results from the optical flow to decide the best window size for each stereo image pair. Table 5.5 shows the results of our approach after applying the adaptive window procedure to the images of Figure 5.33.
                   | Pentagon  | Tsukuba   | House     | Tree
Window Size        | 80 pixels | 26 pixels | 35 pixels | 29 pixels
Regions (left)     | 93        | 31        | 22        | 46
Regions (right)    | 110       | 33        | 20        | 60
Occluded regions   | 17        | 2         | 2         | 14
% Occluded regions | 15 %      | 6 %       | 9 %       | 23 %
Mismatches         | 9         | 1         | 1         | 6
% Mismatches       | 8 %       | 3 %       | 4 %       | 10 %
True matches       | 84        | 30        | 19        | 40
Accuracy           | 76 %      | 90 %      | 86 %      | 66 %

Table 5.5: Results of feature-based matching with the adaptive window applied to the stereo images
From Table 5.5 we find that the accuracy increased, due to the reduction of mismatches resulting from the adaptive window approach. A proper window size is calculated for each image pair using the optical flow of the pair.
Figure 5.40 and Figure 5.41 show the enhancements in accuracy achieved by the proposed adaptive window approach.

Figure 5.41: 3D models for (a) Pentagon, (b) Tsukuba, (c) house, (d) tree after applying feature-based matching with our adaptive window approach.
Figure 5.40: The enhancements in accuracy due to the adaptive window approach in feature-based matching; gray bars represent the accuracy before applying the adaptive window, black bars the accuracy after applying our proposed adaptive-window approach.

5.2.3.2. Area-based Correspondence:
The problem with window-based correspondence is selecting a proper window size. If the correlation windows are too small, the intensity variation within them is not distinctive enough and many false matches may occur; if they are too large, resolution is lost and, again, many false matches are obtained. Table 5.6 and Table 5.7 show the results of applying multiple window sizes and search ranges to the Pentagon stereo pair.
Search Range | Window Size | Total Matches | True Matches | Mismatches | Accuracy
6            | 3           | 524288        | 424673       | 99614      | 81 %
9            | 3           | 786432        | 684196       | 102236     | 87 %
12           | 3           | 1048576       | 964689       | 83877      | 92 %
15           | 3           | 1310720       | 1153433      | 157286     | 88 %
18           | 3           | 1572864       | 1368391      | 204472     | 87 %
21           | 3           | 1835008       | 1468006      | 367002     | 80 %
24           | 3           | 2097152       | 1488977      | 608174     | 71 %
27           | 3           | 2359296       | 1368391      | 990904     | 58 %
30           | 3           | 2621440       | 1179684      | 1441792    | 45 %
33           | 3           | 2883584       | 1066926      | 1816657    | 37 %
36           | 3           | 3145728       | 1038090      | 2107637    | 33 %
39           | 3           | 3407872       | 886046       | 2521825    | 26 %
42           | 3           | 3670016       | 807403       | 2862612    | 22 %

Table 5.6: Results of applying multiple search range sizes with a fixed window size on the Pentagon images

Search Range | Window Size | Total Matches | True Matches | Mismatches | Accuracy
21           | 3           | 1835008       | 1468006      | 367002     | 80 %
21           | 9           | 611669        | 538268       | 73400      | 88 %
21           | 12          | 458752        | 417464       | 41287      | 91 %
21           | 15          | 367001        | 154140       | 212860     | 42 %
21           | 18          | 305834        | 110100       | 195733     | 36 %

Table 5.7: Results of applying multiple window sizes with a fixed search range on the Pentagon images

Figure 5.42: The relation between the search range size and the matching accuracy.
Figure 5.43: The relation between the window size and the matching accuracy.

From Figures 5.42 and 5.43 we see that a relatively small search range and window size produces an increasing number of errors, and so does a large search range and window size. There is a region on each curve that represents the most suitable size of the search range and window; it is found by trial and error for each application, and once found, it is applied to the entire image as a fixed size.

Instead, we propose a new technique that computes the most suitable window size automatically for each part of the image; we call it the adaptive search range and adaptive window size.
The information in the image is used to decide what the most suitable window size is. We used two types of adaptive window:
- the adaptive search range size;
- the adaptive window size.
The algorithm uses the multiresolution images to decide the most suitable search range: the search range increases in regions of the image with high intensity variety, to detect abrupt changes in intensity, while in regions like the background a smaller search range is used.

For the window size, we use the information from the wavelet transform: when a pixel arises as an edge pixel in all levels of the wavelet filter, we are sure it is an edge pixel and assign it a small window size; when the pixel appears in some levels and is hidden in others, we increase its window size.

Figure 5.44 shows how the search range and window size are calculated from the image intensity profile. A change in image intensity makes the search range large, while in background areas the search range is small. For the window size, edge pixels get a reduced window size, while in background areas the window size is increased.
Figure 5.44: The search range and window size are calculated from information about the intensity variance in the image.

Table 5.8 and Figure 5.45 show the results of area-based stereo matching after applying our proposed adaptive window approach.

              | Pentagon | Tsukuba  | House    | Tree
Search Range  | adaptive | adaptive | adaptive | adaptive
Window Size   | adaptive | adaptive | adaptive | adaptive
Total Matches | 1168342  | 811547   | 414390   | 724114
True Matches  | 934673   | 762854   | 364663   | 506879
Mismatches    | 233668   | 48692    | 49726    | 217235
Accuracy      | 80 %     | 94 %     | 88 %     | 70 %

Table 5.8: The accuracy of the area-based matching procedure after applying the proposed adaptive window approach

Figure 5.46 shows the 3D models of the stereo images in Figure 5.33 after applying our proposed adaptive window approach, illustrating how the accuracy increased. This also decreases the matching time, and the continuity constraint is satisfied, giving smoother 3D models.
Figure 5.45: The enhancements in accuracy due to the adaptive window approach in area-based matching; gray bars represent the accuracy before applying the adaptive-window-based matching, black bars the accuracy after applying our proposed adaptive-window approach.
Figure 5.46: 3D models for (a) Pentagon, (b) Tsukuba, (c) house, (d) tree after applying the feature-based stereo matching followed by the area-based stereo matching using the proposed adaptive window approach.

5.3. Summary
This chapter explored our proposed enhanced shape from stereo system. The system starts by extracting the image segments, or features, from the pair of stereo images using histogram segmentation. A correspondence (matching) operation, called feature-based matching, is then performed between the two images; it must be followed by a denser matching operation done pixel by pixel, called area-based correspondence.
The results of our enhanced shape from stereo system show an improvement of about 11 % in accuracy over the traditional shape from stereo system. We proposed a new technique for solving the correspondence problem, called the adaptive window technique, in which the correlation window size is calculated automatically from the image data.
CHAPTER 6

CONCLUSIONS AND FUTURE WORK

6.1. Conclusions
6.1.1. 3D GIS in a Network Environment
To make the geographical data available in both 2D and 3D, we used the enhanced shape from stereo technique to convert 2D images to 3D models.
When we need to view the geographical data, we can view it as 2D images using the existing images, which are organized in image databases. We can also view it in 3D using our enhanced shape from stereo system, which converts the existing 2D image data to 3D models; the result is cached for further requests for the same model. When 3D data is requested, the cached data is searched first, and if the required 3D model does not exist, the enhanced shape from stereo system is used to convert the 2D images to a 3D model.
The databases of our system are image databases, i.e. collections of stereo image pairs available to the user in an archive. Our system is applied to this archive after extracting the images that suit the user's request. The processing that extracts the third dimension from the images can be done on the server side, with the 3D map then delivered to the user. The image database can be populated with images from satellites or aircraft and used as a store for the stereo image pairs.
6.1.2. Enhanced Shape from Stereo System
The shape from stereo system uses a pair of images and matches them to extract the disparity, which represents the depth of each object in the image.
The traditional shape from stereo system solves the correspondence problem using area-based and feature-based matching procedures in which the matched feature is searched for in the entire image. Of course this increases the time and reduces the accuracy. We applied the traditional shape from stereo technique to many stereo image pairs (Pentagon, Tsukuba, House and Tree) and found a large number of errors in the matching process: the accuracy was 67 % for Pentagon, 84 % for Tsukuba, 75 % for House, and 57 % for Tree.
The need for a new technique to increase the accuracy arises.
So we proposed the Enhanced Shape from Stereo system, in which the correspondence (matching) problem is solved using a new technique called the adaptive-window technique, used in both the feature-based and area-based matching procedures. During matching we search in a specified window instead of the entire image; this reduces the mismatching error and the one-to-many assignment error. The mismatching error was reduced from 19 % to 8 % for Pentagon, from 12 % to 3 % for Tsukuba, from 13 % to 4 % for House, and from 20 % to 10 % for Tree.
The results show that a very small search window increases the matching error, and so does a very large one; so instead of assigning a fixed window size for the search, we proposed the adaptive window technique, in which the search window size is calculated from the information in the image.
When we applied the Enhanced Shape from Stereo system, including our proposed adaptive window technique, the accuracy increased from 67 % to 80 % for the Pentagon stereo pair, from 84 % to 94 % for Tsukuba, from 75 % to 88 % for House, and from 57 % to 70 % for Tree (Table 5.8); in other words, we increased the accuracy by about 11 %. The Enhanced Shape from Stereo system thus creates 3D models from 2D images with high accuracy; the matching errors are minimized, so the 3D models can be distributed over computer networks with high accuracy and low processing time.

6.2. Future Work
The use of computer vision in the distributed GIS field is promising. Many applications need to monitor land areas remotely, for example by sending aircraft over the area of interest with a pair of cameras installed, to collect the data in 3D format rather than 2D. The collected data may, for example, need to be analyzed with respect to high and low areas and then distributed to different terminals; this kind of application can be used in military areas, and navies can use the same application.
We are going to search for strong and efficient methods of creating 3D models. In this case we will not use a pair of images; instead we will use a pair of video streams, and our methods will handle each pair of frames in the streams to create online 3D models of the scanned areas.
To create a 3D model from video frames, we are concerned with the matching or correspondence problem. The important factor in the matching process is the errors, or false matches, that come from the matching constraints, so future work will concentrate on reducing such errors.
We will use feature matching with new types of features, such as line segments, corners and circles, instead of regions only, and we will also try different kinds of segmentation for extracting features from images.
Reducing the search area for a matched feature reduces time; consequently, our algorithm can be used in applications that need low latency, like the video streams or consecutive video frames mentioned above. The algorithm can also be applied to robots: robots take stereo pairs of video frames and time is important, so we can match each pair of video frames taken at the same time.

References:
[1] Heath A. James, Kenneth A. Hawick, Paul D. Coddington "Scheduling in
Metacomputing Systems, Distributed GIS ",Department Of Computer Science
University of Adelaide, 1999.

[2] Ankit Africawala, Juliana Castillo, Janis Schubert," 3D GIS Technology Assessment
Report", GIS management and implementation GISC/POEC 6383, Oct 30, 2003

[3] Volker Coors "3D-GIS IN NETWORKING ENVIRONMENTS" Fraunhofer Institute
for Computer Graphics, Germany, 2002

[4] Mark J. Carlotto "Shape from Shading", website content
http://www.newfrontiersinscience.com/martianenigmas/Articles/SFS/sfs.html ; last
updated Oct. 15, 1996.

[5] Angeline M. Loh "The recovery of 3D structure using visual texture patterns" ,PhD
thesis, February 2006

[6] Helmut Cantzler "An overview of shape-from-motion" , School of Informatics
University of Edinburgh, 2003

[7] Ron Kimmel, "3D Shape Reconstruction from Autostereograms and Stereo," Computer Science Department, Technion, Haifa 32000, July 13, 2000.

[8] Bob Fisher, "Shape from stereo vision" School of Informatics University of Edinburgh
Room 2107D James Clerk Maxwell Building The King's Buildings Mayfield Road
Edinburgh EH9 3JZ UK, 2004

[9] S. D. Cochran, G. Medioni, "3-D Surface Description From Binocular Stereo," IEEE
Trans. PAMI, vol. 14, no. 10, pp. 981-994, Oct. 1992.


[10] K. L. Boyer and A. C. Kak, "Structural Stereopsis for 3-D Vision," IEEE Trans. PAMI,
vol. 10, no. 2, pp. 144-166, Mar. 1988.

[11] S. Lacroix, I.-K. Jung, A. Mallet, "Digital elevation map building from low altitude stereo imagery," LAAS/CNRS, Toulouse Cedex, France, 2002.

[12] Emanuele Trucco, Alessandro Verri "Introductory techniques for 3-D computer
vision", Prentice Hall.,1998

[13] M. Hatzitheodorou, E. A. Karabassi, G. Papaioannou, A. Boehm and T. Theoharis, "Stereo matching using optical flow," The American College of Greece, Deree College, Athens, Greece, Academic Press, 2000.

[14] Kyu-Phil Han, Kun-Woen Song, Eui-Yoon Chung, Seok-Je Cho, Yeong-Ho Ha,
"Stereo matching using genetic algorithm with adaptive chromosomes ", School of
Computer and Software engineering Kumoh Nat'l University, Kumi 730-701, South
Korea, 2001

[15] Kyu-Phil Han, Tae-Min Bae, Yeong-Ho Ha, " Hybrid stereo matching with new
relaxation scheme of preserving disparity discontinuity", School of Electronics and
Electrical Eng. Kyngpook Nat'l University, South Korea ,2000.

[16] G. Wei, W. Brauer, and G. Hirzinger, "Intensity- and Gradient-Based Stereo Matching
Using Hierarchical Gaussian Basis Functions," IEEE Trans. PAMI, vol. 20, no. 11, pp.
1143-1160, Nov. 1998.

[17] T. Kanade and M. Okutomi, "A Stereo Matching Algorithm with an Adaptive
Window: Theory and Experiment," IEEE Trans. PAMI, vol. 16, no. 9, pp. 920-932, Sep.
1994.

[18] C. Sun, "A Fast Stereo Matching Method," pp. 95-100, Digital Image Computing: Techniques and Applications, Massey University, Auckland, New Zealand, Dec. 1997.

[19] Robyn Owens "Shape from stereo vision "The University of Western Australia ,35
Stirling Highway Crawley, WA 6009, Australia , 1997

[20] C. Baillard, H. Maitre, " 3D Reconstruction of Urban Scenes from Aerial Stereo
Imagery: A Focusing Strategy" ENST, Department TSI, Paris, France,1999

[21] O. Faugeras, "Three-Dimensional Computer Vision; A Geometric Viewpoint," pp.
189-196, Massachusetts Institute of Technology, 1993.

[22] Bart Kuijpers "Spatial and Spatio-Temporal Data Models for GIS" Limburgs
Universitair Centrum,2005

[23] F. Escobar, G. Hunter, I. Bishop, A. Zerger "Introduction to GIS, GIS Data Models"
http://www.sli.unimelb.edu.au/gisweb/ Department of Geomatics, The University of
Melbourne, 2001

[24] Xiannong Meng, Richard Fowler " Bridging the Gap Between GIS and the WWW"
Department of Physics and Geology, University of Texas - Pan American, Edinburg, TX
78539-2999 ,2003

[25] Panjetty Kumaradevan, Senthil Kumar " Virtual Reality and Distributed GIS" Iowa
State University, 246 N Hyland Aprt # 309, Ames, Iowa 50014, 2004

[26] S. E. Umbaugh, "Computer Vision and Image Processing," pp. 125-130, Prentice Hall,
1998.

[27] John C. Russ "The IMAGE PROCESSING Handbook " third edition, Materials
Science and Engineering Department, North Carolina State University Raleigh, North
Carolina, 1999.


[28] Earth Sciences Sector of Natural Resources, Canada, Information available at their
website http://ess.nrcan.gc.ca/index_e.php, Last updated 2006.

[29] Gabriel Fielding, Moshe Kam, "Disparity maps for dynamic stereo", Data fusion
laboratory, Department of Electrical and Computer Engineering, Drexel University,
3141 Chestnut st. Philadelphia, PA 19104, USA, 1999

[30] M. Okutomi and T. Kanade, "A Multiple-Baseline Stereo," IEEE Trans. PAMI,
vol. 15, no. 4, pp. 353-363, Apr. 1993.

[31] Han-Suh Koo and Chang-Sung Jeong, "An Area-Based Stereo Matching Using
Adaptive Search Range and Window Size", Department of Electronics Engineering,
Korea University, Korea, 2001

[32] Shing-Huan Lee and Jin-Jang Leou, "A dynamic programming approach to line
segment matching in stereo vision," Pattern Recognition, vol. 27, no. 8, pp. 961-986,
1994.

[33] S. B. Maripane and M. M. Trivedi, "Multi-Primitive Hierarchical(MPH) Stereo
Analysis," IEEE Trans. PAMI, vol. 16, no. 3, pp. 227-240, Mar. 1994.

[34] Sudhir R Kaushik "Antialiasing Techniques" Department of Computer Science at the
Worcester Polytechnic Institute, 2001

[35] Frédo Durand, "A Short Introduction to Computer Graphics," MIT Laboratory for Computer Science, 1999.

[36] Bruce D. Lucas, T. Kanade "An Iterative Image registration Technique with
Application to Stereo Vision" Computer Science Department, Carnegie-Mellon
University, Pittsburgh, Pennsylvania 15213, Proceedings of Imaging Understanding
Workshop, pp. 121-130 (1981).