
SEMINAR PROJECT TITLED

Picture Based Query Processing



Submitted
in partial fulfillment of
the requirements for the Degree of

Master of Computer Applications
(MCA)


Sonia Bhagwat
Roll No : 122011041
Under the guidance of
Prof. Ratna Karmakar








Department Of Computer Technology
Veermata Jijabai Technological Institute
(Autonomous Institute, Affiliated To University of Mumbai)
Mumbai 400019
Year 2013-2014
VEERMATA JIJABAI TECHNOLOGICAL INSTITUTE
MATUNGA, MUMBAI 400019


CERTIFICATE


This is to certify that the seminar report titled

Picture Based Query Processing
has been completed successfully
By

Sonia Bhagwat
Roll No. 122011041
Class: MCA-IV
in Academic year 2013-2014




Evaluator : Prof. Ratna Karmakar

Date : 7th May,2014

INDEX

1. Introduction
2. History of Image Retrieval
3. Content Based Image Retrieval (CBIR)
4. Applications of CBIR
5. CBIR Techniques
5.1 Query Techniques
5.2 Semantic Retrieval
5.3 Other Query Techniques
5.4 Color
5.5 Texture
5.6 Shape
6. CBIR Architecture
7. Sample Query
8. Feature Extraction
9. Existing CBIR Systems
10. Problem Statement With Proposed Solution
11. Conclusion
12. References





Introduction

This report covers picture based query processing, i.e.
the retrieval of images from a database based on the content of the
images rather than on metadata such as text associated with the
images. The method used for this is Content Based Image
Retrieval (CBIR).
Very large collections of images are growing ever more
common. From stock photo collections to proprietary databases to
the Web, these collections are diverse and often poorly indexed;
unfortunately, image retrieval systems have not kept pace with the
collections they are searching. The shortcomings of these systems
are due both to the image representations they use and to their
methods of accessing those representations to find images.






History of Image Retrieval

An image retrieval system is a computer system for
browsing, searching and retrieving images from a
large database of digital images. Most traditional and common
methods of image retrieval utilize some method of adding
metadata such as captions, keywords, or descriptions to the
images so that retrieval can be performed over the annotation
words. Manual image annotation is time-consuming, laborious
and expensive; to address this, there has been a large amount of
research done on automatic image annotation. Additionally, the
increase in social web applications and the semantic web have
inspired the development of several web-based image annotation
tools.

The traditional paradigm for the retrieving of images is
based on keyword annotation. In this approach, human experts
manually annotate each image with a textual description, so that
text-based information retrieval techniques can be applied. This
approach has the advantage of inheriting efficient technologies
developed for text retrieval, but is clearly impracticable for the
case of very large image DBs. Moreover, its effectiveness highly
depends on the subjective opinions of the annotators, who are
also likely to supply different descriptions for the same image.




Content Based Image Retrieval

Content-based image retrieval (CBIR), also known
as query by image content (QBIC) and content-based visual
information retrieval (CBVIR), is the application of computer
vision techniques to the image retrieval problem, that is, the
problem of searching for digital images in large databases.
Content-based image retrieval stands in contrast to concept-based
approaches.

"Content-based" means that the search analyzes the
contents of the image rather than the metadata such as keywords,
tags, or descriptions associated with the image. The term
"content" in this context might refer to colors, shapes, textures, or
any other information that can be derived from the image itself.
CBIR is desirable because most web-based image search engines
rely purely on metadata and this produces a lot of garbage in the
results. Also having humans manually enter keywords for images
in a large database can be inefficient, expensive and may not
capture every keyword that describes the image. Thus a system
that can filter images based on their content would provide better
indexing and return more accurate results.

The term "content-based image retrieval" seems to have
originated in 1992 when it was used by T. Kato to describe
experiments into automatic retrieval of images from a database,
based on the colors and shapes present. Since then, the term has
been used to describe the process of retrieving desired images
from a large collection on the basis of syntactical image features.
The techniques, tools, and algorithms that are used originate from
fields such as statistics, pattern recognition, signal processing,
and computer vision.




















Applications of CBIR

There is a growing interest in CBIR because of the
limitations inherent in metadata-based systems, as well as the
large range of possible uses for efficient image retrieval. Textual
information about images can be easily searched using existing
technology, but this requires humans to manually describe each
image in the database. This is impractical for very large databases
or for images that are generated automatically, e.g. those
from surveillance cameras. It is also possible to miss images that
use different synonyms in their descriptions. Systems based on
categorizing images in semantic classes like "cat" as a subclass of
"animal" avoid this problem but still face the same scaling issues.
Potential uses for CBIR include:
Architectural and engineering design
Art collections
Crime prevention
Geographical information and remote sensing systems
Intellectual property
Medical diagnosis
Military
Photograph archives
Retail catalogs


CBIR Techniques
Many CBIR systems have been developed, but the
problem of retrieving images on the basis of their pixel content
remains largely unsolved.
Query techniques:
Different implementations of CBIR make use of
different types of user queries. Query by example is a query
technique that involves providing the CBIR system with an
example image that it will then base its search upon. The
underlying search algorithms may vary depending on the
application, but result images should all share common elements
with the provided example.
Options for providing example images to the system include:
A preexisting image may be supplied by the user or chosen
from a random set.
The user draws a rough approximation of the image they are
looking for, for example with blobs of color or general shapes [3].

This query technique removes the difficulties that can arise when
trying to describe images with words.
Semantic retrieval:
The ideal CBIR system from a user perspective would
involve what is referred to as semantic retrieval, where the user
makes a request like "find pictures of Abraham Lincoln". This type
of open-ended task is very difficult for computers to perform -
pictures of chihuahuas and Great Danes look very different, and
Lincoln may not always be facing the camera or in the same pose.
Current CBIR systems therefore generally make use of lower-level
features like texture, color, and shape, although some systems
take advantage of very common higher-level features like faces
(facial recognition system). Not every CBIR system is generic.
Some systems are designed for a specific domain, e.g. shape
matching can be used for finding parts inside a CAD-
CAM database.
Other query methods:
Other query methods include browsing for example
images, navigating customized/hierarchical categories, querying
by image region (rather than the entire image), querying by
multiple example images, querying by visual sketch, querying by
direct specification of image features, and multimodal queries
(e.g. combining touch, voice, etc.)
CBIR systems can also make use of relevance
feedback, where the user progressively refines the search results
by marking images in the results as "relevant", "not relevant", or
"neutral" to the search query, then repeating the search with the
new information.
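The feedback loop described above can be sketched with a Rocchio-style update, a classic formula from text retrieval used here as an illustrative assumption rather than the method of any particular CBIR system: the query's feature vector is pulled towards the centroid of images the user marked relevant and pushed away from the centroid of those marked not relevant.

```python
def refine_query(query, relevant, not_relevant,
                 alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style refinement: the new query vector moves towards
    the centroid of relevant images and away from the centroid of
    not-relevant ones. The weights alpha, beta, gamma are tunable."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(v[i] for v in vectors) / len(vectors)
                for i in range(len(query))]
    rel, nonrel = centroid(relevant), centroid(not_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel, nonrel)]

# The user marks one result relevant and one not relevant; the
# refined query moves towards the relevant image's features.
q = [0.5, 0.5]
print(refine_query(q, relevant=[[1.0, 0.0]], not_relevant=[[0.0, 1.0]]))
# prints [1.25, 0.25]
```

Repeating the search with the refined vector is what "repeating the search with the new information" amounts to in this sketch.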
Content comparison using image distance measures:
The most common method for comparing two images
in content-based image retrieval (typically an example image and
an image from the database) is using an image distance measure.
An image distance measure compares the similarity of two images
in various dimensions such as color, texture, shape, and others.
For example, a distance of 0 signifies an exact match with the
query, with respect to the dimensions that were considered;
values greater than 0 indicate progressively lower similarity
between the images. Search results
then can be sorted based on their distance to the queried image.
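A minimal sketch of such a distance measure, assuming each image has already been reduced to a numeric feature vector (the vectors and the names `img_a` and `img_b` are hypothetical):

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two feature vectors.
    0 means an exact match on the measured dimensions;
    larger values mean less similar images."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

# Hypothetical 3-bin colour features for a query and two database images.
query = [0.5, 0.3, 0.2]
db = {"img_a": [0.5, 0.3, 0.2],   # identical features
      "img_b": [0.1, 0.1, 0.8]}   # very different features

# Sort database images by distance to the query (most similar first).
ranked = sorted(db, key=lambda name: feature_distance(query, db[name]))
print(ranked)  # img_a ranks first: its distance to the query is 0
```

Real systems use many distance functions (histogram intersection, Earth Mover's Distance, etc.); Euclidean distance is only the simplest choice.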
Color:
Computing distance measures based on color similarity is
achieved by computing a color histogram for each image that
identifies the proportion of pixels within an image holding specific
values (that humans express as colors). Current research is
attempting to segment color proportion by region and by spatial
relationship among several color regions. Examining images
based on the colors they contain is one of the most widely used
techniques because it does not depend on image size or
orientation. Color searches will usually involve comparing color
histograms, though this is not the only technique in practice.
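The histogram idea above can be sketched in a few lines: quantise each channel into a small number of levels, count the proportion of pixels per bin, and compare two histograms with histogram intersection. The toy pixel lists and the choice of 4 bins per channel are assumptions for illustration.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Proportion of pixels falling in each quantised (R, G, B) bin.
    Each channel (0-255) is reduced to `bins` levels; because only
    proportions are kept, the feature ignores image size and
    orientation."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical histograms."""
    return sum(min(h1.get(k, 0), h2.get(k, 0)) for k in set(h1) | set(h2))

red_image = [(250, 10, 10)] * 100                     # toy all-red image
half_image = [(250, 10, 10)] * 50 + [(255, 255, 255)] * 50
print(histogram_intersection(color_histogram(red_image),
                             color_histogram(half_image)))  # prints 0.5
```

The two images share half their colour mass, hence a similarity of 0.5 regardless of how the pixels are arranged.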
Texture:
Texture measures look for visual patterns in images and
how they are spatially defined. Textures are represented by
texels which are then placed into a number of sets, depending on
how many textures are detected in the image. These sets not only
define the texture, but also where in the image the texture is
located.
Texture is a difficult concept to represent. The identification of
specific textures in an image is achieved primarily by modeling
texture as a two-dimensional gray level variation. The relative
brightness of pairs of pixels is computed such that degree of
contrast, regularity, coarseness and directionality may be
estimated. However, the problem is in identifying patterns of co-
pixel variation and associating them with particular classes of
textures such as silky, or rough.
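The "relative brightness of pairs of pixels" idea can be illustrated with a much-simplified contrast statistic over horizontally adjacent grey levels; this is a sketch of the co-occurrence principle, not a full grey-level co-occurrence matrix implementation.

```python
def cooccurrence_contrast(image):
    """Mean squared grey-level difference over horizontally adjacent
    pixel pairs. `image` is a 2-D list of grey levels. High values
    mean large brightness swings between neighbours (rough texture);
    low values mean a smooth, homogeneous surface."""
    pairs = [(row[c], row[c + 1])
             for row in image for c in range(len(row) - 1)]
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

smooth = [[5, 5, 5, 5]] * 3            # uniform region
rough = [[0, 9, 0, 9]] * 3             # alternating bright/dark
print(cooccurrence_contrast(smooth))   # prints 0.0
print(cooccurrence_contrast(rough))    # prints 81.0
```

A full system would compute such statistics over several directions and distances, which is how directionality and coarseness are estimated.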



Shape:
Shape does not refer to the shape of an image but to
the shape of a particular region that is being sought out. Shapes
will often be determined by first applying segmentation or edge
detection to an image. Other methods use shape filters to
identify given shapes in an image. In some cases accurate shape
detection will require human intervention, because methods like
segmentation are very difficult to completely automate.
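A minimal sketch of the edge detection step mentioned above: mark a pixel as an edge when the grey-level difference to its next horizontal or vertical neighbour exceeds a threshold. Real systems use proper operators such as Sobel or Canny; this toy version only conveys the idea.

```python
def edge_map(image, threshold=10):
    """Mark a pixel as an edge when the grey-level difference to its
    right or lower neighbour exceeds `threshold`."""
    h, w = len(image), len(image[0])
    edges = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = abs(image[r][c + 1] - image[r][c]) if c + 1 < w else 0
            dy = abs(image[r + 1][c] - image[r][c]) if r + 1 < h else 0
            edges[r][c] = dx > threshold or dy > threshold
    return edges

# A bright square on a dark background: edges appear only where the
# square meets the background, outlining the region's shape.
img = [[0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(2, 4):
        img[r][c] = 100
print(sum(cell for row in edge_map(img) for cell in row))  # prints 7
```

The edge pixels trace the boundary of the region, which is the input a shape descriptor would then summarise.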









CBIR System






Sample Query









CBIR Architecture

Suppose the user wants to search for, say, rose images:
o He submits an existing rose picture as the query, or
o He submits his own sketch of a rose as the query.
The system will extract image features for this query.
It will compare these features with those of the other images in
the database.
Relevant results will be displayed to the user.
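The steps above can be sketched end-to-end. Mean colour stands in for the richer features a real CBIR system extracts, and the database of pre-extracted features with its file names is hypothetical.

```python
def extract_features(pixels):
    """Toy feature extractor: the mean R, G, B of the image."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def search(query_pixels, database, top_k=2):
    """The architecture above: featurise the query, compare against
    the stored features of every database image, return the closest
    matches first."""
    q = extract_features(query_pixels)
    def dist(name):
        return sum((a - b) ** 2 for a, b in zip(q, database[name]))
    return sorted(database, key=dist)[:top_k]

# Hypothetical database of pre-extracted mean-colour features.
database = {"rose.jpg": (200.0, 30.0, 60.0),
            "sky.jpg": (80.0, 120.0, 220.0),
            "tulip.jpg": (190.0, 40.0, 70.0)}
query = [(205, 25, 55)] * 10          # a reddish query image
print(search(query, database))        # prints ['rose.jpg', 'tulip.jpg']
```

Note that features for the database are computed once, offline; only the query is featurised at search time.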




















Feature Extraction

Primitive features
Mean color (RGB)
Color histogram
Semantic features
Color layout, texture, etc.
Domain-specific features
Face recognition, fingerprint matching, etc.


Mean Colour :

Pixel color information: R, G, B
Mean of a component (R, G or B) =
sum of that component over all pixels / number of pixels
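The formula above translates directly into code; the pixel list is a toy example.

```python
def mean_color(pixels):
    """Mean of each channel: the sum of that component over all
    pixels divided by the number of pixels."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)

# One pure-red and one pure-blue pixel average to a purple mean colour.
print(mean_color([(255, 0, 0), (0, 0, 255)]))  # prints (127.5, 0.0, 127.5)
```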

Colour Histogram:

Frequency count of each individual color
Most commonly used color feature representation


Colour Layout:

Need for Color Layout:
Global color features give too many false positives.
How it works:
Divide the whole image into sub-blocks.
Extract features from each sub-block.
Can we go one step further?
Divide into regions based on color feature
concentration.
This process is called segmentation.
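The sub-block step can be sketched as follows: split the image into a grid and keep the mean colour of each cell, so the feature retains coarse spatial layout that a single global statistic loses. The 2x2 grid and toy image are assumptions for illustration.

```python
def block_features(image, blocks=2):
    """Split a 2-D grid of (R, G, B) pixels into blocks x blocks
    sub-blocks and return the mean colour of each, in row-major
    order."""
    h, w = len(image), len(image[0])
    bh, bw = h // blocks, w // blocks
    feats = []
    for br in range(blocks):
        for bc in range(blocks):
            pix = [image[r][c]
                   for r in range(br * bh, (br + 1) * bh)
                   for c in range(bc * bw, (bc + 1) * bw)]
            n = len(pix)
            feats.append(tuple(sum(p[i] for p in pix) / n
                               for i in range(3)))
    return feats

# Top half red, bottom half blue: the global mean colour is the same
# purple for this image and its upside-down flip, but the per-block
# features tell them apart.
red, blue = (255, 0, 0), (0, 0, 255)
img = [[red] * 4] * 2 + [[blue] * 4] * 2
print(block_features(img))
```

This is why global colour features give false positives that a layout feature avoids: the spatial arrangement survives in the block vector.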








Texture:


Texture is an innate property of all surfaces:
clouds, trees, bricks, hair, etc.
It refers to visual patterns of homogeneity and
does not result from the presence of a single color.
The most widely accepted classification of textures, based on
psychological studies, is the Tamura representation:
Coarseness
Contrast
Directionality
Linelikeness
Regularity
Roughness















Existing CBIR Systems

Available CBIR software
Despite the shortcomings of current CBIR technology, several
image retrieval systems are now available as commercial
packages, with demonstration versions of many others available
on the Web. Some of the most prominent of these are described
below.
Commercial systems
QBIC : IBM's QBIC system is probably the best-known of all
image content retrieval systems. It is available commercially
either in standalone form, or as part of other IBM products such
as the DB2 Digital Library. It offers retrieval by any combination
of colour, texture or shape as well as by text keyword. Image
queries can be formulated by selection from a palette, specifying
an example query image, or sketching a desired shape on the
screen. The system extracts and stores colour, shape and texture
features from each image added to the database, and uses R*-tree
indexes to improve search efficiency [Faloutsos et al, 1994]. At
search time, the system matches appropriate features from query
and stored images, calculates a similarity score between the query
and each stored image examined, and displays the most similar
images on the screen as thumbnails. The latest version of the
system incorporates more efficient indexing techniques, an
improved user interface, the ability to search grey-level images,
and a video storyboarding facility. An online demonstration,
together with information on how to download an evaluation copy
of the software, is available on the World-Wide Web

Virage : Another well-known commercial system is the VIR
Image Engine from Virage, Inc. This is available as a series of
independent modules, which systems developers can build in to
their own programs. This makes it easy to extend the system by
building in new types of query interface, or additional customized
modules to process specialized collections of images such as
trademarks. Alternatively, the system is available as an add-on to
existing database management systems such as Oracle or
Informix. An on-line demonstration of the VIR Image Engine can
be found at http://www.virage.com/online/. A high-profile
application of Virage technology is AltaVista's AV Photo Finder
(http://image.altavista.com/cgi-bin/avncgi), allowing Web
surfers to search for images by content similarity. Virage
technology has also been extended to the management of video
data; details of their commercial Videologger product can be
found on the Web at
http://www.virage.com/market/cataloger.html.

Excalibur : A similar philosophy has been adopted by Excalibur
Technologies, a company with a long history of successful
database applications, for their Visual RetrievalWare product.
This product offers a variety of image indexing and matching
techniques based on the company's own proprietary pattern
recognition technology. It is marketed principally as an
applications development tool rather than as a standalone
retrieval package. Its best-known application is probably the
Yahoo! Image Surfer, allowing content-based retrieval of images
from the World-wide Web. Further information on Visual
RetrievalWare can be found at http://www.excalib.com/, and a
demonstration of the Yahoo! Image Surfer at
http://isurf.yahoo.com/. Excalibur's product range also includes
the video data management system Screening Room
(http://www.excalib.com/products/video/screen.html).
Experimental systems
A large number of experimental systems have been developed,
mainly by academic institutions, in order to demonstrate the
feasibility of new techniques. Many of these are available as
demonstration versions on the Web. Some of the best-known are
described below.
Photobook : The Photobook system from Massachusetts
Institute of Technology (MIT) has proved to be one of the most
influential of the early CBIR systems. Like the commercial
systems above, it aims to characterize images for retrieval by
computing shape, texture and other appropriate features. Unlike
these systems, however, it aims to calculate information-
preserving features, from which all essential aspects of the
original image can in theory be reconstructed. This allows features
relevant to a particular type of search to be computed at search
time, giving greater flexibility at the expense of speed. The system
has been successfully used in a number of applications, involving
retrieval of image textures, shapes, and human faces, each using
features based on a different model of the image. More recent
versions of the system allow users to select the most appropriate
feature type for the retrieval problem at hand from a wide range of
alternatives [Picard, 1996]. Further information on Photobook,
together with an online demonstration, can be found at
http://www-white.media.mit.edu/vismod/demos/photobook/.
Although Photobook itself never became a commercial product,
its face recognition technology has been incorporated into the
FaceID package from Viisage Technology
(http://www.viisage.com), now in use by several US police
departments.

Chabot: Another early system which has received wide publicity
is Chabot, which provided a combination of text-based and
colour-based access to a collection of digitized photographs held
by California's Department of Water Resources. The system has
now been renamed Cypress, and incorporated within the Berkeley
Digital Library project at the University of California at Berkeley
(UCB). A demonstration of the current version of Cypress (which
no longer appears to have CBIR capabilities) can be found at
http://elib.cs.berkeley.edu/cypress.html. Rather more impressive
is UCB's recently-developed Blobworld software, incorporating
sophisticated colour region searching facilities
(http://elib.cs.berkeley.edu/photos/blobworld/).

VisualSEEk : The VisualSEEk system [Smith and Chang, 1997a]
is the first of a whole family of experimental systems developed at
Columbia University, New York. It offers searching by image
region colour, shape and spatial location, as well as by keyword.
Users can build up image queries by specifying areas of defined
shape and colour at absolute or relative locations within the
image. The WebSEEk system [Smith and Chang, 1997b] aims to
facilitate image searching on the Web. Web images are identified
and indexed by an autonomous agent, which assigns them to an
appropriate subject category according to associated text. Colour
histograms are also computed from each image. At search time,
users are invited to select categories of interest; the system then
displays a selection of images within this category, which users
can then search by colour similarity. Relevance feedback facilities
are also provided for search refinement. For a demonstration of
WebSEEk in action, see
http://disney.ctr.columbia.edu/WebSEEk/. Further prototypes
from this group include VideoQ, a video search engine allowing
users to specify motion queries, and MetaSEEk, a meta-search
engine for images on the Web.
MARS : The MARS project is aimed at developing image retrieval
systems which put the user firmly in
the driving seat. Relevance feedback is thus an integral part of the
system, as this is felt to be the only way at present of capturing
individual human similarity judgements. The system
characterizes each object within an image by a variety of features,
and uses a range of different similarity measures to compare
query and stored objects. User feedback is then used to adjust
feature weights, and if necessary to invoke different similarity
measures.
Informedia : In contrast to the systems described above, the
Informedia project was conceived as a multimedia video-based
project from the outset. Its overall aims are to allow full content
search and retrieval of video by integrating speech and image
processing. The system performs a number of functions. It
identifies video scenes (not just shots) from analysis of colour
histograms, motion vectors, speech and audio soundtracks, and
then automatically indexes these video paragraphs according to
significant words detected from the soundtrack, text from images
and captions, and objects detected within the video clips. A query
is typically submitted as speech input. Thumbnails of keyframes
are then displayed with the option to show a sentence describing
the content of each shot, extracted from spoken dialogue or
captions, or to play back the shot itself. Many of the system's
strengths stem from its extensive evaluation with a range of
different user populations. Its potential applications include TV
news archiving, sports, entertainment and other consumer videos,
and education and training. The Informedia website is at
http://informedia.cs.cmu.edu/; the Mediakey Digital Video
Library System from Islip Media, Inc, a commercially-available
system based on Informedia technology, is at
http://www.islip.com/fprod.htm.
Surfimage : An example of European CBIR technology is the
Surfimage system from INRIA, France. This has a similar
philosophy to the MARS system, using multiple types of image
feature which can be combined in different ways, and offering
sophisticated relevance feedback facilities. See http://www-
syntim.inria.fr/htbin/syntim/surfimage/surfimage.cgi for a
demonstration of Surfimage in action.
Netra : The Netra system uses colour, texture, shape and spatial
location information to provide region-based searching based on
local image properties. An interesting feature is its use of
sophisticated image segmentation techniques. A Web
demonstration of Netra is available at
http://vivaldi.ece.ucsb.edu/Netra.
Synapse : This system is an implementation of retrieval by
appearance using whole image matching . A demonstration of
Synapse in action with a variety of different image types can be
found at http://cowarie.cs.umass.edu/~demo/.
Problem Statement With Proposed
Solution:
Problem Statement : Implementing CBIR for the image retrieval
problem to bring efficiency to the process.
Proposed Solution:
There is a need for CBIR.
Image retrieval does not entail solving the general image
understanding problem; it may be sufficient that a retrieval
system presents similar images, similar in some user-defined
sense.
Interaction with the user, e.g. through relevance feedback.
The need for a database of images and their extracted features.
Bridging the semantic gap between low-level image features and
high-level concepts.



Conclusion


This report presented an approach to contextualize image
queries which is able to effectively represent complex semantic
concepts by means of the notion of image context. Although this
approach is simple, it is indeed effective and requires neither a
prior classification of the image database nor the analysis of
surrounding text (e.g. image caption, text of the Web page
including the image, etc.), which might not always be available
with an image. Furthermore, the approach easily complements
available relevance feedback techniques, represents a good
starting point for interactive searches, and helps increase both
the effectiveness and efficiency of further rounds of retrieval.















References

Aigrain, P. et al (1996). Content-based representation and retrieval of visual media: a state-of-the-art
review. Multimedia Tools and Applications, 3(3), 179-202.
Alsuth, P. et al (1998). On video retrieval: content analysis by ImageMiner. In Storage and Retrieval for
Image and Video Databases VI, Proc. SPIE 3312, 236-247.
Androutsas, D. et al (1998). Image retrieval using directional detail histograms. In Storage and Retrieval
for Image and Video Databases VI, Proc. SPIE 3312, 129-137.
Ardizzone, E. and La Cascia, M. (1997). Automatic video database indexing and retrieval. Multimedia
Tools and Applications, 4, 29-56.
Long, F., Zhang, H. and Feng, D. D. Content Based Image Retrieval.
Long, F., Zhang, H. and Feng, D. D. An Effective Region-Based Image Retrieval Framework.
Rubner, Y., Tomasi, C. and Guibas, L. J. The Earth Mover's Distance as a Metric for Image Retrieval.
