This document is a report submitted by Sonia Bhagwat in partial fulfillment of the requirements for a Master of Computer Applications degree. The report discusses content-based image retrieval (CBIR), which involves retrieving images from a database based on the visual content of images rather than metadata. It provides an overview of the history of image retrieval, CBIR techniques, applications of CBIR, and existing CBIR systems. The report was completed under the guidance of Prof. Ratna Karmakar and submitted to the Department of Computer Technology at Veermata Jijabai Technological Institute.
Submitted in partial fulfillment of the requirements for the Degree of
Master of Computer Applications (MCA)
Sonia Bhagwat
Roll No : 122011041
Under the guidance of Prof. Ratna Karmakar
Department Of Computer Technology
Veermata Jijabai Technological Institute
(Autonomous Institute, Affiliated To University of Mumbai)
Mumbai 400019
Year 2013-2014
VEERMATA JIJABAI TECHNOLOGICAL INSTITUTE
MATUNGA, MUMBAI 400019
CERTIFICATE
This is to certify that the seminar report titled
Picture Based Query Processing has been completed successfully By
Sonia Bhagwat Roll No. 122011041 Class: MCA-IV in Academic year 2013-2014
Evaluator : Prof. Ratna Karmakar
Date : 7th May, 2014
INDEX
1. Introduction
2. History of Image Retrieval
3. Content Based Image Retrieval (CBIR)
4. Applications of CBIR
5. CBIR Techniques
   5.1 Query Techniques
   5.2 Semantic Retrieval
   5.3 Other Query Techniques
   5.4 Color
   5.5 Texture
   5.6 Shape
6. CBIR Architecture
7. Sample Query
8. Feature Extraction
9. Existing CBIR Systems
10. Problem Statement With Proposed Solution
11. Conclusion
12. References
Introduction
This report is about picture-based query processing, that is, the retrieval of images from a database based on the content of the images rather than on metadata such as the text associated with them. The method used for this is Content Based Image Retrieval (CBIR). Very large collections of images are becoming ever more common. From stock photo collections to proprietary databases to the Web, these collections are diverse and often poorly indexed; unfortunately, image retrieval systems have not kept pace with the collections they search. The shortcomings of these systems stem both from the image representations they use and from the methods they use to access those representations to find images.
History of Image Retrieval
An image retrieval system is a computer system for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval add metadata such as captions, keywords, or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this, a large amount of research has been done on automatic image annotation. Additionally, the growth of social web applications and the semantic web has inspired the development of several web-based image annotation tools.
The traditional paradigm for retrieving images is based on keyword annotation. In this approach, human experts manually annotate each image with a textual description, so that text-based information retrieval techniques can be applied. This approach has the advantage of inheriting efficient technologies developed for text retrieval, but it is clearly impracticable for very large image databases. Moreover, its effectiveness depends heavily on the subjective opinions of the annotators, who are likely to supply different descriptions for the same image.
Content Based Image Retrieval
Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based image retrieval is contrasted with concept-based approaches.
"Content-based" means that the search analyzes the contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because most web-based image search engines rely purely on metadata, which produces many irrelevant results. Also, having humans manually enter keywords for images in a large database can be inefficient and expensive, and may not capture every keyword that describes the image. Thus a system that can filter images based on their content would provide better indexing and return more accurate results.
The term "content-based image retrieval" seems to have originated in 1992 when it was used by T. Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision.
Applications of CBIR
There is a growing interest in CBIR because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to manually describe each image in the database. This is impractical for very large databases or for images that are generated automatically, e.g. those from surveillance cameras. It is also possible to miss images whose descriptions use synonyms of the search terms rather than the terms themselves. Systems based on categorizing images in semantic classes, like "cat" as a subclass of "animal", avoid this problem but still face the same scaling issues. Potential uses for CBIR include architectural and engineering design, art collections, crime prevention, geographical information and remote sensing systems, intellectual property, medical diagnosis, military applications, photograph archives, and retail catalogs.
CBIR Techniques
Many CBIR systems have been developed, but the problem of retrieving images on the basis of their pixel content remains largely unsolved.
Query techniques:
Different implementations of CBIR make use of different types of user queries. Query by example is a query technique in which the user provides the CBIR system with an example image on which it then bases its search. The underlying search algorithms may vary depending on the application, but the result images should all share common elements with the provided example. Options for providing example images to the system include:
A preexisting image may be supplied by the user or chosen from a random set.
The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes. [3]
This query technique removes the difficulties that can arise when trying to describe images with words.
Semantic retrieval:
The ideal CBIR system from a user perspective would involve what is referred to as semantic retrieval, where the user makes a request like "find pictures of Abraham Lincoln". This type of open-ended task is very difficult for computers to perform: pictures of the same kind of subject can look very different (as pictures of chihuahuas and Great Danes do), and Lincoln may not always be facing the camera or in the same pose. Current CBIR systems therefore generally make use of lower-level features like texture, color, and shape, although some systems take advantage of very common higher-level features like faces (facial recognition systems). Not every CBIR system is generic. Some systems are designed for a specific domain, e.g. shape matching can be used for finding parts inside a CAD-CAM database.
Other query methods:
Other query methods include browsing for example images, navigating customized/hierarchical categories, querying by image region (rather than the entire image), querying by multiple example images, querying by visual sketch, querying by direct specification of image features, and multimodal queries (e.g. combining touch, voice, etc.). CBIR systems can also make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information.
Content comparison using image distance measures:
The most common method for comparing two images in content-based image retrieval (typically an example image and an image from the database) is to use an image distance measure. An image distance measure compares the similarity of two images in various dimensions such as color, texture, shape, and others. For example, a distance of 0 signifies an exact match with the query, with respect to the dimensions that were considered.
A value greater than 0 indicates a lower degree of similarity between the images: the larger the distance, the less similar they are. Search results can then be sorted by their distance to the query image.
Color:
Distance measures based on color similarity are computed from a color histogram for each image, which records the proportion of pixels within the image holding specific values (which humans perceive as colors). Current research is attempting to segment color proportion by region and by spatial relationship among several color regions. Examining images based on the colors they contain is one of the most widely used techniques because it does not depend on image size or orientation. Color searches will usually involve comparing color histograms, though this is not the only technique in practice.
Texture:
Texture measures look for visual patterns in images and how they are spatially defined. Textures are represented by texels, which are then placed into a number of sets, depending on how many textures are detected in the image. These sets define not only the texture, but also where in the image the texture is located. Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation. The relative brightness of pairs of pixels is computed so that the degree of contrast, regularity, coarseness and directionality may be estimated. However, the difficulty lies in identifying patterns of co-pixel variation and associating them with particular classes of textures such as silky or rough.
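The histogram comparison and distance-based sorting described above can be sketched as follows. This is a minimal illustration rather than the method of any particular system: the function names, the choice of the L1 and histogram-intersection distances, and the four-bin histograms are all assumptions made for the example.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences; 0 means the histograms are identical."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


def intersection_distance(h1, h2):
    """1 minus the histogram intersection, for histograms that each sum to 1."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return 1.0 - sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_distance(query_hist, database, distance=l1_distance):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, distance(query_hist, hist)) for name, hist in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical 4-bin normalised histograms, for illustration only.
database = {
    "sunset.png": [0.70, 0.20, 0.05, 0.05],
    "forest.png": [0.05, 0.15, 0.70, 0.10],
    "rose.png": [0.60, 0.25, 0.10, 0.05],
}
query = [0.60, 0.25, 0.10, 0.05]
ranking = rank_by_distance(query, database)
```

Here `ranking` lists "rose.png" first with distance 0, because its histogram equals the query histogram. Either distance function can be passed through the `distance` parameter.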
Shape:
Shape does not refer to the shape of an image but to the shape of a particular region that is being sought out. Shapes will often be determined by first applying segmentation or edge detection to an image. Other methods use shape filters to identify given shapes in an image. In some cases accurate shape detection will require human intervention, because methods like segmentation are very difficult to automate completely.
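As an illustration of the edge detection step mentioned above, the sketch below applies the Sobel operator to a grey-level image stored as a list of rows. The report does not name a specific edge detector, so the use of Sobel, the function names, and the threshold value are assumptions for the example.

```python
import math


def sobel_magnitude(gray):
    """Gradient magnitude for a 2-D list of grey levels (border pixels stay 0)."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_mask(gray, threshold):
    """True where the gradient magnitude exceeds the threshold."""
    return [[m > threshold for m in row] for row in sobel_magnitude(gray)]


# Hypothetical 5 x 6 image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
magnitude = sobel_magnitude(image)
mask = edge_mask(image, threshold=500)
```

In this toy image the mask is set only in the two interior columns on either side of the dark-to-bright boundary. A shape descriptor would then be computed from the detected boundary, a step this sketch does not cover.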
CBIR Architecture
[Figure: CBIR System]
Sample Query
Suppose a user wants to search for images of roses. The user can either submit an existing picture of a rose as the query or submit their own sketch of a rose. The system extracts image features from the query and compares them with the features of the other images in the database. Relevant results are then displayed to the user.
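The query flow above (extract features from the query, compare them with stored features, return the closest images) can be sketched as a small index class. Everything named here is hypothetical: `ImageIndex`, the toy feature `dominant_channel_fractions`, and the tiny pixel lists stand in for a real feature extractor and a real image database.

```python
import math


def dominant_channel_fractions(pixels):
    """Toy feature: fraction of pixels whose largest channel is R, G or B."""
    counts = [0, 0, 0]
    for pixel in pixels:
        counts[max(range(3), key=lambda c: pixel[c])] += 1
    return [count / len(pixels) for count in counts]


class ImageIndex:
    """Stores one feature vector per image and answers query-by-example searches."""

    def __init__(self, extract_features):
        self.extract_features = extract_features  # callable: image -> list of floats
        self.features = {}

    def add(self, name, image):
        # Features are extracted once, when the image enters the database.
        self.features[name] = self.extract_features(image)

    def query(self, image, top_k=3):
        # The query image goes through the same extractor as the stored images.
        q = self.extract_features(image)
        scored = [(math.dist(q, f), name) for name, f in self.features.items()]
        return [name for _, name in sorted(scored)[:top_k]]


# Hypothetical images given as flat lists of (r, g, b) pixels.
index = ImageIndex(dominant_channel_fractions)
index.add("rose_photo", [(200, 30, 40)] * 6 + [(30, 120, 40)] * 2)
index.add("leaf_photo", [(30, 150, 40)] * 8)
index.add("sky_photo", [(40, 80, 220)] * 8)

sketch = [(230, 20, 30)] * 5 + [(20, 140, 30)] * 3  # the user's rough rose sketch
results = index.query(sketch, top_k=2)
```

Passing the extractor in as a parameter keeps the index independent of the feature used, so colour, texture or shape features could be swapped in without changing the search code.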
Feature Extraction
Primitive features: mean color (RGB), color histogram
Semantic features: color layout, texture, etc.
Domain specific features: face recognition, fingerprint matching, etc.
Mean Colour :
Pixel color information: R, G, B
Mean component (R, G or B) = (sum of that component over all pixels) / (number of pixels)
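The mean colour formula above translates directly into code. The sketch below assumes the image is available as a flat list of (r, g, b) tuples; the function name `mean_color` and the sample pixels are made up for the example.

```python
def mean_color(pixels):
    """Return (mean R, mean G, mean B) for a non-empty list of (r, g, b) pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("at least one pixel is required")
    # For each channel: sum of that component over all pixels / number of pixels.
    return tuple(sum(pixel[c] for pixel in pixels) / n for c in range(3))


# Two red pixels, one blue and one green.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
mean = mean_color(pixels)  # (127.5, 63.75, 63.75)
```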
Colour Histogram:
Frequency count of each individual color
Most commonly used color feature representation
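A frequency count over every individual 24-bit colour would need more than 16 million bins, so practical histograms usually group similar colours first. The sketch below shows one possible way to do this, assuming 8-bit RGB channels and uniform quantisation; the function name and the `bins_per_channel` parameter are illustrative choices, not taken from the report.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalised histogram over quantised RGB colours (8-bit channels assumed)."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin number 0..bins_per_channel-1.
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        hist[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1
    total = len(pixels)
    # Dividing by the pixel count makes images of different sizes comparable.
    return [count / total for count in hist]


# Three red pixels and one blue pixel, with 2 bins per channel (8 bins in total).
histogram = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)], bins_per_channel=2)
```

With two bins per channel, the red pixels fall into bin 4 and the blue pixel into bin 1, so `histogram[4]` is 0.75 and `histogram[1]` is 0.25.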
Colour Layout:
Need for color layout: global color features give too many false positives.
How it works: divide the whole image into sub-blocks and extract features from each sub-block.
Can we go one step further? Divide the image into regions based on color feature concentration. This process is called segmentation.
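The sub-block idea can be sketched as follows, using the mean colour of each block as the per-block feature. The grid size, the function name `block_mean_colors`, and the two toy images are assumptions for the example; a real system might extract a histogram or another feature per block instead.

```python
def block_mean_colors(image, rows, cols):
    """Split a 2-D grid of (r, g, b) pixels into rows x cols blocks and
    return the mean colour of each block in row-major order."""
    height, width = len(image), len(image[0])
    if rows > height or cols > width:
        raise ValueError("more blocks than pixels along one axis")
    layout = []
    for i in range(rows):
        y0, y1 = i * height // rows, (i + 1) * height // rows
        for j in range(cols):
            x0, x1 = j * width // cols, (j + 1) * width // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            layout.append(tuple(sum(p[c] for p in block) / len(block) for c in range(3)))
    return layout


RED, BLUE = (255, 0, 0), (0, 0, 255)
# Two 4 x 4 images with the same global mean colour but different arrangements.
top_bottom = [[RED] * 4, [RED] * 4, [BLUE] * 4, [BLUE] * 4]
left_right = [[RED, RED, BLUE, BLUE] for _ in range(4)]

layout_a = block_mean_colors(top_bottom, 2, 2)
layout_b = block_mean_colors(left_right, 2, 2)
```

Both images are half red and half blue, so a global colour feature cannot tell them apart, while `layout_a` and `layout_b` differ. This is the kind of false positive that a layout feature is meant to reduce.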
Texture:
Texture is an innate property of all surfaces (clouds, trees, bricks, hair, etc.).
It refers to visual patterns of homogeneity and does not result from the presence of a single color.
The most accepted classification of textures is based on psychology studies. The Tamura representation consists of the following features:
Coarseness
Contrast
Directionality
Linelikeness
Regularity
Roughness
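Of the Tamura features listed above, contrast is the simplest to compute. The sketch below follows the definition as it is commonly stated, contrast = sigma / kurtosis^(1/4), where sigma is the standard deviation of the grey levels and the kurtosis is the fourth central moment divided by the squared variance. The report itself gives no formula, so this definition, the function name, and the toy images are assumptions.

```python
import math


def tamura_contrast(gray):
    """Tamura-style contrast of a 2-D list of grey levels (0.0 for a flat image)."""
    values = [v for row in gray for v in row]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        return 0.0  # a perfectly uniform image has no contrast
    fourth_moment = sum((v - mean) ** 4 for v in values) / n
    kurtosis = fourth_moment / variance ** 2
    return math.sqrt(variance) / kurtosis ** 0.25


# Hypothetical 4 x 4 checkerboards: one black/white, one with two close greys.
strong = [[0 if (x + y) % 2 == 0 else 255 for x in range(4)] for y in range(4)]
weak = [[120 if (x + y) % 2 == 0 else 130 for x in range(4)] for y in range(4)]

strong_contrast = tamura_contrast(strong)
weak_contrast = tamura_contrast(weak)
```

The black and white checkerboard scores far higher than the grey one, which matches the intuition that contrast measures how widely the grey levels are spread.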
Existing CBIR Systems
Available CBIR software
Despite the shortcomings of current CBIR technology, several image retrieval systems are now available as commercial packages, with demonstration versions of many others available on the Web. Some of the most prominent of these are described below.
Commercial systems
QBIC : IBM's QBIC system is probably the best-known of all image content retrieval systems. It is available commercially either in standalone form, or as part of other IBM products such as the DB2 Digital Library. It offers retrieval by any combination of colour, texture or shape as well as by text keyword. Image queries can be formulated by selection from a palette, specifying an example query image, or sketching a desired shape on the screen. The system extracts and stores colour, shape and texture features from each image added to the database, and uses R*-tree indexes to improve search efficiency [Faloutsos et al, 1994]. At search time, the system matches appropriate features from query and stored images, calculates a similarity score between the query and each stored image examined, and displays the most similar images on the screen as thumbnails. The latest version of the system incorporates more efficient indexing techniques, an improved user interface, the ability to search grey-level images, and a video storyboarding facility. An online demonstration, together with information on how to download an evaluation copy of the software, is available on the World-Wide Web.
Virage : Another well-known commercial system is the VIR Image Engine from Virage, Inc. This is available as a series of independent modules, which systems developers can build into their own programs. This makes it easy to extend the system by building in new types of query interface, or additional customized modules to process specialized collections of images such as trademarks. Alternatively, the system is available as an add-on to existing database management systems such as Oracle or Informix. An on-line demonstration of the VIR Image Engine can be found at http://www.virage.com/online/. A high-profile application of Virage technology is AltaVista's AV Photo Finder (http://image.altavista.com/cgi-bin/avncgi), allowing Web surfers to search for images by content similarity. Virage technology has also been extended to the management of video data; details of their commercial Videologger product can be found on the Web at http://www.virage.com/market/cataloger.html.
Excalibur : A similar philosophy has been adopted by Excalibur Technologies, a company with a long history of successful database applications, for their Visual RetrievalWare product. This product offers a variety of image indexing and matching techniques based on the company's own proprietary pattern recognition technology. It is marketed principally as an applications development tool rather than as a standalone retrieval package. Its best-known application is probably the Yahoo! Image Surfer, allowing content-based retrieval of images from the World-Wide Web. Further information on Visual RetrievalWare can be found at http://www.excalib.com/, and a demonstration of the Yahoo! Image Surfer at http://isurf.yahoo.com/. Excalibur's product range also includes the video data management system Screening Room (http://www.excalib.com/products/video/screen.html).
Experimental systems
A large number of experimental systems have been developed, mainly by academic institutions, in order to demonstrate the feasibility of new techniques. Many of these are available as demonstration versions on the Web. Some of the best-known are described below.
Photobook : The Photobook system from Massachusetts Institute of Technology (MIT) has proved to be one of the most influential of the early CBIR systems. Like the commercial systems above, it aims to characterize images for retrieval by computing shape, texture and other appropriate features. Unlike these systems, however, it aims to calculate information-preserving features, from which all essential aspects of the original image can in theory be reconstructed. This allows features relevant to a particular type of search to be computed at search time, giving greater flexibility at the expense of speed. The system has been successfully used in a number of applications, involving retrieval of image textures, shapes, and human faces, each using features based on a different model of the image. More recent versions of the system allow users to select the most appropriate feature type for the retrieval problem at hand from a wide range of alternatives [Picard, 1996]. Further information on Photobook, together with an online demonstration, can be found at http://www-white.media.mit.edu/vismod/demos/photobook/. Although Photobook itself never became a commercial product, its face recognition technology has been incorporated into the FaceID package from Viisage Technology (http://www.viisage.com), now in use by several US police departments.
Chabot : Another early system which has received wide publicity is Chabot, which provided a combination of text-based and colour-based access to a collection of digitized photographs held by California's Department of Water Resources. The system has now been renamed Cypress, and incorporated within the Berkeley Digital Library project at the University of California at Berkeley (UCB). A demonstration of the current version of Cypress (which no longer appears to have CBIR capabilities) can be found at http://elib.cs.berkeley.edu/cypress.html. Rather more impressive is UCB's recently-developed Blobworld software, incorporating sophisticated colour region searching facilities (http://elib.cs.berkeley.edu/photos/blobworld/).
VisualSEEk : The VisualSEEk system [Smith and Chang, 1997a] is the first of a whole family of experimental systems developed at Columbia University, New York. It offers searching by image region colour, shape and spatial location, as well as by keyword. Users can build up image queries by specifying areas of defined shape and colour at absolute or relative locations within the image. The WebSEEk system [Smith and Chang, 1997b] aims to facilitate image searching on the Web. Web images are identified and indexed by an autonomous agent, which assigns them to an appropriate subject category according to associated text. Colour histograms are also computed from each image. At search time, users are invited to select categories of interest; the system then displays a selection of images within this category, which users can then search by colour similarity. Relevance feedback facilities are also provided for search refinement. For a demonstration of WebSEEk in action, see http://disney.ctr.columbia.edu/WebSEEk/. Further prototypes from this group include VideoQ, a video search engine allowing users to specify motion queries, and MetaSEEk, a meta-search engine for images on the Web.
MARS : The MARS project at the University of is aimed at developing image retrieval systems which put the user firmly in the driving seat. Relevance feedback is thus an integral part of the system, as this is felt to be the only way at present of capturing individual human similarity judgements. The system characterizes each object within an image by a variety of features, and uses a range of different similarity measures to compare query and stored objects. User feedback is then used to adjust feature weights, and if necessary to invoke different similarity measures.
Informedia : In contrast to the systems described above, the Informedia project was conceived as a multimedia video-based project from the outset. Its overall aims are to allow full content search and retrieval of video by integrating speech and image processing. The system performs a number of functions. It identifies video scenes (not just shots) from analysis of colour histograms, motion vectors, speech and audio soundtracks, and then automatically indexes these video paragraphs according to significant words detected from the soundtrack, text from images and captions, and objects detected within the video clips. A query is typically submitted as speech input. Thumbnails of keyframes are then displayed with the option to show a sentence describing the content of each shot, extracted from spoken dialogue or captions, or to play back the shot itself. Many of the system's strengths stem from its extensive evaluation with a range of different user populations. Its potential applications include TV news archiving, sports, entertainment and other consumer videos, and education and training. The Informedia website is at http://informedia.cs.cmu.edu/; the Mediakey Digital Video Library System from Islip Media, Inc, a commercially-available system based on Informedia technology, is at http://www.islip.com/fprod.htm.
Surfimage : An example of European CBIR technology is the Surfimage system from INRIA, France. This has a similar philosophy to the MARS system, using multiple types of image feature which can be combined in different ways, and offering sophisticated relevance feedback facilities. See http://www-syntim.inria.fr/htbin/syntim/surfimage/surfimage.cgi for a demonstration of Surfimage in action.
Netra : The Netra system uses colour, texture, shape and spatial location information to provide region-based searching based on local image properties. An interesting feature is its use of sophisticated image segmentation techniques. A Web demonstration of Netra is available at http://vivaldi.ece.ucsb.edu/Netra.
Synapse : This system is an implementation of retrieval by appearance using whole image matching. A demonstration of Synapse in action with a variety of different image types can be found at http://cowarie.cs.umass.edu/~demo/.
Problem Statement With Proposed Solution
Problem Statement :
Implementing CBIR for the image retrieval problem, to make the retrieval process more efficient.
Proposed Solution :
There is a need for CBIR. Image retrieval does not entail solving the general image understanding problem. It may be sufficient for a retrieval system to present similar images, where similarity is defined in some user-defined sense.
Interaction
The need for a database
The semantic gap
Conclusion
This report presented an approach to contextualizing image queries, which is able to represent complex semantic concepts effectively by means of the notion of image context. Although the approach is simple, it is effective and requires neither a prior classification of the image database nor the analysis of surrounding text (e.g. the image caption or the text of the Web page that includes the image), which might not always be available with an image. Furthermore, the approach easily complements available relevance feedback techniques, providing a good starting point for interactive searches, and helps increase both the effectiveness and the efficiency of further rounds of retrieval.
References
Aigrain, P et al (1996) Content-based representation and retrieval of visual media: a state-of-the-art review. Multimedia Tools and Applications 3(3), 179-202
Alsuth, P et al (1998) On video retrieval: content analysis by ImageMiner. In Storage and Retrieval for Image and Video Databases VI, Proc SPIE 3312, 236-247
Androutsas, D et al (1998) Image retrieval using directional detail histograms. In Storage and Retrieval for Image and Video Databases VI, Proc SPIE 3312, 129-137
Ardizzone, E and La Cascia, M (1997) Automatic video database indexing and retrieval. Multimedia Tools and Applications 4, 29-56
Dr. Fuhui Long, Dr. Hongjiang Zhang and Prof. David Dagan Feng. Content Based Image Retrieval.
Dr. Fuhui Long, Dr. Hongjiang Zhang and Prof. David Dagan Feng. An Effective Region-Based Image Retrieval Framework.
Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas. The Earth Mover's Distance as a Metric for Image Retrieval.