
Multimedia Data Mining in Digital Libraries: Standards and Features

Sanjeevkumar R. Jadhav* and Praveenkumar Kumbargoudar*

Abstract
A digital library collects, stores, preserves and retrieves digital data. To do so, it must
convert information held in different formats, such as text, images, video and audio, into digital form.
Data mining techniques are commonly used during the conversion of multimedia files in libraries. This paper
defines the term data mining and covers the features and standards used in multimedia data mining. It also
explains the architecture of data mining, which comprises six stages: (1) domain
understanding; (2) data selection; (3) cleaning and preprocessing; (4) discovering patterns; (5) interpretation;
and (6) reporting and using discovered knowledge. It emphasizes the need to develop multimedia
data mining techniques and standards in libraries for the conversion of multimedia information.

1. INTRODUCTION

Over the past few decades, rapid changes in information technology have drastically
changed the functions and activities of libraries. Information and communication
technology has created a new work culture, new forms of information storage, and new
means of communicating and disseminating information. The advent of electronic
resources and their increased use in libraries has brought about significant changes in the storage
and communication of information.
As a result, conventional libraries are transforming into digital libraries.
The majority of libraries have already been computerized and are digitizing their printed collections. In
India, the process of digitization is slow compared to developed countries, because
only 21% of the Indian population is computer literate and only 14% uses the
Internet. With advances in digitization, many libraries
are converting their printed materials into digital form.
A fully developed digital library environment involves the following elements [1]:
1. Initial conversion of content from physical to digital form.
2. The extraction or creation of metadata or indexing information describing the content
to facilitate searching and discovery, as well as administrative and structural metadata
to assist in object viewing, management and preservation.
3. Storage of digital content and metadata in an appropriate multimedia repository. The
repository will include rights management capabilities to enforce intellectual property
rights, if required. E-commerce functionality may also be present if needed to handle
accounting and billing.
4. Client services for the browser, including repository querying and workflow.
5. Content delivery via file transfer or streaming media.
6. Patron access through a browser or dedicated client.
7. A private or public network.

* Gulbarga University, GULBARGA: 585 106. Karnataka. E-Mail: kumbargoudar@rediffmail.com

2. DIGITIZATION AND DATA MINING

Digitization refers to the conversion of an item, be it printed text, manuscript, image,
sound, film or video recording, from one format (usually print or analogue) into digital form.
The process basically involves taking a physical object and making an ‘electronic
photograph’ of it. An image of the physical object is captured using a scanner or digital
camera and converted to a digital format that can be stored electronically and accessed via a
computer [2].
Data and information are available in different formats, including text, images,
video, audio, pictures and maps. In the case of text,
the printed pages need to be scanned and provided with
links for access. In the case of multimedia formats such as images, audio, pictures, maps
and video, however, conversion and systematic presentation are not easy. Automatic search is also
needed to make the material easily accessible. Easy search and effective, systematic
presentation of the data are essential for multimedia information. For this purpose,
libraries need to adopt data mining techniques. These techniques draw mainly
on logic, multimedia and artificial intelligence.
Data mining is the automatic extraction of patterns of information from historical
data, enabling companies to focus on the most important aspects of their business by telling
them what they did not know and had not even thought of asking [3]. It has also been defined as
“the process of automating information discovery” [4], which improves decision making and
gives a company advantages in the market. Another definition is that it is “the exploration
and analysis, by automatic or semiautomatic means, of large quantities of data in order to
discover meaningful patterns and rules” [5]. Data mining is an applied discipline that grew
out of statistical pattern recognition, machine learning and artificial intelligence,
coupled with business decision making in order to optimize and enhance it. Initially, data mining
techniques were applied to structured data from databases.
Recently, two branches of data mining, text data mining and Web data mining, have
emerged [6, 7]. They have their own research agendas, communities of researchers, and
supporting companies that develop technologies and tools. Multimedia
data mining, by contrast, is still at an early stage, and further development is needed to present
multimedia information effectively.
There are four types of multimedia data: audio data, which includes sound, speech,
and music; image data (black-and-white and colour images); video data, which includes
time-aligned sequences of images; and electronic or digital ink, which consists of sequences of
time-aligned 2D or 3D coordinates of a stylus, a light pen, data glove sensors, or a similar device.
All of this data is generated by specific kinds of sensors.
Mining concepts from multimedia is also referred to as automatic annotation or
annotation mining. There appear to be three main pattern discovery approaches that have
been used for automatic annotation in multimedia data mining. These approaches differ primarily
in how external knowledge is provided to mine concepts. The first approach
assigns keywords to the data or classifies it. The second approach
works through clustering: multimedia documents are clustered first, and
the resulting clusters are then assigned keywords by an annotator. The third approach does not rely
on a manual annotator; it tries to mine concepts from contextual information.
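The second approach above (cluster first, then let an annotator name each cluster) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two-dimensional feature vectors, the choice of plain k-means, and the keywords "sky" and "forest" are all hypothetical and do not come from the cited work.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, centroids, iterations=10):
    """Plain k-means; returns the cluster index assigned to each point."""
    centroids = [list(c) for c in centroids]
    labels = [nearest(p, centroids) for p in points]
    for _ in range(iterations):
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        labels = [nearest(p, centroids) for p in points]
    return labels

# Hypothetical 2-D feature vectors (say, mean brightness and edge density) for six images.
features = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]

# Step 1: cluster the documents without any labels.
labels = kmeans(features, centroids=[features[0], features[3]])

# Step 2: a human annotator names each cluster once; every member inherits the keyword.
cluster_keywords = {0: "sky", 1: "forest"}  # assumed annotator input
annotations = [cluster_keywords[lab] for lab in labels]
print(annotations)
```

The annotator labels two clusters rather than six documents, which is the practical appeal of this approach for large collections.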

Multimedia Data Mining (MDM) is a part of multimedia technology, which
covers the following areas [8]:
- Media compression and storage.
- Delivering streaming media over networks with required quality of service.
- Media restoration, transformation, and editing.
- Media indexing, summarization, search, and retrieval.
- Creating interactive multimedia systems for learning/training and creative art production.
- Creating multimodal user interfaces.

3. MULTIMEDIA DATA MINING ARCHITECTURE

The data mining process consists of several stages, which are interrelated
and interactive. The main stages of the data mining process are (1) domain
understanding; (2) data selection; (3) cleaning and preprocessing; (4) discovering patterns; (5)
interpretation; and (6) reporting and using discovered knowledge. The domain understanding
stage requires learning how the results of data mining will be used, so as to gather all relevant
prior knowledge before mining [9].

Figure: Multimedia Data Mining Architecture

The data selection stage requires the user to target a database or select a subset of
fields or data records to be used for data mining. Proper domain understanding at this stage
helps in the identification of useful data. This is the most time-consuming stage of the entire
data mining process for business applications, because data are never clean or in a form suitable
for data mining. For multimedia data mining, this stage is generally not an issue, because the
data are not in relational form and there are no subsets of fields to choose from.
The next stage in a typical data mining process is the preprocessing step, which involves
integrating data from different sources and making choices about representing or coding
certain data fields that serve as inputs to the pattern discovery stage. Such representation
choices are needed because certain fields may contain data at a level of detail not considered
suitable for the pattern discovery stage. The preprocessing stage is of considerable
importance in multimedia data mining, given the unstructured nature of multimedia data.
The pattern discovery stage is the heart of the entire data mining process. It is the
stage where the hidden patterns and trends in the data are actually uncovered. There are
several approaches to the pattern discovery stage. These include association, classification,
clustering, regression, time-series analysis and visualization. Each of these approaches can
be implemented through one of several competing methodologies, such as statistical data
analysis, machine learning, neural networks and pattern recognition. It is because of the use
of methodologies from several disciplines that data mining is often viewed as a
multidisciplinary field.
The interpretation stage of the data mining process is used to evaluate the quality and
value of what has been discovered, and so to determine whether a previous stage should be revisited.
Proper domain understanding is crucial at this stage to put a value on discovered patterns.
The final stage of the data mining process consists of reporting and putting to use the
discovered knowledge to generate new actions, products, services or marketing
strategies, as the case may be.
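As a rough sketch of how these stages chain together, the Python fragment below runs selection, preprocessing, pattern discovery and interpretation over a toy catalogue. Everything in it is an assumption made for illustration: the record layout, the tag-counting "pattern", and the rule that an empty result sends the process back to discovery with a lower support threshold. Domain understanding is represented only by the parameters passed to `run`.

```python
def select(records, wanted_type):
    """Data selection: keep only the records relevant to the task."""
    return [r for r in records if r["type"] == wanted_type]

def preprocess(records):
    """Cleaning and preprocessing: drop untagged records, normalise tag case."""
    return [{**r, "tags": sorted(t.lower() for t in r["tags"])}
            for r in records if r["tags"]]

def discover(records, min_support):
    """Pattern discovery: tags that occur in at least min_support records."""
    counts = {}
    for r in records:
        for tag in set(r["tags"]):
            counts[tag] = counts.get(tag, 0) + 1
    return {tag: n for tag, n in counts.items() if n >= min_support}

def run(records, wanted_type="image", min_support=3):
    data = preprocess(select(records, wanted_type))
    while min_support > 0:
        patterns = discover(data, min_support)
        if patterns:          # interpretation: accept any non-empty result
            return patterns   # reporting: hand the patterns to the caller
        min_support -= 1      # otherwise revisit the discovery stage
    return {}

# Hypothetical catalogue records, invented for this example.
catalogue = [
    {"type": "image", "tags": ["Temple", "Stone"]},
    {"type": "image", "tags": ["temple", "River"]},
    {"type": "audio", "tags": ["speech"]},
    {"type": "image", "tags": []},
]
result = run(catalogue)
print(result)
```

The loop in `run` is the simplest possible stand-in for the feedback between interpretation and earlier stages; a real project would revisit selection or preprocessing as well.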
According to Myatt [10], any exploratory data mining project should include the
following steps:
1. Problem Definition: The problem to be solved, along with the projected deliverables
(information products), should be clearly defined, an appropriate team should be put
together, and a plan generated for executing the analysis.
2. Data Preparation: Prior to starting any data analysis or data mining project, the data
should be collected, characterized, cleaned, transformed, and partitioned into an
appropriate form for further processing.
3. Implementation of the Analysis: On the basis of the information from steps 1 and 2,
appropriate analysis techniques should be selected; often these methods need to be
optimized.
4. Deployment of Results: The results from step 3 should be communicated and/or
deployed into a pre-existing process.

4. FEATURES AND STANDARDS FOR MULTIMEDIA DATA MINING

Different image attributes, such as colour, edges, shape, and texture, are
used to extract features for mining. Feature extraction based on these attributes may be
performed at the global or local level. For example, colour histograms may be used as
features to characterize the distribution of colour in an image. Similarly, the shape of
a segmented region may be represented as a feature vector of Fourier descriptors to capture
the global shape property of the region, or a shape could be described in terms of
salient points or segments to provide localized descriptions. Global descriptors are generally
easy to compute, provide a compact representation, and are less prone to segmentation errors.
However, such descriptors may fail to uncover subtle patterns or changes in shape, because
global descriptors tend to integrate the underlying information. Local descriptors, on the
other hand, tend to generate a more elaborate representation and can yield useful results
even when part of the underlying attribute is missing, for example when the shape of a region
is occluded. In the case of video, additional attributes resulting from object and camera motion
are used.
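The trade-off between global and local descriptors can be seen in a small Python sketch. It assumes a toy image given as rows of RGB tuples and a very coarse quantization (two levels per channel); the function names and the left/right split are illustrative choices, not part of any standard.

```python
def colour_histogram(pixels, levels=2):
    """Normalised histogram over levels**3 quantised RGB bins."""
    hist = [0] * levels ** 3
    for r, g, b in pixels:
        # Map each 0-255 channel to a level index, then combine into one bin index.
        ri, gi, bi = (c * levels // 256 for c in (r, g, b))
        hist[(ri * levels + gi) * levels + bi] += 1
    return [count / len(pixels) for count in hist]

def flatten(image):
    """All pixels of the image as one list (used for the global descriptor)."""
    return [p for row in image for p in row]

def local_histograms(image, levels=2):
    """One histogram per image half (left, right): a crude local descriptor."""
    mid = len(image[0]) // 2
    left = [p for row in image for p in row[:mid]]
    right = [p for row in image for p in row[mid:]]
    return colour_histogram(left, levels), colour_histogram(right, levels)

RED, BLUE = (255, 0, 0), (0, 0, 255)
image_a = [[RED, RED, BLUE, BLUE]] * 2  # red on the left, blue on the right
image_b = [[BLUE, BLUE, RED, RED]] * 2  # mirrored layout

same_global = colour_histogram(flatten(image_a)) == colour_histogram(flatten(image_b))
same_local = local_histograms(image_a) == local_histograms(image_b)
print(same_global, same_local)
```

The two images share the same global histogram, so a global descriptor cannot tell them apart, while the per-region histograms differ. This is the sense in which global descriptors integrate the underlying information.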
In the case of audio, both temporal and spectral domain features have been
employed. Examples of the features used include short-time energy, pause rate,
zero-crossing rate, normalized harmonicity, fundamental frequency, frequency spectrum,
bandwidth, spectral centroid, spectral roll-off frequency and band energy ratio. Many
researchers have found the cepstral-based features, Mel-Frequency Cepstral Coefficients
(MFCC) and Linear Predictive Coefficients (LPC), very useful, especially in mining tasks
involving speech recognition.
The MPEG-7 standard provides a good representative set of
features for multimedia data. The features are referred to as descriptors in MPEG-7. The
MPEG-7 Visual description tools describe visual data such as images and videos, while the
Audio description tools account for audio data. The MPEG-7 visual description defines the
following main features for colour attributes: Color Layout Descriptor, Color Structure
Descriptor, Dominant Color Descriptor and Scalable Color Descriptor. The Color Layout
Descriptor is a compact and resolution-invariant descriptor that is defined in the YCbCr colour
space to capture the spatial distribution of colour over major image regions. The Color
Structure Descriptor captures both colour content and information about its spatial
arrangement using a structuring element that is moved over the image. The Dominant Color
Descriptor characterizes an image or an arbitrarily shaped region by a small number of
representative colours. The Scalable Color Descriptor is a colour histogram in the HSV colour
space encoded by a Haar transform to yield a scalable representation. While the above
features are defined with respect to an image or a part of it, the Group of Frames-Group
of Pictures Color feature (GoFGoPColor) describes the colour histogram aggregated over multiple
frames of a video [9].
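To show why a Haar-encoded histogram is "scalable", the sketch below applies an unnormalised Haar transform (pairwise sums and differences, repeated on the sums) to a hypothetical 8-bin histogram. It illustrates the general idea only: the actual MPEG-7 descriptor prescribes its own HSV quantization and coefficient coding, none of which is reproduced here.

```python
def haar_step(values):
    """One Haar level: pairwise sums (coarse part) and pairwise differences (detail)."""
    sums = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    diffs = [values[i] - values[i + 1] for i in range(0, len(values), 2)]
    return sums, diffs

def haar_encode(hist):
    """Unnormalised Haar transform of a histogram whose length is a power of two."""
    details = []
    coarse = list(hist)
    while len(coarse) > 1:
        coarse, diffs = haar_step(coarse)
        details = diffs + details  # coarser-level details are placed first
    return coarse + details

hist = [4, 2, 5, 5, 0, 2, 3, 1]  # hypothetical 8-bin colour histogram
coeffs = haar_encode(hist)
print(coeffs)

# Keeping only the first two coefficients still recovers a 2-bin histogram.
total, diff = coeffs[0], coeffs[1]
two_bin = [(total + diff) // 2, (total - diff) // 2]
print(two_bin)
```

Here `coeffs` is `[22, 10, -4, -2, 2, 0, -2, 2]`, and the first two coefficients alone give `[16, 6]`, the counts of the lower and upper halves of the original histogram. Truncating the coefficient list therefore yields a coarser but still valid histogram, which is what makes the representation scalable.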
MPEG-7 provides two main shape descriptors; others are based on these and
additional semantic information. The Region Shape Descriptor describes the shape of a
region using the Angular Radial Transform (ART). The description is provided in terms of 40
coefficients and is suitable for complex objects consisting of multiple disconnected regions
and for simple objects with or without holes. The Contour Shape Descriptor describes the
shape of an object based on its outline. The descriptor uses the curvature scale space
representation of the contour.
The motion descriptors in MPEG-7 are defined to cover a broad range of applications.
The Motion Activity Descriptor captures the intuitive notion of intensity or pace of action in a
video clip. The descriptor provides information on the intensity, direction, and spatial and
temporal distribution of activity in a video segment. The spatial distribution of activity
indicates whether the activity is spatially limited or not. Similarly, the temporal distribution
of activity indicates how the level of activity varies over the entire segment. The Camera
Motion Descriptor specifies the camera motion types and their quantitative characterization
over the entire video segment. The Motion Trajectory Descriptor describes the motion trajectory
of a moving object based on the spatiotemporal localization of trajectory points. The description
provided is at a fairly high level, as each moving object is indicated by one representative
point at any time instant. The Parametric Motion Descriptor describes both global and
object motion in a video segment by describing the evolution of arbitrarily shaped regions
over time using a two-dimensional geometric transform.
The MPEG-7 Audio standard defines two sets of audio descriptors. The first set consists of
low-level features, which are meant for a wide range of applications. The descriptors in this
set include Silence, Power, Spectrum, and Harmonicity. The Silence Descriptor simply
indicates that there is no significant sound in the audio segment. The Power Descriptor
measures temporally smoothed instantaneous signal power. The Spectrum Descriptor
captures properties such as the audio spectrum envelope, spectrum centroid, spectrum spread,
spectrum flatness, and fundamental frequency. The second set of audio descriptors consists of
high-level features, which are meant for specific applications. The features in this set include
Audio Signature, Timbre, and Melody. The Signature Descriptor is designed to generate a
unique identifier for identifying audio content. The Timbre Descriptor captures perceptual
features of instrument sound. The Melody Descriptor captures monophonic melodic
information and is useful for matching melodies. In addition, the high-level descriptors in
MPEG-7 Audio include descriptors for automatic speech recognition, sound classification
and indexing.
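Three of the low-level audio features mentioned in this section (short-time energy, zero-crossing rate and spectral centroid) are simple enough to compute directly. The sketch below uses common textbook definitions and a naive discrete Fourier transform; it is not the normative MPEG-7 extraction procedure, and the sample rate, frame length and test tone are assumptions chosen for the example.

```python
import cmath
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    bins = range(n // 2 + 1)
    mags = [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) for k in bins]
    freqs = [k * sample_rate / n for k in bins]
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)

sample_rate = 8000  # Hz, assumed for this example
# One 64-sample frame of a pure 1 kHz tone (exactly eight periods).
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]

energy = short_time_energy(frame)
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(frame, sample_rate)
print(energy, zcr, centroid)
```

For this pure tone the energy is close to 0.5 and the centroid close to 1000 Hz, as expected for a unit-amplitude sinusoid. On real recordings these values are computed frame by frame and the resulting sequences serve as the feature vectors for mining.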

5. MULTIMEDIA DATA MINING IN DIGITAL LIBRARIES

Quan Liu [11] observed: ‘Standards and guidelines associated with library
digitization practices vary from project to project. Over the years, university, public, school,
and special libraries have adopted their own policies with regard to digitization. Some older
standards, as well as more recent ones, are widely accepted and practiced library digitization
projects. Metadata standards and image quality standards and guidelines are commonly
sought when planning digitization projects… Common metadata standards used to date are
Dublin Core, RDF, EAD, TEI, and SGML and its descendents XML and HTML. The MARC
standard has been used as the standard interchange format in representing catalog records
electronically’.
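As an illustration of one of the metadata standards named in the quotation, the Python sketch below builds a small Dublin Core description of a digitized audio item using the standard library's `xml.etree.ElementTree`. The `record` wrapper element, the helper function name and all field values are hypothetical; only the Dublin Core element names and namespace URI are taken from the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

def dublin_core_record(fields):
    """Build a <record> element with one dc:* child per (element, value) pair."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record

# Hypothetical digitised item; every value below is invented for illustration.
record = dublin_core_record([
    ("title", "Convocation address, 1985"),
    ("type", "Sound"),
    ("format", "audio/mpeg"),
    ("language", "kn"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Descriptors extracted by data mining (for example a dominant colour or a spoken-language label) could be stored alongside such a record, so that the mined features become searchable through the library's existing metadata workflow.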
In India, only a few university and college libraries have
started digitization; the majority are yet to begin
digitizing and converting their collections. Further,
experts in library and information science have, to a large extent, provided guidelines only
for the conversion of text documents. Hence, there is a need to understand the standards and
processes of data mining, and the storage of multimedia data through data mining techniques.

6. CONCLUSION

Multimedia data mining is an active and growing area of research.
Digital library projects need multimedia data mining for the conversion and
preservation of multimedia information, and libraries need a data mining strategy for
converting multimedia files. Digital libraries, which are to a large extent
accessible through the web, must present multimedia information effectively if they are to
serve their purpose. To that end, a data mining strategy should be formed
that takes into account standards, features and available techniques.

REFERENCES

1. Sinha, Manojkumar and Others: Digital Library Initiatives in India for Open Access: An
Overview. 4th International CALIBER 2006. Gulbarga: Gulbarga University, 2-4 February
2006. P. 149-164.
2. Parekh, Harsha and Sen, Bharati: Introduction to Digitization: a Librarian’s Guide. Mumbai:
SHPT School of Library Science, SNDT Women’s University, 2001. P. 8.
3. Bharihoke, Deepak: Fundamentals of Information Technology. 3rd Ed. New Delhi: Excel
Books, 2005.
4. Groth, R: Data Mining: A Hands On Approach for Business Professionals. Upper Saddle
River, New Jersey: Prentice Hall, 1998.
5. Berry, MJA and Linoff, G: Data Mining Techniques for Marketing, Sales and Customer
Support. New York: Wiley Computer Publishing, 1997.
6. Agosta, L: The future of data mining- Predictive Analytics. IT View Report, Forrester
Research, http://www.forrester.com/Research/LegacyIT/0,7208,32896,00.html accessed on
30th April 2007.
7. Berry, MW, Ed: Survey of Text Mining: Clustering, Classification and Retrieval. New York:
Springer-Verlag, 2004.
8. Petrushin, Valery A: Introduction into Multimedia Data Mining and Knowledge Discovery.
IN Multimedia Data Mining and Knowledge Discovery. Edited by Valery A Petrushin and
Latifur Khan. London: Springer-Verlag, 2007. P. 3-13.
9. Patel, Nilesh and Sethi, Ishwar: Multimedia Data Mining: An Overview. IN Multimedia Data
Mining and Knowledge Discovery. Edited by Valery A Petrushin and Latifur Khan. London:
Springer-Verlag, 2007. P. 14-40.
10. Myatt, Glenn J: Making Sense of Data: A Practical Guide to Exploratory Data Analysis and
Data Mining. New Jersey: John Wiley & Sons, 2007.
11. Quan Liu, Yan: Best practices, standards and techniques for digitizing library materials: a
snapshot of library digitization practices in the USA. Online Information Review. Vol. 28.
No. 5. 2004. P. 338-345.
