Search Engines
Giuseppe Becchi, Marco Bertini,
Lorenzo Cioni, Alberto Del Bimbo,
Andrea Ferracani, Daniele Pezzatini
Mathias Lux
Klagenfurt University - ITEC
Klagenfurt, Austria
mlux@itec.aau.at
[name.surname]@unifi.it
ABSTRACT
In this paper we present Loki+Lire, a framework for the
creation of web-based interfaces for search, annotation and
presentation of multimedia data. The framework provides
tools to ingest, transcode, present, annotate and index different types of media such as images, videos, audio files and
textual documents. The front-end is compliant with the
latest HTML5 standards, while the back-end allows system
administrators to create processing pipelines that can be
adapted for different tasks and purposes.
The system has been developed in a modular way, aiming
at creating loosely coupled components, letting developers
use it as a whole or select only the parts that are needed
to develop their own tools and systems.
General Terms
Algorithms, Design, Experimentation
Keywords
Semantic multimedia annotation, retrieval, content-based
multimedia retrieval, open source software
1. INTRODUCTION
Two important trends in multimedia production and consumption
in recent years are i) the fact that anyone, from end users to
professionals, has become a creator of every type of digital
data, producing media documents that may range from textual
documents (e.g. presentations) to images, audio and videos;
ii) the fact that these media are consumed, distributed and
presented through the web. Under these circumstances it is
necessary to develop systems and components that are capable
of handling diverse media, accounting for their different
presentation and for how users interact with them. From the
point of view of content managers, it is necessary to create
different processing, annotation and indexing pipelines that
provide different types of services, such as keyword-based
retrieval or content-based multimedia retrieval. The system
presented in this paper caters for all these needs: the media
presentation components, tailored for each type of media,
allow users to browse, search and annotate, while the back-end
components allow the creation of processing pipelines that
include ingestion, transcoding and indexing.
The framework can be used for different purposes and by
different users, such as:
Researchers who need an interface to demo their own
annotation systems, which can be added as processes in
the back-end;
Corresponding author
Authors listed in alphabetical order
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
MM'14, November 3-7, 2014, Orlando, Florida, USA.
Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.
2. THE SYSTEM
The system has been designed using a loosely coupled approach
for all its components, which have been developed as
plugins for well-established frameworks like jQuery or tools
like Solr/Lucene. This approach lets users deploy it as
a whole or select only the components, either from the
front-end or the back-end, that are needed. In particular,
all the interface components of Loki are fully scriptable and
Figure 1: Cross-media search interface built using
the Loki+Lire framework.
2.1 The Front-End
The front-end components have been developed in full
compliance with the HTML5 standard, using HTML5 and CSS3 for
the presentation and JavaScript for the business logic. Two
JavaScript frameworks have been used to create snappier
and more interactive widgets: AngularJS and jQuery, together
with some extensions of the latter, such as jQuery UI. In
particular, AngularJS made it possible to develop the
interface components following the Model-View-Controller
(MVC) design pattern. In MVC terms, the View corresponds to
the HTML code, while the operations on the model, which
interact with the presentation and are part of the
Controller, are written in JavaScript.
The media components, developed as jQuery plugins (thus
allowing their integration in other web-based systems that
use jQuery), let users visualize media, annotate them (this
functionality can be activated depending on the access level
of a user) and search within the media being shown. These
widgets interact with the back-end using SOAP web services.
All the components have a large number of properties that
allow their functionalities and appearance to be customized.
2.1.1 Video
Figure 2: Video component: video with overlaid annotations
and frames visually similar to the one currently shown
(left); search within the video player: hovering on a result
shows the corresponding keyframe in the player (right).
2.1.2 Audio
The audio component (Fig. 3) is similar to the video
component, and has the same properties to activate the
functionalities required by the application in which it is
embedded. The main difference is that it shows the audio
waveform, which can be used as a cue when browsing and
annotating the audio file. Since it is not yet possible to
compute this waveform in real time using JavaScript, it is
automatically computed by a back-end service during the
ingestion and transcoding of the audio files.
Figure 3: Audio component: audio file with overlaid
annotations (left); search within the audio player (right).
2.1.3 Image
The image component (Fig. 4) differs from the audio and
video components in that annotations are not shown as
overlays and there is no need to perform searches within the
media itself. On the other hand, it allows users to annotate
the whole image or just a portion of it. Hovering over
localized annotations shows the related bounding boxes.
2.1.4 Document
widget main view and allow document navigation. Thanks
to SVG it is possible to easily zoom in and out of the page,
or move it while it is zoomed. Page thumbnails are also used
when showing the results of the intra-document search, which
thus behaves similarly to video search. User annotations
can be localized within specific portions of the page, thus
behaving similarly to the image component. Furthermore,
the component allows collaboration: users can choose to
visualise only their own annotations or those of other users.
Each author's annotation is visualised with its own color and
shows the nickname of the user who added it.
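The collaboration behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the framework's own JavaScript code: the `Annotation` record, the `visible_annotations` and `author_colors` helpers and the color palette are hypothetical names introduced here, and the paper does not say how colors are actually assigned.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical palette; the real component may pick colors differently.
PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3"]


@dataclass(frozen=True)
class Annotation:
    author: str  # nickname of the user who added the annotation
    text: str
    page: int
    # (x, y, width, height) of the annotated portion of the page,
    # or None when the annotation refers to the whole page.
    region: Optional[Tuple[float, float, float, float]] = None


def visible_annotations(annotations: List[Annotation],
                        current_user: str,
                        mode: str = "all") -> List[Annotation]:
    """Return the annotations to display: 'mine', 'others' or 'all'."""
    if mode == "mine":
        return [a for a in annotations if a.author == current_user]
    if mode == "others":
        return [a for a in annotations if a.author != current_user]
    if mode == "all":
        return list(annotations)
    raise ValueError(f"unknown mode: {mode}")


def author_colors(annotations: List[Annotation]) -> Dict[str, str]:
    """Give each author one color, in order of first appearance."""
    colors: Dict[str, str] = {}
    for a in annotations:
        if a.author not in colors:
            colors[a.author] = PALETTE[len(colors) % len(PALETTE)]
    return colors
```

With this model, a localized annotation is simply one whose `region` is set, which mirrors the distinction between whole-image and bounding-box annotations in the image component.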
2.1.5
A full-fledged cross-media search engine (Fig. 1) is
provided, so that it can be used as an application in itself
or as a starting point for developers who want to start using
the framework. This application uses all the components
described above and the back-end components and
functionalities described in the next section.
The system starts by letting users search with a keyword
or by uploading media from their PC (Fig. 6). Users who log
in to the system can create collections by selecting the
search results that are most interesting to them. Results can
be filtered using two facets, media type and concepts
(Fig. 7), and are clustered based on their similarity to
provide a more diverse and compact result set (Fig. 8).
Similarity search is activated by dragging and dropping a
result item into the search area; if no content-based
similarity is possible (e.g. between an image and an audio
file), then keyword similarity is used (e.g. using image tags
and audio tags).
Figure 7: Search facets: concepts (left); media type
(right).
Figure 8: Results cluster: visualization of a closed
cluster of images/keyframes (left); visualization of
the content of the cluster: it is possible to inspect
and interact with each element of the cluster (right).
2.2 The Back-End
tions; iv) scripts to ingest and transcode media to the
formats that can be handled by the HTML5 presentation
components that are part of the front-end. These components
have been developed in PHP and Java.
2.2.1 Lire: a Solr plugin for high performance CBIR
This component is used to index images and keyframes,
providing CBIR functionality to the framework. It is also
used to create clusters of images and keyframes that are
visually similar, thus improving the diversity of retrieved
results both when searching with keywords and by similarity.
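One simple way to group visually similar results is sketched below. The paper does not state which clustering method is used, so the greedy threshold-based grouping, the L1 distance and the names `l1_distance` and `cluster_by_similarity` are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_by_similarity(features: Dict[str, Sequence[float]],
                          threshold: float) -> List[List[str]]:
    """Greedy single-pass clustering (an assumed stand-in technique).

    Each item joins the first cluster whose representative (its first
    member) lies within `threshold`; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for item_id, vector in features.items():
        for cluster in clusters:
            representative = features[cluster[0]]
            if l1_distance(vector, representative) <= threshold:
                cluster.append(item_id)
                break
        else:
            # No existing cluster is close enough.
            clusters.append([item_id])
    return clusters
```

Showing only the representative of each cluster would correspond to the closed view of a result cluster, while expanding it lists every member.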
While Solr is a well-known and well-performing text search
engine, the Lire plugin extends it by i) piggy-backing the
content-based image features onto indexed text documents,
and ii) hashing the features so that the inverted
index search can be used for sub-linear retrieval. Basically,
the hash values are used to identify a set of n l candidate
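The two ideas above, hash terms stored alongside a document and an inverted-index lookup that yields a small candidate set, can be sketched in a few lines of Python. This is not the Lire plugin's code: sign-of-random-projection hashing is swapped in as a stand-in for whatever hash function the plugin uses, the final re-ranking of candidates by exact feature distance is an assumption, and `HashIndex`, `hash_terms` and all parameters are hypothetical names.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def make_planes(dim: int, num_hashes: int, bits: int, seed: int = 0):
    """Random hyperplanes: `num_hashes` groups of `bits` planes each."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(bits)] for _ in range(num_hashes)]


def hash_terms(vector: Sequence[float], planes) -> List[str]:
    """Turn a feature vector into text-like terms, one per hash group."""
    terms = []
    for i, group in enumerate(planes):
        code = 0
        for plane in group:
            dot = sum(p * v for p, v in zip(plane, vector))
            code = (code << 1) | (1 if dot >= 0 else 0)
        terms.append(f"h{i}_{code}")
    return terms


class HashIndex:
    def __init__(self, dim: int, num_hashes: int = 8, bits: int = 6):
        self.planes = make_planes(dim, num_hashes, bits)
        self.postings = defaultdict(set)  # term -> ids (inverted index)
        self.features: Dict[str, List[float]] = {}

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        self.features[doc_id] = list(vector)
        for term in hash_terms(vector, self.planes):
            self.postings[term].add(doc_id)

    def search(self, query: Sequence[float], k: int = 3,
               num_candidates: int = 10) -> List[str]:
        # Step 1: only the postings of the query's terms are visited.
        votes = defaultdict(int)
        for term in hash_terms(query, self.planes):
            for doc_id in self.postings.get(term, ()):
                votes[doc_id] += 1
        candidates = sorted(votes, key=lambda d: (-votes[d], d))
        candidates = candidates[:num_candidates]

        # Step 2 (assumed): re-rank candidates by exact distance.
        def dist(doc_id: str) -> float:
            return sum((a - b) ** 2
                       for a, b in zip(self.features[doc_id], query))

        return sorted(candidates, key=lambda d: (dist(d), d))[:k]
```

Because the hash codes behave like ordinary terms, they can live in the same index as the textual fields of a document, which is the piggy-backing idea described above.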
2.2.2 Media transcoding
3.