A System for Video Recommendation using Visual Saliency, Crowdsourced and Automatic Annotations
Andrea Ferracani, Daniele Pezzatini, Marco Bertini, Saverio Meucci, Alberto Del Bimbo
MICC - University of Florence, Italy

The Project
In this demo we present a system for content-based video recommendation that exploits visual saliency to better represent video features and content. Visual saliency is used to select relevant frames to be presented in a web-based interface for tagging and annotating video frames in a social network; it is also employed to summarize video content, to create a more effective video representation used in the recommender system.

The system exploits automatic annotations, produced by CNN-based classifiers on salient frames, together with user-generated annotations. Users can share and annotate videos at the frame level using concepts derived from Wikipedia. All these concepts are clustered into 54 categories using Fuzzy K-Means, in a two-level taxonomy of interests, and classified using a semantic distance with a nearest-neighbour approach.

A dataset has been collected by hiring 812 workers from the Microworkers web site. The dataset is composed of 692 videos, of which 458 were annotated with 1,956 comments and 4,002 annotations.
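The paper does not give implementation details for the clustering step, so the following is only a minimal sketch of fuzzy c-means (the algorithm behind "Fuzzy K-Means") in plain NumPy, with 2-D toy points standing in for concept representations; the function name, parameters, and data are illustrative assumptions, not taken from the system.

```python
import numpy as np

def fuzzy_kmeans(X, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Minimal fuzzy c-means: returns cluster centers and the soft
    membership matrix U (one row per point, one column per cluster)."""
    centers = init_centers.astype(float).copy()
    for _ in range(n_iter):
        # distance of every point to every center, shape (n_points, n_clusters)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # soft memberships, rows sum to 1
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, U

# toy 2-D "concept embeddings": two well-separated groups of 20 concepts
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),   # concepts of one category
               rng.normal(5.0, 0.1, (20, 2))])  # concepts of another
centers, U = fuzzy_kmeans(X, init_centers=X[[0, -1]])
labels = U.argmax(axis=1)   # hard category of each concept
```

The soft memberships in `U` are what distinguish fuzzy c-means from plain k-means: a concept can belong partially to several categories, and only the `argmax` step above turns that into a hard assignment.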
613 videos were rated by 950 of 1,059 total network users.

We evaluate the performance of the proposed recommender, in terms of RMSE, comparing it to several baselines:
1) a standard item-based recommender, that considers user ratings for all the items of the system;
2) a recommender working on a selection of items, based on similarity computed using item categories only (no BoW content description);
3) a recommender working on a selection of items, based on content similarity (i.e., automatic annotations) computed on randomly selected frames;
4) a recommender working on a selection of items, based on content similarity computed on frames with visual saliency score above the mean;
5) a recommender working on a selection of items, based on content similarity computed on the frames selected at the saliency peaks by the crest detection algorithm.

Visual saliency is used at the interface level to propose possible frames of interest to the users, through a carousel above the video player, to ease the addition of comments and annotations. At the automatic annotation level, visual saliency is used to reduce the computational cost of processing all the frames. The salient frames used in the system interface are selected by detecting the peaks of saliency of the video with a crest detection algorithm.

Videos are classified by evaluating, for each category of the taxonomy, the semantic distance of each annotation to the categories. Annotations can have been added manually or extracted automatically. The semantic relatedness between concepts is obtained using the Web Link-Based Measure, using Wikification.

Visual features: video frames are subsampled according to their visual saliency, since saliency allows a targeted selection of frames, letting the system scale while maintaining a reasonably dense sampling of video content. The convolutional network is implemented using the libccv library and is trained on the ImageNet ILSVRC 2014 dataset to detect 1,000 synsets.
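The paper does not specify which crest detection algorithm selects the saliency peaks; a common way to realise this step is local-maximum detection on the per-frame saliency curve, sketched below with `scipy.signal.find_peaks` on a synthetic curve (the curve, thresholds, and minimum peak spacing are assumptions for illustration).

```python
import numpy as np
from scipy.signal import find_peaks

# synthetic per-frame saliency curve for a 300-frame video: three bumps
t = np.arange(300)
saliency = (1.0 * np.exp(-0.5 * ((t - 50) / 10.0) ** 2)
            + 0.8 * np.exp(-0.5 * ((t - 150) / 12.0) ** 2)
            + 0.9 * np.exp(-0.5 * ((t - 240) / 8.0) ** 2))

# keep only clear crests: local maxima above the mean saliency,
# at least 20 frames apart
peaks, _ = find_peaks(saliency, height=saliency.mean(), distance=20)
keyframes = peaks.tolist()   # frame indices to propose in the carousel
```

The `height` and `distance` constraints play the role described in the text: they discard minor fluctuations and keep a sparse set of representative frames, which bounds both the interface carousel size and the number of frames the CNN must process.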
Video content is represented using a Bag-of-Words approach applied to the 1,000 synsets, selecting for each video the probabilities that obtained a score above a threshold. The recommender then selects, per category, items similar to a user's already-rated items to generate the list of recommended videos. Videos are represented using a feature vector that concatenates the histogram of the categories of the manual comments and the BoW description obtained using the CNN classifier on the most salient frames.
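The descriptor construction and the item-based step can be sketched as follows; the helper names, the threshold value, and the tiny toy vectors are illustrative assumptions (the real system uses 54 categories and 1,000 synsets), but the structure mirrors the text: a category histogram concatenated with a thresholded synset BoW, compared by similarity.

```python
import numpy as np

def video_descriptor(category_hist, synset_probs, threshold=0.1):
    """Concatenate the histogram of comment categories with the CNN synset
    BoW, keeping only probabilities above the threshold; L2-normalise."""
    bow = np.where(synset_probs >= threshold, synset_probs, 0.0)
    v = np.concatenate([category_hist, bow]).astype(float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def cosine(a, b):
    return float(a @ b)   # descriptors are already L2-normalised

# toy videos: 3 comment-category counts + 5 synset probabilities each
v1 = video_descriptor(np.array([2.0, 0.0, 1.0]), np.array([0.7, 0.05, 0.2, 0.0, 0.0]))
v2 = video_descriptor(np.array([1.0, 0.0, 1.0]), np.array([0.6, 0.0, 0.3, 0.02, 0.0]))
v3 = video_descriptor(np.array([0.0, 5.0, 0.0]), np.array([0.0, 0.9, 0.0, 0.0, 0.0]))

# item-based step: rank candidate videos by similarity to a video
# the user has already rated highly (here: v1)
sims = sorted([("v2", cosine(v1, v2)), ("v3", cosine(v1, v3))],
              key=lambda p: p[1], reverse=True)
```

Thresholding the synset probabilities keeps the BoW part sparse, so videos sharing no dominant synsets and no comment categories (like `v1` and `v3` here) end up with zero similarity and are never recommended for each other.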
