
A Hierarchical Attention Model for Social Contextual Image Recommendation
A mini project report submitted to

GANDHI INSTITUTE OF TECHNOLOGY AND MANAGEMENT (GITAM)

In partial fulfillment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY

IN

COMPUTER SCIENCE AND ENGINEERING

Submitted by

YASHWIN SANGHI 2210316164

A SAI PAVITHRAN 2210316104

Under the guidance of

Mr. Raj Mohammed

Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF TECHNOLOGY

GANDHI INSTITUTE OF TECHNOLOGY AND MANAGEMENT (GITAM)

(Declared as Deemed-to-be-University u/s 3 of UGC Act 1956)

HYDERABAD CAMPUS

OCTOBER-2020

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF TECHNOLOGY

GANDHI INSTITUTE OF TECHNOLOGY AND MANAGEMENT (GITAM)

(Declared as Deemed-to-be-University u/s 3 of UGC Act 1956)

HYDERABAD CAMPUS

DECLARATION

We hereby declare that the mini project report “A Hierarchical Attention Model for Social Contextual Image Recommendation” submitted to
GITAM (Deemed to be University), Hyderabad in partial fulfillment of the requirements for
the award of the degree of “Bachelor of Technology” in “Computer Science and
Engineering” is an original work carried out by us and has not been submitted earlier to this
or any other university.

Name: Pin:

Yashwin Sanghi 2210316164

Mohd. Shakeeb Uddin 2210316132

A Sai Pavithran 2210316104


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF TECHNOLOGY

GANDHI INSTITUTE OF TECHNOLOGY AND MANAGEMENT (GITAM)

(Declared as Deemed-to-be-University u/s 3 of UGC Act 1956)

HYDERABAD CAMPUS

CERTIFICATE

This is to certify that the project report “A Hierarchical Attention Model for Social Contextual Image Recommendation” is


submitted by Yashwin Sanghi (2210316164), Mohd Shakeeb Uddin (2210316132) and A
Sai Pavithran (2210316104) in partial fulfillment of the requirements for the award of the
degree of Bachelor of Technology in Computer Science and Engineering. The project report
has been approved as it satisfies the academic requirements in respect of project work
prescribed for the said degree.

Mr Raj Mohammed Mrs Aruna Sri Prof S Phani Kumar

(Project Guide) (Project Coordinator) (Head of the Department)

Mrs. G Sri Sowmya Mrs. D Vijay Lakshmi Mrs. P Manasa


(Panel Member) (Panel Member) (Panel Member)

ACKNOWLEDGMENT

Our project would not have been successful without the help of several people. We would
like to thank everyone who was part of our project in numerous ways and who gave us
outstanding support from the very beginning of the project.

We are extremely thankful to our honorable Pro-Vice-Chancellor, Prof. N. Siva Prasad


for providing necessary infrastructure and resources for the accomplishment of our project.
We are highly indebted to Prof. N. Seetharamaiah, Principal, School of Technology, for his
support during the tenure of the project.

We are very much obliged to our beloved Prof. S. Phani Kumar, Head of the
Department of Computer Science & Engineering for providing the opportunity to undertake
this project and encouragement in the completion of this project.

We hereby wish to express our deep sense of gratitude to Mr. Raj Mohammed,
Assistant Professor, Department of Computer Science and Engineering, School of
Technology for the esteemed guidance, moral support and invaluable advice provided by him
for the success of the project.

We are also thankful to all the staff members of the Computer Science and
Engineering department who have co-operated in making our project a success. We would
like to thank all our parents and friends who extended their help, encouragement and moral
support either directly or indirectly in our project work.

Sincerely,

Yashwin Sanghi 2210316164

Mohd Shakeeb Uddin 2210316132

A Sai Pavithran 2210316104



CONTENTS

Declaration...................................................................................................................i

Certificate.....................................................................................................................ii

Acknowledgment..........................................................................................................iii

Contents........................................................................................................................iv

List of Figures...............................................................................................................vii

Abstract.........................................................................................................................1

1. Introduction..............................................................................................................2

1.1 Objective of the Project........................................................................................3

1.2 Limitations............................................................................................................3

1.3 Organization of the Report...................................................................................4

2. Literature Survey......................................................................................................5

2.1 Analysis of Available Web Applications.............................................................5

2.2 Analysis of Available Mobile Applications.........................................................6

2.3 User Perspective...................................................................................................8

2.4 Developer’s Perspective.......................................................................................9

3. System Analysis.......................................................................................................11

3.1 Existing Systems..................................................................................................11

3.1.1 Web Applications............................................................................................11

3.1.2 Disadvantages..................................................................................................11

3.1.3 Mobile Applications........................................................................................11

3.1.4 Disadvantages....................................................................................................11

3.2 Proposed System....................................................................................................12

3.2.1 Scope.................................................................................................................12

3.2.2 Advantages........................................................................................................14

3.3 Software Requirement Specification......................................................................14

3.3.1 Software Requirements......................................................................................15

3.3.2 Hardware Requirements.....................................................................................15

4. System Design..........................................................................................................16

4.1 Introduction...........................................................................................................17

4.2 System Design.......................................................................................................18

4.3 Architecture Design...............................................................................................19

4.4 Detailed Design.....................................................................................................23

5. Implementation.........................................................................................................26

5.1 User Interface Development.................................................................................26

5.2 Backend Connections...........................................................................................28

5.3 Tournament Tree..................................................................................................30

6. Testing......................................................................................................................30

6.1 Introduction...........................................................................................................30

6.1.1 Types of Testing................................................................................................30

6.1.1.1 Manual Testing.............................................................................................30


6.1.1.2 Automation Testing......................................................................................30

6.1.2 Testing Methods..................................................................................................31

6.1.2.1 Static Testing..................................................................................................31

6.1.2.2 Dynamic Testing............................................................................................31

6.1.3 Testing Approaches.............................................................................................31

6.1.3.1 White Box Testing..........................................................................................31

6.1.3.2 Black Box Testing..........................................................................................31

6.1.3.3 Grey Box Testing...........................................................................................31

6.1.4 Testing Levels.....................................................................................................32

6.1.4.1 Unit Testing....................................................................................................32

6.1.4.2 Integration Testing..........................................................................................32

6.1.4.3 System Testing...............................................................................................32

6.1.4.4 Acceptance Testing........................................................................................32

6.2 Test Cases...............................................................................................................33

7. Result Analysis.........................................................................................................36

7.1 Home Page............................................................................................................36

7.2 Freestyle Mode.....................................................................................................36

7.3 Tournament Mode................................................................................................39

8. Conclusion................................................................................................................43

References....................................................................................................................44

LIST OF FIGURES

Fig 2.1 User Interface Of A Well-Known Web Application “www.competize.com” 5

Fig 2.2 Warnings Of Module Failures When on a Low Network on the Web Application 6

Fig 2.3 Football Score Keeper - Vladimir Tumbev An Android Application 7

Fig 2.4 Timer As A Key Feature Of The Application 7

Fig 2.5 Working Procedure of Divide and Conquer Algorithm 7

Fig 4.1 A Simple Software Design Flow 15

Fig 4.2 Class diagram for Football Tournament Track 18

Fig 4.3 Initial Steps 19

Fig 4.4 Freeplay Mode 20

Fig 4.5 Tournament Mode 21

Fig 4.6 Create New Tournament 21

Fig 4.7 Continue Tournament 22

Fig 4.8 Team Creation process 22

Fig 4.9 Tournament Tree as a Max Heap 23

Fig 7.1 Home Page 36

Fig 7.2 Freestyle Welcome Page 36

Fig 7.3 Choose Team Page 37

Fig 7.4 New Team Page 37

Fig 7.5 Score Tracking Module 38



Fig 7.6 New Tournament Start Screen 39

Fig 7.7 Team Information Input Screen 39

Fig 7.8 Tournament Fixture 40

Fig 7.9 Score Tracking 40

Fig 7.10 Penalty 41

Fig 7.11 Score Board aka Tournament Table 41

Fig 7.12 Continue Tournament 42


ABSTRACT

 Image-based social networks are among the most popular social networking services of
recent years. With a tremendous number of images uploaded every day, understanding users’
preferences on user-generated images and making recommendations have become an
urgent need. In fact, many hybrid models have been proposed to fuse various kinds of
side information (e.g., image visual representation, social network) and user-item
historical behavior for enhancing recommendation performance. However, due to the
unique characteristics of user-generated images on social image platforms, previous
studies failed to capture the complex aspects that influence users’ preferences in
a unified framework. Moreover, most of these hybrid models relied on predefined weights
in combining different kinds of information, which usually resulted in sub-optimal
recommendation performance. To this end, in this paper, we develop a hierarchical
attention model for social contextual image recommendation. In addition to the basic latent
user interest modeling of popular matrix factorization based recommendation, we
identify three key aspects (i.e., upload history, social influence, and owner admiration)
that affect each user’s latent preferences, where each aspect summarizes a contextual
factor from the complex relationships between users and images. After that, we design a
hierarchical attention network that naturally mirrors the hierarchical relationship
(elements within each aspect, and the aspect level) of users’ latent interests with the
identified key aspects. Specifically, by taking embeddings from state-of-the-art deep
learning models that are tailored for each kind of data, the hierarchical attention network
can learn to attend differently to more or less important content. Finally, extensive
experimental results on real-world datasets clearly show the superiority of our proposed model.

CHAPTER 1

INTRODUCTION

 Naturally, standard recommendation algorithms provide a direct solution for the image
recommendation task. For example, many classical latent factor based Collaborative
Filtering (CF) algorithms in recommender systems could be applied to the user-image
interaction matrix. Successful as they are, the extreme data sparsity of user-image
interaction behavior limits recommendation performance [2], [26]. On one hand, some
recent works proposed to enhance recommendation performance with visual contents learned
from a (pre-trained) deep neural network [18], [49], [5]. On the other hand, as users express
their image preferences on social platforms, some social based recommendation algorithms
utilized the social influence among users to alleviate data sparsity for better
recommendation [33], [24], [3]. In summary, these studies partially solved the data sparsity
issue of social-based image recommendation. Nevertheless, the problem of how to better
exploit the unique characteristics of social image platforms in a holistic way to enhance
recommendation performance is still underexplored. In this paper, we study the problem of
understanding users’ preferences for images and recommending images on social image based
platforms. Fig. 1 shows an example of a typical social image application. Each image is
associated with visual information. Besides expressing their liking for images, users are
also the creators of these images through their upload behavior. In addition, users connect
with others to form a social network and share their image preferences. This rich
heterogeneous contextual data provides valuable clues to infer users’ preferences for
images. Given such rich heterogeneous contextual data, the problem of how to summarize the
heterogeneous social contextual aspects that influence users’ preferences for this highly
subjective content is still unclear. What is more, in the preference decision process,
different users care about different social contextual aspects for their personalized image
preferences.
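
To make the latent factor idea above concrete, the following is a minimal sketch of matrix factorization based collaborative filtering on a toy user-image interaction matrix; the matrix values, latent dimension and learning rate are purely illustrative and are not the settings used by the proposed model.

import numpy as np

# Toy user-image interaction matrix (1 = liked, 0 = unobserved); values are illustrative.
R = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 0, 0]], dtype=float)
num_users, num_items = R.shape
k = 2                 # latent dimension (illustrative)
lr, reg = 0.05, 0.01  # learning rate and regularization strength

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(num_users, k))   # user latent factors
V = rng.normal(scale=0.1, size=(num_items, k))   # image latent factors

for _ in range(200):                              # SGD over the observed entries
    for u, i in zip(*R.nonzero()):
        err = R[u, i] - U[u] @ V[i]
        grad_u = err * V[i] - reg * U[u]
        grad_v = err * U[u] - reg * V[i]
        U[u] += lr * grad_u
        V[i] += lr * grad_v

# Predicted preference of user 0 for every image: a higher score means stronger predicted interest.
print(U[0] @ V.T)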

1.1 Objective of the Project

 Nowadays, all social networking sites support image upload and sharing with other users. To
let users share images, various social networking sites have used various recommendation
techniques, such as content-based recommendation (based on past history), collaborative
recommendation (based on the similarity between a user and his or her friends), personalized
recommendation, and so on. These earlier techniques did not use complex social aspects such as
upload history, social influence, and owner admiration. By using these three key aspects we can
capture the contextual relationships between users and images, which helps produce accurate,
relationship-aware recommendations. A hierarchical attention model can be built by combining
the three key aspects with a Convolutional Neural Network (CNN), where the CNN provides the
visual representation of each image and the three key aspects represent the user’s upload
history, social influence, and owner admiration.
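
As an illustration of the CNN part of this idea, the sketch below extracts a visual embedding for one image from a pretrained network. It is only an assumption-level example using tf.keras with MobileNetV2 and a hypothetical file name example.jpg; the implementation shown in Chapter 5 instead uses the TensorFlow 1.x Inception-v3 bottleneck script.

import numpy as np
import tensorflow as tf

# Pretrained CNN used as a fixed feature extractor (downloads ImageNet weights on first use).
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg',
                                        weights='imagenet')

# 'example.jpg' is a hypothetical file name used only for illustration.
img = tf.keras.preprocessing.image.load_img('example.jpg', target_size=(224, 224))
x = tf.keras.preprocessing.image.img_to_array(img)
x = tf.keras.applications.mobilenet_v2.preprocess_input(x[np.newaxis, ...])

embedding = cnn.predict(x)   # shape (1, 1280): the image's visual feature vector
print(embedding.shape)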

1.2 Limitations


1.3 Organization of the Report

The report has been organized into different chapters. The subjects the different chapters
cover are stated below.

 Chapter 2: Literature Survey

This chapter deals with the study of the existing applications. The pros
and cons of the existing applications are carefully examined. The user
expectations from the system, their experiences and the overall perception of
the system are stated. The developer’s perceptions are also recorded.
 Chapter 3: System Analysis

The existing systems are given a deeper study from the developer’s view.
The infrastructure, the logic and the implementation methods are analyzed. The
drawbacks and the problems faced by the developers during development were
studied in depth. The scope of our application was determined. The advantages
of using our application were realised and the software requirement specifications
were hence concluded.

 Chapter 4: System Design

The proposed system details are stated in this chapter. The infrastructure
and the workflow of the application were discussed. The algorithms to be used
are also stated here with their pros and cons.

 Chapter 5: Implementations

The implementation details are explained in this chapter. The system
design is converted into code and the features are explained.

 Chapter 6: Testing

The developed application is rigorously tested. The metrics that are used
for testing the application are recorded. The expected and the actual behaviour
of the application are reported.

 Chapter 7: Result Analysis

This chapter presents the deployment of the application in the form of
screenshots and explains the navigation.

CHAPTER 2

LITERATURE SURVEY

2.1 Literature Survey

 [1] Flickr Statistics. https://expandedramblings.com/index.php/flickr-stats/, 2017.


[Online; accessed 20-Jan-2018].
 [2] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender
systems: A survey of the state-of-the-art and possible extensions. TKDE, 17(6):734–749,
2005.
 [3] A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social
networks. In KDD, pages 7–15. ACM, 2008.
 [4] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning
to align and translate. In ICLR, 2015.
 [5] J. Chen, H. Zhang, X. He, L. Nie, W. Liu, and T.-S. Chua. Attentive collaborative
filtering: Multimedia recommendation with item- and component-level attention. In
SIGIR, pages 335–344. ACM, 2017.
 [6] T. Chen, X. He, and M.-Y. Kan. Context-aware image tweet modelling and
recommendation. In MM, pages 1018–1027. ACM, 2016.
 [7] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y. Zheng. NUS-WIDE: a real-world
web image database from National University of Singapore. In MM, page 48. ACM, 2009.

CHAPTER 3

SYSTEM ANALYSIS

3.1 Existing System

 We study the problem of image recommendation in social image based platforms. By
considering the uniqueness of these platforms, we identify three social contextual aspects that
affect users’ preferences from heterogeneous data sources. We design a hierarchical attention
network to model the hierarchical structure of social contextual recommendation. Into these
attention networks we feed embeddings from state-of-the-art deep learning models that are
tailored for each kind of data, so the attention networks can learn to attend differently based
on the rich contextual information for user interest modeling. We conduct extensive
experiments on real-world datasets, and the experimental results clearly show the
effectiveness of our proposed model.

3.2 Proposed System

 In this paper, we design a hierarchical attention model for social image recommendation. The
proposed model is built on the popular latent factor based models, which assume users and
items can be projected into a low-dimensional latent space [34]. In our proposed model, for each
user, in addition to the basic latent user interest vector, we identify three key aspects (i.e.,
upload history, social influence and owner admiration) that affect each user’s preference, where
each aspect summarizes a contextual factor from the complex relationships between users and images.
 HASC is a hierarchical neural network that models users’ preferences for unknown images
from two attention levels with social contextual modeling. The top-layer attention network
depicts the importance of the three contextual aspects (i.e., upload history, social influence
and owner admiration) for a user’s decision, and it is derived from the bottom-layer attention
networks that aggregate the complex elements within each aspect. Given a user a and an
image i with the three identified social contextual aspects, we use γal (l = 1, 2, 3) to denote a’s
attentive degree for aspect l on the top layer (denoted as the aspect importance attention).
A large attentive degree means the current user cares more about this aspect in the image
recommendation process.
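
A minimal sketch of this aspect-importance attention is shown below. The dimensions and parameters are illustrative and randomly initialised; it only mirrors the top attention layer that produces the attentive degrees γal, not the full HASC network.

import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

d = 8                        # latent dimension (illustrative)
rng = np.random.default_rng(1)

user = rng.normal(size=d)             # basic latent interest vector of user a
aspects = rng.normal(size=(3, d))     # summaries of upload history, social influence, owner admiration
W = rng.normal(size=(d, 2 * d))       # attention parameters (randomly initialised here)
b = rng.normal(size=d)
h = rng.normal(size=d)

# Aspect-importance attention: score each aspect against the user, then normalise
# so that the three attentive degrees gamma_al sum to one.
scores = np.array([h @ np.tanh(W @ np.concatenate([user, a]) + b) for a in aspects])
gamma = softmax(scores)               # attentive degrees for the three aspects
context = gamma @ aspects             # aspect-aware contextual representation
augmented_user = user + context       # augmented user interest used to score candidate images
print(gamma)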

Advantages

 We designed a hierarchical attention network that naturally reflects the hierarchical
relationship of users’ interests given the three identified aspects. Meanwhile, by feeding in the
data embeddings from rich heterogeneous data sources, the hierarchical attention networks can
learn to attend differently to more or less important content ….

3.2.1 Scope

 We designed a hierarchical attention network that naturally mirrored the hierarchical


relationship of users’ interest given the three identified aspects. In the meantime, by feeding the
data embedding from rich heterogeneous data sources, the hierarchical attention networks could
learn to attend differently to more or less important content. Extensive experiments on real-
world datasets clearly demonstrated that our proposed HASC model consistently outperforms
various state-of-the-art baselines for image recommendation.
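
The exact evaluation metrics of those experiments are not reproduced in this report, but top-K recommendation quality is commonly measured with Hit Ratio and NDCG; the short sketch below shows how such metrics can be computed for one user with a single held-out positive image.

import numpy as np

def hit_ratio_at_k(ranked_items, relevant_item, k=10):
    # 1 if the held-out relevant image appears in the top-k list, else 0.
    return int(relevant_item in ranked_items[:k])

def ndcg_at_k(ranked_items, relevant_item, k=10):
    # Discounted gain of the single held-out relevant image; 0 if it is outside the top-k.
    if relevant_item in ranked_items[:k]:
        rank = ranked_items[:k].index(relevant_item)
        return 1.0 / np.log2(rank + 2)
    return 0.0

# Illustrative usage: image 7 is the held-out positive image for one user.
ranking = [3, 7, 12, 5, 9]
print(hit_ratio_at_k(ranking, 7, k=5), ndcg_at_k(ranking, 7, k=5))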

3.3 Software Requirement Specification

A software requirements specification (SRS) is a detailed description of a software system to
be developed, including its functional and non-functional requirements. The SRS is developed
based on the agreement between customers and contractors. It may include use cases describing
how the user is going to interact with the software system. The SRS document consists of all
the requirements necessary for project development. To develop the software system we should
have a clear understanding of it; to achieve this, we need to communicate continuously with
customers to gather all the requirements.

A good SRS defines how the software system will interact with all internal modules, hardware,
and other programs, and how human users will interact with it across a wide range of real-life
scenarios. Using the SRS document, QA leads and managers create a test plan. It is very
important that testers are clear about every detail specified in this document, in order to
avoid faults in test cases and their expected results.
3.3.1 Software Requirements

Programming Language : Python 3.6

Graphical User Interface: HTML5, CSS3 with Bootstrap, JavaScript

Dataset : PlantVillage Dataset

Packages : Tensorflow, Numpy, Pandas, Matplotlib, Scikit-learn

Framework : Django

Tool : Jupyter Notebook

3.3.2 Hardware Requirements

Operating System: Windows 10

Processor : Intel Core i3-2348M

CPU Speed : 2.30 GHz

Memory : 2 GB (RAM)
CHAPTER 4

SYSTEM DESIGN

4.1 Introduction

The design phase of software development deals with transforming the customer
requirements into a form implementable using a programming language. The software design
process can be divided into the following three levels of design:

 System Design
 Architectural Design
 Detailed Design

• System architecture is the conceptual model which defines a system's structure, behavior,
and views. A description of an architecture is a systematic description and
representation of a system, structured in a way that facilitates reasoning about system
mechanisms and behaviors. A system architecture can consist of system components
and the established sub-systems that work together ….
This chapter provides the design phase of the application. To design the project, we use
UML diagrams. The Unified Modelling Language (UML) is a general-purpose, developmental
modelling language in the field of software engineering that is intended to provide a standard
way to visualize the design of a system.

USE CASE DIAGRAM

Fig Use case Diagram

The use case diagram is used to represent all the functional use cases that
are involved in the project.
The above diagram represents the main actors in the project; they are
 User
CLASS DIAGRAM

Fig class diagram

The above class diagram represents the system workflow model. This diagram has class models
with class names such as
 User
 Home screen
SEQUENCE DIAGRAM

Fig Sequence Diagram

The above diagram represents the sequence of flow of actions in the


system.
Activity Diagram:

Component Diagram:
CHAPTER 5

IMPLEMENTATIONS

5.1 User Interface Development

User interface (UI) design and development is the process of making interfaces in
software or computerized devices with a focus on looks or style. Designers aim to create
designs users will find easy to use and pleasurable. UI design typically refers to graphical
user interfaces but also includes others, such as voice-controlled ones.

Sample Code:

retrain.py
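
The script below is the standard TensorFlow transfer-learning (retraining) script: it downloads a pretrained Inception-v3 or MobileNet graph, computes cached “bottleneck” feature vectors for every image found under --image_dir, trains a new final softmax layer on top of those features, logs training and validation accuracy to TensorBoard, and finally writes out the retrained graph and its label file.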

from __future__ import absolute_import


from __future__ import division
from __future__ import print_function

import argparse
import collections
from datetime import datetime
import hashlib
import os.path
import random
import re
import sys
import tarfile

import numpy as np
from six.moves import urllib
import tensorflow as tf

from tensorflow.python.framework import graph_util


from tensorflow.python.framework import tensor_shape
from tensorflow.python.platform import gfile
from tensorflow.python.util import compat

FLAGS = None
MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1 # ~134M

def create_image_lists(image_dir, testing_percentage, validation_percentage):


if not gfile.Exists(image_dir):
tf.logging.error("Image directory '" + image_dir + "' not found.")
return None
result = collections.OrderedDict()
sub_dirs = [
os.path.join(image_dir,item)
for item in gfile.ListDirectory(image_dir)]
sub_dirs = sorted(item for item in sub_dirs
if gfile.IsDirectory(item))
for sub_dir in sub_dirs:
extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
file_list = []
dir_name = os.path.basename(sub_dir)
if dir_name == image_dir:
continue
tf.logging.info("Looking for images in '" + dir_name + "'")
for extension in extensions:
file_glob = os.path.join(image_dir, dir_name, '*.' + extension)
file_list.extend(gfile.Glob(file_glob))
if not file_list:
tf.logging.warning('No files found')
continue
if len(file_list) < 20:
tf.logging.warning(
'WARNING: Folder has less than 20 images, which may cause issues.')
elif len(file_list) > MAX_NUM_IMAGES_PER_CLASS:
tf.logging.warning(
'WARNING: Folder {} has more than {} images. Some images will '
'never be selected.'.format(dir_name, MAX_NUM_IMAGES_PER_CLASS))
label_name = re.sub(r'[^a-z0-9]+', ' ', dir_name.lower())
training_images = []
testing_images = []
validation_images = []
for file_name in file_list:
base_name = os.path.basename(file_name)
hash_name = re.sub(r'_nohash_.*$', '', file_name)
hash_name_hashed = hashlib.sha1(compat.as_bytes(hash_name)).hexdigest()
percentage_hash = ((int(hash_name_hashed, 16) %
(MAX_NUM_IMAGES_PER_CLASS + 1)) *
(100.0 / MAX_NUM_IMAGES_PER_CLASS))
if percentage_hash < validation_percentage:
validation_images.append(base_name)
elif percentage_hash < (testing_percentage + validation_percentage):
testing_images.append(base_name)
else:
training_images.append(base_name)
result[label_name] = {
'dir': dir_name,
'training': training_images,
'testing': testing_images,
'validation': validation_images,
}
return result

def get_image_path(image_lists, label_name, index, image_dir, category):


if label_name not in image_lists:
tf.logging.fatal('Label does not exist %s.', label_name)
label_lists = image_lists[label_name]
if category not in label_lists:
tf.logging.fatal('Category does not exist %s.', category)
category_list = label_lists[category]
if not category_list:
tf.logging.fatal('Label %s has no images in the category %s.',
label_name, category)
mod_index = index % len(category_list)
base_name = category_list[mod_index]
sub_dir = label_lists['dir']
full_path = os.path.join(image_dir, sub_dir, base_name)
return full_path

def get_bottleneck_path(image_lists, label_name, index, bottleneck_dir,


category, architecture):
return get_image_path(image_lists, label_name, index, bottleneck_dir,
category) + '_' + architecture + '.txt'

def create_model_graph(model_info):
with tf.Graph().as_default() as graph:
model_path = os.path.join(FLAGS.model_dir, model_info['model_file_name'])
with gfile.FastGFile(model_path, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
bottleneck_tensor, resized_input_tensor = (tf.import_graph_def(
graph_def,
name='',
return_elements=[
model_info['bottleneck_tensor_name'],
model_info['resized_input_tensor_name'],
]))
return graph, bottleneck_tensor, resized_input_tensor

def run_bottleneck_on_image(sess, image_data, image_data_tensor,


decoded_image_tensor, resized_input_tensor,
bottleneck_tensor):
resized_input_values = sess.run(decoded_image_tensor,
{image_data_tensor: image_data})
bottleneck_values = sess.run(bottleneck_tensor,
{resized_input_tensor: resized_input_values})
bottleneck_values = np.squeeze(bottleneck_values)
return bottleneck_values

def maybe_download_and_extract(data_url):
dest_directory = FLAGS.model_dir
if not os.path.exists(dest_directory):
os.makedirs(dest_directory)
filename = data_url.split('/')[-1]
filepath = os.path.join(dest_directory, filename)
if not os.path.exists(filepath):

def _progress(count, block_size, total_size):


sys.stdout.write('\r>> Downloading %s %.1f%%' %
(filename,
float(count * block_size) / float(total_size) * 100.0))
sys.stdout.flush()

filepath, _ = urllib.request.urlretrieve(data_url, filepath, _progress)


print()
statinfo = os.stat(filepath)
tf.logging.info('Successfully downloaded', filename, statinfo.st_size,
'bytes.')
tarfile.open(filepath, 'r:gz').extractall(dest_directory)

def ensure_dir_exists(dir_name):
if not os.path.exists(dir_name):
os.makedirs(dir_name)

bottleneck_path_2_bottleneck_values = {}

def create_bottleneck_file(bottleneck_path, image_lists, label_name, index,


image_dir, category, sess, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor):
tf.logging.info('Creating bottleneck at ' + bottleneck_path)
image_path = get_image_path(image_lists, label_name, index,
image_dir, category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
image_data = gfile.FastGFile(image_path, 'rb').read()
try:
bottleneck_values = run_bottleneck_on_image(
sess, image_data, jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor)
except Exception as e:
raise RuntimeError('Error during processing file %s (%s)' % (image_path,
str(e)))
bottleneck_string = ','.join(str(x) for x in bottleneck_values)
with open(bottleneck_path, 'w') as bottleneck_file:
bottleneck_file.write(bottleneck_string)

def get_or_create_bottleneck(sess, image_lists, label_name, index, image_dir,


category, bottleneck_dir, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor, architecture):
label_lists = image_lists[label_name]
sub_dir = label_lists['dir']
sub_dir_path = os.path.join(bottleneck_dir, sub_dir)
ensure_dir_exists(sub_dir_path)
bottleneck_path = get_bottleneck_path(image_lists, label_name, index,
bottleneck_dir, category, architecture)
if not os.path.exists(bottleneck_path):
create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
image_dir, category, sess, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor)
with open(bottleneck_path, 'r') as bottleneck_file:
bottleneck_string = bottleneck_file.read()
did_hit_error = False
try:
bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
except ValueError:
tf.logging.warning('Invalid float found, recreating bottleneck')
did_hit_error = True
if did_hit_error:
create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
image_dir, category, sess, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor)
with open(bottleneck_path, 'r') as bottleneck_file:
bottleneck_string = bottleneck_file.read()
bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
return bottleneck_values

def cache_bottlenecks(sess, image_lists, image_dir, bottleneck_dir,


jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor, architecture):
how_many_bottlenecks = 0
ensure_dir_exists(bottleneck_dir)
for label_name, label_lists in image_lists.items():
for category in ['training', 'testing', 'validation']:
category_list = label_lists[category]
for index, unused_base_name in enumerate(category_list):
get_or_create_bottleneck(
sess, image_lists, label_name, index, image_dir, category,
bottleneck_dir, jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor, architecture)

how_many_bottlenecks += 1
if how_many_bottlenecks % 100 == 0:
tf.logging.info(
str(how_many_bottlenecks) + ' bottleneck files created.')

def get_random_cached_bottlenecks(sess, image_lists, how_many, category,


bottleneck_dir, image_dir, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor, architecture):
class_count = len(image_lists.keys())
bottlenecks = []
ground_truths = []
filenames = []
if how_many >= 0:
for unused_i in range(how_many):
label_index = random.randrange(class_count)
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
image_name = get_image_path(image_lists, label_name, image_index,
image_dir, category)
bottleneck = get_or_create_bottleneck(
sess, image_lists, label_name, image_index, image_dir, category,
bottleneck_dir, jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor, architecture)
ground_truth = np.zeros(class_count, dtype=np.float32)
ground_truth[label_index] = 1.0
bottlenecks.append(bottleneck)
ground_truths.append(ground_truth)
filenames.append(image_name)
else:
for label_index, label_name in enumerate(image_lists.keys()):
for image_index, image_name in enumerate(
image_lists[label_name][category]):
image_name = get_image_path(image_lists, label_name, image_index,
image_dir, category)
bottleneck = get_or_create_bottleneck(
sess, image_lists, label_name, image_index, image_dir, category,
bottleneck_dir, jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor, architecture)
ground_truth = np.zeros(class_count, dtype=np.float32)
ground_truth[label_index] = 1.0
bottlenecks.append(bottleneck)
ground_truths.append(ground_truth)
filenames.append(image_name)
return bottlenecks, ground_truths, filenames

def get_random_distorted_bottlenecks(
sess, image_lists, how_many, category, image_dir, input_jpeg_tensor,
distorted_image, resized_input_tensor, bottleneck_tensor):
class_count = len(image_lists.keys())
bottlenecks = []
ground_truths = []
for unused_i in range(how_many):
label_index = random.randrange(class_count)
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
image_path = get_image_path(image_lists, label_name, image_index, image_dir,
category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
jpeg_data = gfile.FastGFile(image_path, 'rb').read()
distorted_image_data = sess.run(distorted_image,
{input_jpeg_tensor: jpeg_data})
bottleneck_values = sess.run(bottleneck_tensor,
{resized_input_tensor: distorted_image_data})
bottleneck_values = np.squeeze(bottleneck_values)
ground_truth = np.zeros(class_count, dtype=np.float32)
ground_truth[label_index] = 1.0
bottlenecks.append(bottleneck_values)
ground_truths.append(ground_truth)
return bottlenecks, ground_truths

def should_distort_images(flip_left_right, random_crop, random_scale,


random_brightness):
return (flip_left_right or (random_crop != 0) or (random_scale != 0) or
(random_brightness != 0))

def add_input_distortions(flip_left_right, random_crop, random_scale,


random_brightness, input_width, input_height,
input_depth, input_mean, input_std):
jpeg_data = tf.placeholder(tf.string, name='DistortJPGInput')
decoded_image = tf.image.decode_jpeg(jpeg_data, channels=input_depth)
decoded_image_as_float = tf.cast(decoded_image, dtype=tf.float32)
decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
margin_scale = 1.0 + (random_crop / 100.0)
resize_scale = 1.0 + (random_scale / 100.0)
margin_scale_value = tf.constant(margin_scale)
resize_scale_value = tf.random_uniform(tensor_shape.scalar(),
minval=1.0,
maxval=resize_scale)
scale_value = tf.multiply(margin_scale_value, resize_scale_value)
precrop_width = tf.multiply(scale_value, input_width)
precrop_height = tf.multiply(scale_value, input_height)
precrop_shape = tf.stack([precrop_height, precrop_width])
precrop_shape_as_int = tf.cast(precrop_shape, dtype=tf.int32)
precropped_image = tf.image.resize_bilinear(decoded_image_4d,
precrop_shape_as_int)
precropped_image_3d = tf.squeeze(precropped_image, squeeze_dims=[0])
cropped_image = tf.random_crop(precropped_image_3d,
[input_height, input_width, input_depth])
if flip_left_right:
flipped_image = tf.image.random_flip_left_right(cropped_image)
else:
flipped_image = cropped_image
brightness_min = 1.0 - (random_brightness / 100.0)
brightness_max = 1.0 + (random_brightness / 100.0)
brightness_value = tf.random_uniform(tensor_shape.scalar(),
minval=brightness_min,
maxval=brightness_max)
brightened_image = tf.multiply(flipped_image, brightness_value)
offset_image = tf.subtract(brightened_image, input_mean)
mul_image = tf.multiply(offset_image, 1.0 / input_std)
distort_result = tf.expand_dims(mul_image, 0, name='DistortResult')
return jpeg_data, distort_result

def variable_summaries(var):
with tf.name_scope('summaries'):
mean = tf.reduce_mean(var)
tf.summary.scalar('mean', mean)
with tf.name_scope('stddev'):
stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
tf.summary.scalar('stddev', stddev)
tf.summary.scalar('max', tf.reduce_max(var))
tf.summary.scalar('min', tf.reduce_min(var))
tf.summary.histogram('histogram', var)

def add_final_training_ops(class_count, final_tensor_name, bottleneck_tensor,


bottleneck_tensor_size):
with tf.name_scope('input'):
bottleneck_input = tf.placeholder_with_default(
bottleneck_tensor,
shape=[None, bottleneck_tensor_size],
name='BottleneckInputPlaceholder')
ground_truth_input = tf.placeholder(tf.float32,
[None, class_count],
name='GroundTruthInput')
layer_name = 'final_training_ops'
with tf.name_scope(layer_name):
with tf.name_scope('weights'):
initial_value = tf.truncated_normal(
[bottleneck_tensor_size, class_count], stddev=0.001)

layer_weights = tf.Variable(initial_value, name='final_weights')

variable_summaries(layer_weights)
with tf.name_scope('biases'):
layer_biases = tf.Variable(tf.zeros([class_count]), name='final_biases')
variable_summaries(layer_biases)
with tf.name_scope('Wx_plus_b'):
logits = tf.matmul(bottleneck_input, layer_weights) + layer_biases
tf.summary.histogram('pre_activations', logits)

final_tensor = tf.nn.softmax(logits, name=final_tensor_name)


tf.summary.histogram('activations', final_tensor)

with tf.name_scope('cross_entropy'):
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=ground_truth_input, logits=logits)
with tf.name_scope('total'):
cross_entropy_mean = tf.reduce_mean(cross_entropy)
tf.summary.scalar('cross_entropy', cross_entropy_mean)

with tf.name_scope('train'):
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate)
train_step = optimizer.minimize(cross_entropy_mean)

return (train_step, cross_entropy_mean, bottleneck_input, ground_truth_input,


final_tensor)
def add_evaluation_step(result_tensor, ground_truth_tensor):
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
prediction = tf.argmax(result_tensor, 1)
correct_prediction = tf.equal(
prediction, tf.argmax(ground_truth_tensor, 1))
with tf.name_scope('accuracy'):
evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
tf.summary.scalar('accuracy', evaluation_step)
return evaluation_step, prediction

def save_graph_to_file(sess, graph, graph_file_name):


output_graph_def = graph_util.convert_variables_to_constants(
sess, graph.as_graph_def(), [FLAGS.final_tensor_name])
with gfile.FastGFile(graph_file_name, 'wb') as f:
f.write(output_graph_def.SerializeToString())
return

def prepare_file_system():
if tf.gfile.Exists(FLAGS.summaries_dir):
tf.gfile.DeleteRecursively(FLAGS.summaries_dir)
tf.gfile.MakeDirs(FLAGS.summaries_dir)
if FLAGS.intermediate_store_frequency > 0:
ensure_dir_exists(FLAGS.intermediate_output_graphs_dir)
return

def create_model_info(architecture):
architecture = architecture.lower()
if architecture == 'inception_v3':
data_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
bottleneck_tensor_name = 'pool_3/_reshape:0'
bottleneck_tensor_size = 2048
input_width = 299
input_height = 299
input_depth = 3
resized_input_tensor_name = 'Mul:0'
model_file_name = 'classify_image_graph_def.pb'
input_mean = 128
input_std = 128
elif architecture.startswith('mobilenet_'):
parts = architecture.split('_')
if len(parts) != 3 and len(parts) != 4:
tf.logging.error("Couldn't understand architecture name '%s'",
architecture)
return None
version_string = parts[1]
if (version_string != '1.0' and version_string != '0.75' and
version_string != '0.50' and version_string != '0.25'):
tf.logging.error(
""""The Mobilenet version should be '1.0', '0.75', '0.50', or '0.25',
but found '%s' for architecture '%s'""",
version_string, architecture)
return None
size_string = parts[2]
if (size_string != '224' and size_string != '192' and
size_string != '160' and size_string != '128'):
tf.logging.error(
"""The Mobilenet input size should be '224', '192', '160', or '128',
but found '%s' for architecture '%s'""",
size_string, architecture)
return None
if len(parts) == 3:
is_quantized = False
else:
if parts[3] != 'quantized':
tf.logging.error(
"Couldn't understand architecture suffix '%s' for '%s'", parts[3],
architecture)
return None
is_quantized = True
data_url = 'http://download.tensorflow.org/models/mobilenet_v1_'
data_url += version_string + '_' + size_string + '_frozen.tgz'
bottleneck_tensor_name = 'MobilenetV1/Predictions/Reshape:0'
bottleneck_tensor_size = 1001
input_width = int(size_string)
input_height = int(size_string)
input_depth = 3
resized_input_tensor_name = 'input:0'
if is_quantized:
model_base_name = 'quantized_graph.pb'
else:
model_base_name = 'frozen_graph.pb'
model_dir_name = 'mobilenet_v1_' + version_string + '_' + size_string
model_file_name = os.path.join(model_dir_name, model_base_name)
input_mean = 127.5
input_std = 127.5
else:
tf.logging.error("Couldn't understand architecture name '%s'", architecture)
raise ValueError('Unknown architecture', architecture)

return {
'data_url': data_url,
'bottleneck_tensor_name': bottleneck_tensor_name,
'bottleneck_tensor_size': bottleneck_tensor_size,
'input_width': input_width,
'input_height': input_height,
'input_depth': input_depth,
'resized_input_tensor_name': resized_input_tensor_name,
'model_file_name': model_file_name,
'input_mean': input_mean,
'input_std': input_std,
}

def add_jpeg_decoding(input_width, input_height, input_depth, input_mean,


input_std):
jpeg_data = tf.placeholder(tf.string, name='DecodeJPGInput')
decoded_image = tf.image.decode_jpeg(jpeg_data, channels=input_depth)
decoded_image_as_float = tf.cast(decoded_image, dtype=tf.float32)
decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
resize_shape = tf.stack([input_height, input_width])
resize_shape_as_int = tf.cast(resize_shape, dtype=tf.int32)
resized_image = tf.image.resize_bilinear(decoded_image_4d,
resize_shape_as_int)
offset_image = tf.subtract(resized_image, input_mean)
mul_image = tf.multiply(offset_image, 1.0 / input_std)
return jpeg_data, mul_image

def main(_):
# Needed to make sure the logging output is visible.
# See https://github.com/tensorflow/tensorflow/issues/3047
tf.logging.set_verbosity(tf.logging.INFO)

# Prepare necessary directories that can be used during training


prepare_file_system()

# Gather information about the model architecture we'll be using.


model_info = create_model_info(FLAGS.architecture)
if not model_info:
tf.logging.error('Did not recognize architecture flag')
return -1

# Set up the pre-trained graph.


maybe_download_and_extract(model_info['data_url'])
graph, bottleneck_tensor, resized_image_tensor = (
create_model_graph(model_info))

# Look at the folder structure, and create lists of all the images.
image_lists = create_image_lists(FLAGS.image_dir, FLAGS.testing_percentage,
FLAGS.validation_percentage)
class_count = len(image_lists.keys())
if class_count == 0:
tf.logging.error('No valid folders of images found at ' + FLAGS.image_dir)
return -1
if class_count == 1:
tf.logging.error('Only one valid folder of images found at ' +
FLAGS.image_dir +
' - multiple classes are needed for classification.')
return -1

# See if the command-line flags mean we're applying any distortions.


do_distort_images = should_distort_images(
FLAGS.flip_left_right, FLAGS.random_crop, FLAGS.random_scale,
FLAGS.random_brightness)

with tf.Session(graph=graph) as sess:


# Set up the image decoding sub-graph.
jpeg_data_tensor, decoded_image_tensor = add_jpeg_decoding(
model_info['input_width'], model_info['input_height'],
model_info['input_depth'], model_info['input_mean'],
model_info['input_std'])

if do_distort_images:
# We will be applying distortions, so setup the operations we'll need.
(distorted_jpeg_data_tensor,
distorted_image_tensor) = add_input_distortions(
FLAGS.flip_left_right, FLAGS.random_crop, FLAGS.random_scale,
FLAGS.random_brightness, model_info['input_width'],
model_info['input_height'], model_info['input_depth'],
model_info['input_mean'], model_info['input_std'])
else:
# We'll make sure we've calculated the 'bottleneck' image summaries and
# cached them on disk.
cache_bottlenecks(sess, image_lists, FLAGS.image_dir,
FLAGS.bottleneck_dir, jpeg_data_tensor,
decoded_image_tensor, resized_image_tensor,
bottleneck_tensor, FLAGS.architecture)

# Add the new layer that we'll be training.


(train_step, cross_entropy, bottleneck_input, ground_truth_input,
final_tensor) = add_final_training_ops(
len(image_lists.keys()), FLAGS.final_tensor_name, bottleneck_tensor,
model_info['bottleneck_tensor_size'])

# Create the operations we need to evaluate the accuracy of our new layer.
evaluation_step, prediction = add_evaluation_step(
final_tensor, ground_truth_input)

# Merge all the summaries and write them out to the summaries_dir
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
sess.graph)

validation_writer = tf.summary.FileWriter(
FLAGS.summaries_dir + '/validation')

# Set up all our weights to their initial default values.


init = tf.global_variables_initializer()
sess.run(init)

# Run the training for as many cycles as requested on the command line.
for i in range(FLAGS.how_many_training_steps):
# Get a batch of input bottleneck values, either calculated fresh every
# time with distortions applied, or from the cache stored on disk.
if do_distort_images:
(train_bottlenecks,
train_ground_truth) = get_random_distorted_bottlenecks(
sess, image_lists, FLAGS.train_batch_size, 'training',
FLAGS.image_dir, distorted_jpeg_data_tensor,
distorted_image_tensor, resized_image_tensor, bottleneck_tensor)
else:
(train_bottlenecks,
train_ground_truth, _) = get_random_cached_bottlenecks(
sess, image_lists, FLAGS.train_batch_size, 'training',
FLAGS.bottleneck_dir, FLAGS.image_dir, jpeg_data_tensor,
decoded_image_tensor, resized_image_tensor, bottleneck_tensor,
FLAGS.architecture)
# Feed the bottlenecks and ground truth into the graph, and run a training
# step. Capture training summaries for TensorBoard with the `merged` op.
train_summary, _ = sess.run(
[merged, train_step],
feed_dict={bottleneck_input: train_bottlenecks,
ground_truth_input: train_ground_truth})
train_writer.add_summary(train_summary, i)

# Every so often, print out how well the graph is training.


is_last_step = (i + 1 == FLAGS.how_many_training_steps)
if (i % FLAGS.eval_step_interval) == 0 or is_last_step:
train_accuracy, cross_entropy_value = sess.run(
[evaluation_step, cross_entropy],
feed_dict={bottleneck_input: train_bottlenecks,
ground_truth_input: train_ground_truth})
tf.logging.info('%s: Step %d: Train accuracy = %.1f%%' %
(datetime.now(), i, train_accuracy * 100))
tf.logging.info('%s: Step %d: Cross entropy = %f' %
(datetime.now(), i, cross_entropy_value))
validation_bottlenecks, validation_ground_truth, _ = (
get_random_cached_bottlenecks(
sess, image_lists, FLAGS.validation_batch_size, 'validation',
FLAGS.bottleneck_dir, FLAGS.image_dir, jpeg_data_tensor,
decoded_image_tensor, resized_image_tensor, bottleneck_tensor,
FLAGS.architecture))
# Run a validation step and capture training summaries for TensorBoard
# with the `merged` op.
validation_summary, validation_accuracy = sess.run(
[merged, evaluation_step],
feed_dict={bottleneck_input: validation_bottlenecks,
ground_truth_input: validation_ground_truth})
validation_writer.add_summary(validation_summary, i)
tf.logging.info('%s: Step %d: Validation accuracy = %.1f%% (N=%d)' %
(datetime.now(), i, validation_accuracy * 100,
len(validation_bottlenecks)))

# Store intermediate results


intermediate_frequency = FLAGS.intermediate_store_frequency

if (intermediate_frequency > 0 and (i % intermediate_frequency == 0)


and i > 0):
intermediate_file_name = (FLAGS.intermediate_output_graphs_dir +
'intermediate_' + str(i) + '.pb')
tf.logging.info('Save intermediate result to : ' +
intermediate_file_name)
save_graph_to_file(sess, graph, intermediate_file_name)

# We've completed all our training, so run a final test evaluation on


# some new images we haven't used before.
test_bottlenecks, test_ground_truth, test_filenames = (
get_random_cached_bottlenecks(
sess, image_lists, FLAGS.test_batch_size, 'testing',
FLAGS.bottleneck_dir, FLAGS.image_dir, jpeg_data_tensor,
decoded_image_tensor, resized_image_tensor, bottleneck_tensor,
FLAGS.architecture))
test_accuracy, predictions = sess.run(
[evaluation_step, prediction],
feed_dict={bottleneck_input: test_bottlenecks,
ground_truth_input: test_ground_truth})
tf.logging.info('Final test accuracy = %.1f%% (N=%d)' %
(test_accuracy * 100, len(test_bottlenecks)))

if FLAGS.print_misclassified_test_images:
tf.logging.info('=== MISCLASSIFIED TEST IMAGES ===')
for i, test_filename in enumerate(test_filenames):
if predictions[i] != test_ground_truth[i].argmax():
tf.logging.info('%70s %s' %
(test_filename,
list(image_lists.keys())[predictions[i]]))

# Write out the trained graph and labels with the weights stored as
# constants.
save_graph_to_file(sess, graph, FLAGS.output_graph)
with gfile.FastGFile(FLAGS.output_labels, 'w') as f:
f.write('\n'.join(image_lists.keys()) + '\n')

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument(
'--image_dir',
type=str,
default='',
help='Path to folders of labeled images.'
)
parser.add_argument(
'--output_graph',
type=str,
default='/tmp/output_graph.pb',
help='Where to save the trained graph.'
)
parser.add_argument(
'--intermediate_output_graphs_dir',
type=str,
default='/tmp/intermediate_graph/',
help='Where to save the intermediate graphs.'
)
parser.add_argument(
'--intermediate_store_frequency',
type=int,
default=0,
help="""\
How many steps to store intermediate graph. If "0" then will not
store.\
"""
)
parser.add_argument(
'--output_labels',
type=str,
default='/tmp/output_labels.txt',
help='Where to save the trained graph\'s labels.'
)
parser.add_argument(
'--summaries_dir',
type=str,
default='/tmp/retrain_logs',
help='Where to save summary logs for TensorBoard.'
)
parser.add_argument(
'--how_many_training_steps',
type=int,
default=4000,
help='How many training steps to run before ending.'
)
parser.add_argument(
'--learning_rate',
type=float,
default=0.01,
help='How large a learning rate to use when training.'
)
parser.add_argument(
'--testing_percentage',
type=int,
default=10,
help='What percentage of images to use as a test set.'
)
parser.add_argument(
'--validation_percentage',
type=int,
default=10,
help='What percentage of images to use as a validation set.'
)
parser.add_argument(
'--eval_step_interval',
type=int,
default=10,
help='How often to evaluate the training results.'
)
parser.add_argument(
'--train_batch_size',
type=int,
default=100,
help='How many images to train on at a time.'
)
parser.add_argument(
'--test_batch_size',
type=int,
default=-1,
help="""\
How many images to test on. This test set is only used once, to evaluate
the final accuracy of the model after training completes.
A value of -1 causes the entire test set to be used, which leads to more
stable results across runs.\
"""
)
parser.add_argument(
'--validation_batch_size',
type=int,
default=100,
help="""\
How many images to use in an evaluation batch. This validation set is
used much more often than the test set, and is an early indicator of how
accurate the model is during training.
A value of -1 causes the entire validation set to be used, which leads to
more stable results across training iterations, but may be slower on large
training sets.\
"""
)
parser.add_argument(
'--print_misclassified_test_images',
default=False,
help="""\
Whether to print out a list of all misclassified test images.\
""",
action='store_true'
)
parser.add_argument(
'--model_dir',
type=str,
default='/tmp/imagenet',
help="""\
Path to classify_image_graph_def.pb,
imagenet_synset_to_human_label_map.txt, and
imagenet_2012_challenge_label_map_proto.pbtxt.\
"""
)
parser.add_argument(
'--bottleneck_dir',
type=str,
default='/tmp/bottleneck',
help='Path to cache bottleneck layer values as files.'
)
parser.add_argument(
'--final_tensor_name',
type=str,
default='final_result',
help="""\
The name of the output classification layer in the retrained graph.\
"""
)
parser.add_argument(
'--flip_left_right',
default=False,
help="""\
Whether to randomly flip half of the training images horizontally.\
""",
action='store_true'
)
parser.add_argument(
'--random_crop',
type=int,
default=0,
help="""\
A percentage determining how much of a margin to randomly crop off the
training images.\
"""
)
parser.add_argument(
'--random_scale',
type=int,
default=0,
help="""\
A percentage determining how much to randomly scale up the size of the
training images by.\
"""
)
parser.add_argument(
'--random_brightness',
type=int,
default=0,
help="""\
A percentage determining how much to randomly multiply the training image
input pixels up or down by.\
"""
)
parser.add_argument(
'--architecture',
type=str,
default='inception_v3',
help="""\
Which model architecture to use. 'inception_v3' is the most accurate, but
also the slowest. For faster or smaller models, choose a MobileNet with the
form 'mobilenet_<parameter size>_<input_size>[_quantized]'. For example,
'mobilenet_1.0_224' will pick a model that is 17 MB in size and takes 224
pixel input images, while 'mobilenet_0.25_128_quantized' will choose a much
less accurate, but smaller and faster network that's 920 KB on disk and
takes 128x128 images. See
https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html
for more information on Mobilenet.\
""")
FLAGS, unparsed = parser.parse_known_args()
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
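The retraining script above is driven entirely by these command-line flags. As a rough illustration only (the folder names here are assumptions, not the project's exact paths), retraining on a directory of labelled food images and saving the model next to its label file might look like:

python retrain.py --image_dir=tf_files/data --output_graph=tf_files/output_graph.pb --output_labels=tf_files/output_labels.txt --bottleneck_dir=tf_files/bottleneck --how_many_training_steps=4000 --architecture=inception_v3

The resulting output_graph.pb and output_labels.txt are the two files that the label_image.py script below loads from the tf_files folder.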

label_image.py
import sys

import tensorflow as tf

# The image to classify is passed on the command line.
image_path = sys.argv[1]

# Read in the image data.
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# Load the label file and strip off trailing newlines.
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("tf_files/output_labels.txt")]

# Unpersist the retrained graph from file.
with tf.gfile.FastGFile("tf_files/output_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    # Feed the image data as input to the graph and get the prediction.
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})

    # Sort to show the labels of the prediction in order of confidence.
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))
        if human_string == "tomato":
            print("Protein : ")
            print("Calorie: ")
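Since the script reads the image path from sys.argv[1], it is run from the command line with the image as its only argument, for example python label_image.py 5.jpg (the '5.jpg' test image is the one used later in the result analysis). The two paths hard-coded above assume that retrain.py has already written output_graph.pb and output_labels.txt into the tf_files folder.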

manage.py
#!/usr/bin/env python
import os
import sys

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ImageClassifier.settings")
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)
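manage.py is Django's standard command-line entry point for the ImageClassifier project; the web interface is typically started with python manage.py runserver (standard Django usage, not a project-specific command).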

CHAPTER 6
TESTING

6.1 Introduction

Testing is the process of evaluating a system or its component(s) with the intent of finding out whether it satisfies the specified requirements. In simple words, testing is executing a system
in order to identify any gaps, errors, or missing requirements relative to the actual requirements.

6.1.1 Types of Testing

6.1.1.1 Manual Testing

Manual testing is the process of testing software by hand to learn more about it, to
find what is and isn't working. This usually includes verifying all the features specified in the
requirements documents, but it often also includes the testers trying the software with the
perspective of their end users in mind. Manual test plans vary from fully scripted test cases,
giving testers detailed steps and expected results, through to high-level guides that steer
exploratory testing sessions. There are many sophisticated tools on the market to help with
manual testing, such as TestPad.

6.1.1.2 Automation Testing

Automation testing is the process of testing the software using an automation tool to
find the defects. In this process, testers execute the test scripts and generate the test results
automatically by using automation tools. Some of the famous automation testing tools for
functional testing are QTP/UFT and Selenium.
6.1.2 Testing Methods
6.1.2.1 Static Testing
It is also known as Verification in Software Testing. Verification is a static method
of checking documents and files. Verification is the process of ensuring that we are
building the product right, i.e., verifying the requirements we have and checking
whether we are developing the product accordingly. The activities involved here are
inspections, reviews, and walkthroughs.

6.1.2.2 Dynamic Testing

It is also known as Validation in Software Testing. Validation is a dynamic process
of testing the real product. Validation is the process of checking whether we are building
the right product, i.e., validating that the product we have developed is the right one.
The activity involved here is testing the software application.

6.1.3 Testing Approaches

6.1.3.1 White Box Testing

It is also called Glass Box, Clear Box, or Structural Testing. White Box Testing
is based on the application's internal code structure. In white-box testing, an internal
perspective of the system, as well as programming skills, is used to design test cases. This
testing is usually done at the unit level.

6.1.3.2 Black Box Testing


It is also called Behavioural, Specification-Based, or Input-Output Testing. Black Box
Testing is a software testing method in which testers evaluate the functionality of the
software under test without looking at the internal code structure.

6.1.3.3 Grey Box Testing


Grey Box Testing is a combination of both White Box and Black Box Testing. A
tester who works on this type of testing needs access to the design documents, which
helps in creating better test cases.
6.1.4 Testing Levels

6.1.4.1 Unit Testing

Unit Testing is done to check whether the individual modules of the source code are
working properly, i.e., each and every unit of the application is tested separately by the
developer in the developer's environment. It is also known as Module Testing or Component
Testing. A small illustrative example is sketched below.
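As an illustration only (the helper function, file contents, and test name below are hypothetical and are not part of the project's actual test suite), a unit test for the label-loading step used by label_image.py could look like this:

import os
import tempfile
import unittest


def load_labels(path):
    # Mirrors the label-loading step in label_image.py:
    # one class name per line, trailing newlines stripped.
    with open(path) as f:
        return [line.rstrip() for line in f]


class LoadLabelsTest(unittest.TestCase):
    def test_labels_are_stripped_and_in_order(self):
        # Write a small, temporary label file to test against.
        with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
            f.write('tomato\napple\n')
            path = f.name
        try:
            self.assertEqual(load_labels(path), ['tomato', 'apple'])
        finally:
            os.remove(path)


if __name__ == '__main__':
    unittest.main()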

6.1.4.2 Integration Testing

Integration Testing is the process of testing the connectivity and data transfer between
a couple of unit-tested modules. It is also known as I&T Testing or String Testing. It is
subdivided into the Top-Down Approach, the Bottom-Up Approach, and the Sandwich
Approach (a combination of Top-Down and Bottom-Up).

6.1.4.3 System Testing

System Testing is a form of black box testing in which the fully integrated application is
tested; it is also called end-to-end scenario testing. It ensures that the software works on
all intended target systems, verifies every input in the application against the desired
outputs, and tests the users' experience with the application.

6.1.4.4 Acceptance Testing

Acceptance Testing is performed to obtain customer sign-off so that the software can be
delivered and payments received. The types of Acceptance Testing are Alpha, Beta, and
Gamma Testing.
6.2 Test Cases

Test Case Id: 01
Test Case Name: Start the Application
Description: Host the application and test whether it starts, making sure the required software is available.
Test Step: The application doesn't start.
Expected: We cannot run the application.
Actual: The application hosts successfully.
Test Case Status: High | Test Priority: High

Test Case Id: 02
Test Case Name: Home Page
Description: Check the deployment environment for properly loading the application.
Test Step: The page doesn't load.
Expected: We cannot access the application.
Actual: The application is running successfully.
Test Case Status: High | Test Priority: High

Test Case Id: 03
Test Case Name: Freestyle Mode
Description: Verify the working of the application in freestyle mode.
Test Step: The application doesn't respond.
Expected: We cannot use the Freestyle mode.
Actual: The application displays the Freestyle page.
Test Case Status: High | Test Priority: High

Test Case Id: 04
Test Case Name: Data Input
Description: Verify whether the application takes input and updates the database.
Test Step: The application fails to take the input or store it in the database.
Expected: We cannot proceed further.
Actual: The application updates the input to the database.
Test Case Status: High | Test Priority: High
CHAPTER 7

RESULT ANALYSIS

Screenshots:

To run this project, double-click on the 'run.bat' file to get the screen below.

In the above screen, click on the 'Upload Social Image Sharing Dataset' button to upload the dataset.

In the above screen, the social network dataset called 'dataset.txt' is being uploaded; after the dataset is uploaded, the screen below appears.

In the above screen we can see that the dataset has loaded and contains a total of 100 users' records. Now click on the 'Identify 3 Key Aspects From Dataset' button to identify the three key aspects, namely Upload History, Social Influence and Owner; after the three aspects are identified, a matrix is formed and its values are marked with 1 or 0: if a user uploaded an image, the matrix holds 1 in that image's column, otherwise it holds 0. The same applies to social influence and ratings. See the matrix below for the above three aspects.
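As a minimal sketch of how such 0/1 aspect matrices could be assembled (the user, image and rating values below are illustrative placeholders, not the project's actual data structures or file format):

import numpy as np

# Hypothetical inputs parsed from dataset.txt: 100 users, a fixed set of
# images, and the three aspect relations described above.
num_users, num_images = 100, 50
uploads = {(0, 3), (0, 7), (1, 3)}       # (user, image) pairs the user uploaded
influences = {(0, 1), (1, 0)}            # (user, followed user) pairs
ratings = {(0, 7): 4, (1, 3): 5}         # (user, image) -> rating greater than 1

# Upload history aspect: 1 if the user uploaded the image, otherwise 0.
upload_matrix = np.zeros((num_users, num_images), dtype=np.int32)
for u, img in uploads:
    upload_matrix[u, img] = 1

# Social influence aspect: 1 if user u is influenced by (follows) user v.
social_matrix = np.zeros((num_users, num_users), dtype=np.int32)
for u, v in influences:
    social_matrix[u, v] = 1

# Owner/rating aspect: values greater than 1 are explicit ratings.
rating_matrix = np.zeros((num_users, num_images), dtype=np.int32)
for (u, img), r in ratings.items():
    rating_matrix[u, img] = r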
In the above screen we can see that matrix rows and columns are generated for each user, and the matrix values of 0 and 1 are updated based on the three identified key aspects; all values greater than 1 are the ratings of that image. Now click on the 'Run CNN Embedding Images & Vector' button to build the CNN model with the images and the three-aspect matrix. See the screen of CNN model generation below.

In the above screen the CNN model has been generated, and in the console screen below we can see all the CNN details.
In the above console we can see that a total of 4 layers are created, with the first layer's feature-map size being 126 x 126 and the second layer's 63 x 63; the information for the other layers is also shown.
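Those sizes are consistent with a small convolutional stack over 128 x 128 input images; the snippet below is only an illustrative reconstruction under that assumption, not the project's exact model definition:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# A 3x3 convolution over a 128x128 input yields 126x126 feature maps, and
# 2x2 max pooling then halves them to 63x63, matching the console output.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),   # image embedding vector
])
model.summary()                     # prints the 4-layer structure and sizes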
Now click on the 'Run HASC Algorithm & Image Recommendation' button to upload a test image and then let the CNN and HASC algorithms predict the best positively matching images from the trained model as recommendation images.
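Conceptually, this step ranks the training images by how close their CNN embeddings are to the embedding of the uploaded test image; the sketch below shows only that content-similarity ranking (the function and variable names are illustrative, and the full HASC model additionally weighs the three social aspects through its hierarchical attention networks):

import numpy as np

def recommend(test_embedding, train_embeddings, image_names, top_n=5):
    # Cosine similarity between the test image and every training image.
    test_norm = test_embedding / np.linalg.norm(test_embedding)
    train_norms = train_embeddings / np.linalg.norm(train_embeddings,
                                                    axis=1, keepdims=True)
    scores = train_norms @ test_norm
    # The highest-scoring images are returned as recommendations.
    best = np.argsort(scores)[::-1][:top_n]
    return [(image_names[i], float(scores[i])) for i in best]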

In the above screen a new test image called '5.jpg' is being uploaded, and below are the recommendations based on image content similarity and the relationship data that we calculated using the three key aspects.
The above are the recommendation images for the uploaded test image, based on image content similarity and relationship similarity; other images can be uploaded and tested in the same way. All of the above recommendation images come from the 'img' folder. Now click on 'Rating For Each User Graph' to see the rating given by users for each image.

In the above graph, the x-axis represents the image name and the y-axis represents the rating for that image.
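Such a per-image rating graph can be produced with a simple bar chart; the values below are placeholder data for illustration only, not the ratings from the actual dataset:

import matplotlib.pyplot as plt

# Placeholder ratings keyed by image name, mirroring the axes described above.
image_names = ['1.jpg', '2.jpg', '3.jpg', '4.jpg', '5.jpg']
ratings = [3, 5, 2, 4, 4]

plt.bar(image_names, ratings)
plt.xlabel('Image name')
plt.ylabel('Rating')
plt.title('Rating For Each User Graph')
plt.show()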
CHAPTER 8

CONCLUSION

In this paper, we have proposed a hierarchical attentive social contextual model, HASC, for
social contextual image recommendation. Specifically, in addition to user interest modeling, we
have identified three social contextual aspects that influence a user's preference for an image
from heterogeneous data: the upload history aspect, the social influence aspect, and the owner
admiration aspect. We designed a hierarchical attention network that naturally mirrors the
hierarchical relationship of users' interest given the three identified aspects. In the meantime, by
feeding in the data embeddings from rich heterogeneous data sources, the hierarchical attention
networks can learn to attend differently to more and less important content. Extensive
experiments on real-world datasets clearly demonstrated that our proposed HASC model
consistently outperforms various state-of-the-art baselines for image recommendation.

REFERENCES
• [1] Flickr Statistics. https://expandedramblings.com/index.php/flickr-stats/, 2017. [Online; accessed 20-Jan-2018].
• [2] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. TKDE, 17(6):734–749, 2005.
• [3] A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social networks. In KDD, pages 7–15. ACM, 2008.
• [4] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. In ICLR, 2015.
• [5] J. Chen, H. Zhang, X. He, L. Nie, W. Liu, and T.-S. Chua. Attentive collaborative filtering: Multimedia recommendation with item- and component-level attention. In SIGIR, pages 335–344. ACM, 2017.
• [6] T. Chen, X. He, and M.-Y. Kan. Context-aware image tweet modelling and recommendation. In MM, pages 1018–1027. ACM, 2016.
• [7] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y. Zheng. NUS-WIDE: A real-world web image database from National University of Singapore. In MM, page 48. ACM, 2009.
