Deploying Deep Learning


Welcome to our training guide for inference and the deep vision runtime library for NVIDIA DIGITS and Jetson TX1/TX2.

This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded platform, improving
performance and power efficiency using graph optimizations, kernel fusion, and half-precision FP16 on the Jetson.

Vision primitives, such as imageNet for image recognition, detectNet for object localization, and segNet for
segmentation, inherit from the shared tensorNet object. Examples are provided for streaming from live camera feed and
processing images from disk. See the Deep Vision API Reference Specification for accompanying documentation.

> Read our recent Parallel ForAll post, NVIDIA Jetson TX2 Delivers Twice the Intelligence to the Edge.
> See Image Segmentation models and training guide with aerial drone dataset.
> Try Object Detection training guide using DIGITS & MS-COCO training dataset.
> Use Image Recognition training guide using DIGITS & ImageNet ILSVRC12 dataset.

Table of Contents

DIGITS Workflow
System Setup
Building from Source on Jetson
Classifying Images with ImageNet
    Using the Console Program on Jetson
    Running the Live Camera Recognition Demo
    Re-training the Network with DIGITS
    Downloading Image Recognition Dataset
    Customizing the Object Classes
    Importing Classification Dataset into DIGITS
    Creating Image Classification Model with DIGITS
    Testing Classification Model in DIGITS
    Downloading Model Snapshot to Jetson
    Loading Custom Models on Jetson
Locating Object Coordinates using DetectNet
    Detection Data Formatting in DIGITS
    Downloading the Detection Dataset
    Importing the Detection Dataset into DIGITS
    Creating DetectNet Model with DIGITS
    Testing DetectNet Model Inference in DIGITS
    Downloading the Model Snapshot to Jetson
    DetectNet Patches for TensorRT
    Processing Images from the Command Line on Jetson
    Multi-class Object Detection Models
    Running the Live Camera Detection Demo on Jetson
Image Segmentation with SegNet
    Downloading Aerial Drone Dataset
    Importing the Aerial Dataset into DIGITS
    Generating Pretrained FCN-Alexnet
    Training FCN-Alexnet with DIGITS
    Testing Inference Model in DIGITS
    FCN-Alexnet Patches for TensorRT
    Running Segmentation Models on Jetson

Recommended System Requirements

Training GPU: Maxwell or Pascal-based GPU or AWS P2 instance.
              Ubuntu 14.04 x86_64 or Ubuntu 16.04 x86_64 (see DIGITS AWS AMI image).

Deployment: Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64).
            Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64).

note: this branch is verified against the following BSP versions for Jetson TX1/TX2:
> Jetson TX2 - JetPack 3.0 / L4T R27.1 aarch64 (Ubuntu 16.04 LTS)
> Jetson TX1 - JetPack 2.3 / L4T R24.2 aarch64 (Ubuntu 16.04 LTS)
> Jetson TX1 - JetPack 2.3.1 / L4T R24.2.1 aarch64 (Ubuntu 16.04 LTS)

Note that the TensorRT samples from the repo are intended for deployment on the embedded Jetson TX1/TX2 module; however,
when cuDNN and TensorRT have been installed on the host side, the TensorRT samples in the repo can also be compiled for PC.

DIGITS Workflow
New to deep neural networks (DNNs) and machine learning? Take this introductory primer on training and inference.


Using NVIDIA deep learning tools, it's easy to get started training DNNs and deploying them into the field with high
performance. Discrete GPUs are typically used in a server, PC, or laptop for training with DIGITS, while the Jetson's integrated
GPU is used in embedded form factors.

NVIDIA DIGITS is used to interactively train network models on annotated datasets in the cloud or PC, while TensorRT and
Jetson are used to deploy runtime inference in the field. TensorRT uses graph optimizations and half-precision FP16 support
to more than double DNN inferencing performance. Together, DIGITS and TensorRT form an effective workflow for developing and
deploying deep neural networks capable of implementing advanced AI and perception.

System Setup
During this tutorial, we will use a host PC (or AWS) for training DNNs, alongside a Jetson for inference. The host PC will also
serve to flash the Jetson with the latest JetPack. First we'll set up and configure the host PC with the required OS and tools.

Installing Ubuntu on the Host


Download and install Ubuntu 16.04 x86_64 onto the host PC from one of the following locations:

http://releases.ubuntu.com/16.04/ubuntu-16.04.2-desktop-amd64.iso
http://releases.ubuntu.com/16.04/ubuntu-16.04.2-desktop-amd64.iso.torrent

Ubuntu 14.04 x86_64 may also be acceptable with minor modifications later while installing some packages with apt-get.

Running JetPack on the Host


Download the latest JetPack to the host PC. In addition to flashing the Jetson with the latest Board Support Package (BSP),
JetPack automatically installs tools for the host like CUDA Toolkit. See the JetPack Release Notes for the full list of features
and installed packages.


After downloading JetPack from the link above, run it from the host PC with the following commands:

$ cd <directory where you downloaded JetPack>


$ chmod +x JetPack-L4T-3.0-linux-x64.run
$ ./JetPack-L4T-3.0-linux-x64.run

The JetPack GUI will start. Follow the step-by-step Install Guide to complete the setup. Near the beginning, JetPack will
confirm which generation Jetson you are developing for.

Select Jetson TX1 if you are using TX1, or Jetson TX2 if you're using TX2, and press Next to continue.

The next screen will list the packages available to be installed. The packages installed to the host are listed at the top under
the Host - Ubuntu dropdown, while those intended for the Jetson are shown near the bottom. You can select or deselect an
individual package for installation by clicking its Action column.

Since CUDA will be used on the host for training DNNs, it's recommended to select the Full install by clicking the radio button
in the top right. Then press Next to begin setup. JetPack will download and then install the sequence of packages. Note that
all the .deb packages are stored under the jetpack_downloads subdirectory in case you need them later.


After the downloads have finished installing, JetPack will enter the post-install phase where the Jetson is flashed with the
L4T BSP. You'll need to connect your Jetson to your host PC via the micro-USB port and cable included in the devkit. Then
enter your Jetson into recovery mode by holding down the Recovery button while pressing and releasing Reset. If you type
lsusb from the host PC after you've connected the micro-USB cable and entered the Jetson into recovery mode, you should
see the NVIDIA device come up under the list of USB devices. JetPack uses the micro-USB connection from the host to flash
the L4T BSP to the Jetson.

After flashing, the Jetson will reboot and if attached to an HDMI display, will boot up to the Ubuntu desktop. After this,
JetPack connects to the Jetson from the host via SSH to install additional packages to the Jetson, like the ARM aarch64
builds of CUDA Toolkit, cuDNN, and TensorRT. For JetPack to be able to reach the Jetson via SSH, the host PC should be
networked to the Jetson via Ethernet. This can be accomplished by running an Ethernet cable directly from the host to the
Jetson, or by connecting both devices to a router or switch; the JetPack GUI will ask you to confirm which networking
scenario is being used. See the JetPack Install Guide for the full directions for installing JetPack and flashing the Jetson.

Installing NVIDIA Driver on the Host


At this point, JetPack will have flashed the Jetson with the latest L4T BSP, and installed CUDA toolkits to both the Jetson and
host PC. However, the NVIDIA PCIe driver will still need to be installed on the host PC to enable GPU-accelerated training.
Run the following commands from the host PC to install the NVIDIA driver from the Ubuntu repo:

$ sudo apt-get install nvidia-375


$ sudo reboot

After rebooting, the NVIDIA driver should be listed under lsmod :

$ lsmod | grep nvidia


nvidia_uvm 647168 0
nvidia_drm 49152 1
nvidia_modeset 790528 4 nvidia_drm
nvidia 12144640 60 nvidia_modeset,nvidia_uvm
drm_kms_helper 167936 1 nvidia_drm
drm 368640 4 nvidia_drm,drm_kms_helper

To verify the CUDA toolkit and NVIDIA driver are working, run some tests that come with the CUDA samples:

$ cd /usr/local/cuda/samples
$ sudo make
$ cd bin/x86_64/linux/release/
$ ./deviceQuery
$ ./bandwidthTest --memory=pinned

Installing cuDNN on the Host


The next step is to install the NVIDIA cuDNN libraries on the host PC. Download the libcudnn6 and libcudnn6-dev packages from the
NVIDIA site:

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1
https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.

Then install the packages with the following commands:

$ sudo dpkg -i libcudnn6_6.0.20-1+cuda8.0_amd64.deb


$ sudo dpkg -i libcudnn6-dev_6.0.20-1+cuda8.0_amd64.deb

Installing NVcaffe on the Host


NVcaffe is the NVIDIA branch of Caffe with optimizations for GPU. NVcaffe uses cuDNN and is used by DIGITS for training
DNNs. To install it, clone the NVcaffe repo from GitHub and compile from source. Use the caffe-0.15 branch as shown below.

note: for this tutorial, NVcaffe is only required on the host (for training). During the inferencing phase, TensorRT is used on
the Jetson and doesn't require Caffe.

First, some prerequisite packages for Caffe are installed, including the Python bindings required by DIGITS:

$ sudo apt-get install --no-install-recommends build-essential cmake git gfortran libatlas-base-dev libboost-filesystem-de
$ sudo pip install --upgrade pip
$ git clone -b caffe-0.15 http://github.com/NVIDIA/caffe
$ cd caffe
$ sudo pip install -r python/requirements.txt
$ mkdir build
$ cd build
$ cmake ../ -DCUDA_USE_STATIC_CUDA_RUNTIME=OFF
$ make --jobs=4
$ make pycaffe

Caffe should now be configured and built. Next, edit your user's ~/.bashrc to include the path to your Caffe tree (replace the
paths below with your own):

export CAFFE_ROOT=/home/dusty/workspace/caffe
export PYTHONPATH=/home/dusty/workspace/caffe/python:$PYTHONPATH

Close and re-open the terminal for the changes to take effect.

Installing DIGITS on the Host


NVIDIA DIGITS is a Python-based web service which interactively trains DNNs and manages datasets. As highlighted in the
DIGITS workflow, it runs on the host PC to create the network model during the training phase. The trained model is then
copied from the host PC to the Jetson for the runtime inference phase with TensorRT.

To install DIGITS, first install the prerequisite packages, then clone the DIGITS repo from GitHub:

$ sudo apt-get install --no-install-recommends graphviz python-dev python-flask python-flaskext.wtf python-gevent python-h
$ git clone http://github.com/nvidia/DIGITS
$ cd DIGITS
$ sudo pip install -r requirements.txt

Starting the DIGITS Server

Assuming that your terminal is still in the DIGITS directory, the webserver can be started by running the digits-devserver
Python script:

$ ./digits-devserver
  ___ ___ ___ ___ _____ ___
 |   \_ _/ __|_ _|_   _/ __|
 | |) | | (_ || |  | | \__ \
 |___/___\___|___| |_| |___/ 5.1-dev

2017-04-17 13:19:02 [INFO ] Loaded 0 jobs.

DIGITS will store user jobs (training datasets and model snapshots) under the digits/jobs directory.

To access the interactive DIGITS session, open your web browser and navigate to 0.0.0.0:5000 .

note: by default the DIGITS server will start on port 5000, but the port can be specified by passing the --port
argument to the digits-devserver script.


Building from Source on Jetson


Provided along with this repo are TensorRT-enabled deep learning primitives for running Googlenet/Alexnet on live camera
feed for image recognition, pedestrian detection networks with localization capabilities (i.e. that provide bounding boxes), and
segmentation. This repo is intended to be built & run on the Jetson and to accept the network models from the host PC
trained on the DIGITS server.

The latest source can be obtained from GitHub and compiled onboard Jetson TX1/TX2.

note: this branch is verified against the following BSP versions for Jetson TX1/TX2:
> Jetson TX2 - JetPack 3.0 / L4T R27.1 aarch64 (Ubuntu 16.04 LTS)
> Jetson TX1 - JetPack 2.3 / L4T R24.2 aarch64 (Ubuntu 16.04 LTS)
> Jetson TX1 - JetPack 2.3.1 / L4T R24.2.1 aarch64 (Ubuntu 16.04 LTS)

Cloning the Repo

To obtain the repository, navigate to a folder of your choosing on the Jetson. First, make sure git and cmake are installed
locally:

$ sudo apt-get install git cmake

Then clone the jetson-inference repo:

$ git clone http://github.com/dusty-nv/jetson-inference

Configuring with CMake

When cmake is run, a special pre-installation script (CMakePreBuild.sh) is run and will automatically install any
dependencies.

$ cd jetson-inference
$ mkdir build
$ cd build
$ cmake ../

note: the cmake command will launch the CMakePrebuild.sh script which asks for sudo while making sure prerequisite
packages have been installed on the Jetson. The script also downloads the network model snapshots from web
services.

Compiling the Project

Make sure you are still in the jetson-inference/build directory, created in the previous step.

$ cd jetson-inference/build # omit if pwd is already /build from above


$ make

Depending on architecture, the package will be built to either armhf or aarch64, with the following directory structure:

|-build
   \aarch64 (64-bit)
      \bin       where the sample binaries are built to
      \include   where the headers reside
      \lib       where the libraries are built to
   \armhf (32-bit)
      \bin       where the sample binaries are built to
      \include   where the headers reside
      \lib       where the libraries are built to


On the Jetson, binaries reside in aarch64/bin, headers in aarch64/include, and libraries in aarch64/lib.

Digging Into the Code

For reference, see the available vision primitives, including imageNet for image recognition and detectNet for object
localization.

/**
 * Image recognition with GoogleNet/Alexnet or custom models, using TensorRT.
 */
class imageNet : public tensorNet
{
public:
    /**
     * Network choice enumeration.
     */
    enum NetworkType
    {
        ALEXNET,
        GOOGLENET
    };

    /**
     * Load a new network instance
     */
    static imageNet* Create( NetworkType networkType=GOOGLENET );

    /**
     * Load a new network instance
     * @param prototxt_path File path to the deployable network prototxt
     * @param model_path File path to the caffemodel
     * @param mean_binary File path to the mean value binary proto
     * @param class_labels File path to list of class name labels
     * @param input Name of the input layer blob.
     * @param output Name of the output layer blob.
     */
    static imageNet* Create( const char* prototxt_path, const char* model_path, const char* mean_binary,
                             const char* class_labels, const char* input="data", const char* output="prob" );

    /**
     * Determine the maximum likelihood image class.
     * @param rgba float4 input image in CUDA device memory.
     * @param width width of the input image in pixels.
     * @param height height of the input image in pixels.
     * @param confidence optional pointer to float filled with confidence value.
     * @returns Index of the maximum class, or -1 on error.
     */
    int Classify( float* rgba, uint32_t width, uint32_t height, float* confidence=NULL );
};

Both inherit from the shared tensorNet object which contains common TensorRT code.
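
For a sense of how these primitives are used in an application, below is a minimal sketch that loads the default GoogleNet model and classifies a single image. It follows the Create()/Classify() pattern from the header above; the loadImageRGBA() helper and the GetClassDesc() accessor come from the repo's util/ code and the full imageNet header, so treat the exact signatures as assumptions and verify them against your checkout.

```cpp
// Minimal sketch of using imageNet directly (see imagenet-console for the full version).
// Assumes the loadImageRGBA() helper from util/ and imageNet::GetClassDesc() -- verify
// these signatures against the headers in your build before relying on them.
#include "imageNet.h"
#include "loadImage.h"

#include <cstdio>

int main( int argc, char** argv )
{
    // load the default GoogleNet model (the TensorRT engine is built and cached on first run)
    imageNet* net = imageNet::Create();

    if( !net )
        return 1;

    // load an image from disk into shared CPU/GPU memory as float4 RGBA
    float* imgCPU  = NULL;
    float* imgCUDA = NULL;
    int width = 0, height = 0;

    if( !loadImageRGBA("orange_0.jpg", (float4**)&imgCPU, (float4**)&imgCUDA, &width, &height) )
        return 1;

    // run inference and retrieve the most likely class index
    float confidence = 0.0f;
    const int img_class = net->Classify(imgCUDA, width, height, &confidence);

    if( img_class >= 0 )
        printf("classified as class %i (%s) with %.2f%% confidence\n",
               img_class, net->GetClassDesc(img_class), confidence * 100.0f);

    delete net;
    return 0;
}
```

This is the same Create-once, Classify-per-image pattern that imagenet-console and imagenet-camera use internally.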

Classifying Images with ImageNet


There are multiple types of deep learning networks available, including recognition, detection/localization, and segmentation.
The first deep learning capability we're highlighting in this tutorial is image recognition using an 'imageNet' that's been trained
to identify similar objects.

The imageNet object accepts an input image and outputs the probability for each class. Having been trained on the ImageNet
database of 1000 objects, the standard AlexNet and GoogleNet networks are downloaded during the build step above. As
examples of using imageNet we provide a command-line interface called imagenet-console and a live camera program
called imagenet-camera .

Using the Console Program on Jetson


First, try using the imagenet-console program to test imageNet recognition on some example images. It loads an image,
uses TensorRT and the imageNet class to perform the inference, then overlays the classification and saves the output
image.

After building, make sure your terminal is located in the aarch64/bin directory:

$ cd jetson-inference/build/aarch64/bin

Then, classify an example image with the imagenet-console program. imagenet-console accepts 2 command-line
arguments: the path to the input image and path to the output image (with the class overlay printed).

$ ./imagenet-console orange_0.jpg output_0.jpg

$ ./imagenet-console granny_smith_1.jpg output_1.jpg


Next, we will use imageNet to classify a live video feed from the Jetson onboard camera.

Running the Live Camera Recognition Demo


Similar to the last example, the realtime image recognition demo is located in /aarch64/bin and is called imagenet-camera . It
runs on live camera stream and depending on user arguments, loads googlenet or alexnet with TensorRT.

$ ./imagenet-camera googlenet # to run using googlenet


$ ./imagenet-camera alexnet # to run using alexnet

The frames per second (FPS), classified object name from the video, and confidence of the classified object are printed to the
openGL window title bar. By default the application can recognize up to 1000 different types of objects, since Googlenet and
Alexnet are trained on the ILSVRC12 ImageNet database which contains 1000 classes of objects. The mapping of names for
the 1000 types of objects can be found in the repo under data/networks/ilsvrc12_synset_words.txt .

note: by default, the Jetson's onboard CSI camera will be used as the video source. If you wish to use a USB webcam
instead, change the DEFAULT_CAMERA define at the top of imagenet-camera.cpp to reflect the /dev/video V4L2 device
of your USB camera. The model it's been tested with is the Logitech C920.
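
For reference, the change amounts to editing a single define near the top of imagenet-camera.cpp; the values below are illustrative, so check your copy of the source:

```cpp
// imagenet-camera.cpp (near the top) -- illustrative values, check your source.
// A negative value selects the onboard CSI camera; a value >= 0 selects the
// corresponding /dev/video V4L2 device (e.g. 0 for /dev/video0, a USB webcam).
#define DEFAULT_CAMERA -1    // default: onboard CSI camera
//#define DEFAULT_CAMERA 0   // use this instead to read from /dev/video0
```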


Re-training the Network with DIGITS


The existing GoogleNet and AlexNet models that are downloaded by the repo are pre-trained on 1000 classes of objects from
the ImageNet ILSVRC12 benchmark.

To recognize a new object class, you can use DIGITS to re-train the network on new data. You can also organize the existing
classes differently, including grouping multiple subclasses into one. For example, in this tutorial we'll take 230 of the 1000
classes, group those into 12 classes, and retrain the network.

Let's start by downloading the ILSVRC12 images to work with, or you can substitute your own dataset in an Image Folder.

Downloading Image Recognition Dataset


An image recognition dataset consists of a large number of images sorted by their classification type (typically by directory).
The ILSVRC12 dataset was used in the training of the default GoogleNet and AlexNet models. It's roughly 100GB in size and
includes 1 million images over 1000 different classes. The dataset is downloaded to the DIGITS server using the
imagenet-download.py image crawler.

To download the dataset, first make sure you have enough disk space on your DIGITS server (120GB recommended), then
run the following commands from a directory on that machine where you want the dataset stored:

$ wget --no-check-certificate https://nvidia.box.com/shared/static/gzr5iewf5aouhc5exhp3higw6lzhcysj.gz -O ilsvrc12_urls.tar.gz


$ tar -xzvf ilsvrc12_urls.tar.gz
$ wget https://rawgit.com/dusty-nv/jetson-inference/master/tools/imagenet-download.py
$ python imagenet-download.py ilsvrc12_urls.txt . --jobs 100 --retry 3 --sleep 0

In the commands above, the list of image URLs and the download scripts are retrieved before launching the crawler.

note: be considerate when running the image crawler from a corporate network, as IT may flag the activity. It will probably take
overnight on a decent connection to download the 1000 ILSVRC12 classes (100GB).

The crawler will download images to subdirectories that correspond to their classification. Each image class is stored in its own
directory, with 1000 directories in total (one for each class in ILSVRC12). The folders are organized with a naming scheme
similar to:

n01440764/
n01443537/
n01484850/
n01491361/
n01494475/
...

These 8-digit IDs prefixed with n are referred to as the synset ID of the class. The name string of the class can be looked up
in ilsvrc12_synset_words.txt . For example, synset n01484850 corresponds to great white shark .
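
As an illustration of how that mapping works, the sketch below builds a synset-ID-to-name lookup from ilsvrc12_synset_words.txt, assuming each line is the synset ID followed by the class description (e.g. "n01484850 great white shark"); adjust the parsing if your copy of the file differs.

```cpp
// Sketch: build a synset ID -> class name lookup from ilsvrc12_synset_words.txt.
// Assumes each line is "<synset-id> <description>", e.g. "n01484850 great white shark".
#include <fstream>
#include <iostream>
#include <map>
#include <string>

int main()
{
    std::ifstream file("data/networks/ilsvrc12_synset_words.txt");
    std::map<std::string, std::string> synsets;
    std::string line;

    while( std::getline(file, line) )
    {
        // the synset ID is the first whitespace-delimited token, the rest is the class name
        const size_t split = line.find(' ');

        if( split != std::string::npos )
            synsets[line.substr(0, split)] = line.substr(split + 1);
    }

    std::cout << "n01484850 -> " << synsets["n01484850"] << std::endl;
    return 0;
}
```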

Customizing the Object Classes


The dataset that we downloaded in the previous step was used to train the default AlexNet and GoogleNet models with 1000
object classes from several core groups, including different species of birds, plants, fruit, and fish, dog and cat breeds, types
of vehicles, etc. For practical purposes, let's consider a companion to the GoogleNet model which recognizes a dozen core
groups made up of the original 1000 classes (for example, instead of detecting 122 individual breeds of dogs, combining
them all into one common dog class). These 12 core groups may be more practical to use than 1000 individual synsets, and
combining across classes results in more training data and stronger classification for the group.

DIGITS expects the data in a hierarchy of folders, so we can create directories for the groups and then symbolically link to the
synsets from the ILSVRC12 download above. DIGITS will automatically combine images from all folders under the top-level
groups. The directory structure resembles the following, with the value in parentheses indicating the number of classes used to
make up the group and the value next to the arrows indicating the synset ID linked to.

ball/ (7)
baseball (n02799071)
basketball (n02802426)
soccer ball (n04254680)
tennis ball (n04409515)
...
bear/ (4)
brown bear (n02132136)
black bear (n02133161)
polar bear (n02134084)
sloth bear (n02134418)
bike/ (3)
bird/ (17)
bottle/ (7)
cat/ (13)
dog/ (122)
fish/ (5)

fruit/ (12)
turtle/ (5)
vehicle/ (14)
sign/ (2)

Since there are actually a lot of synsets linked to from ILSVRC12, we provide the imagenet-subset.sh script to generate
the directory structure and links given the path to the dataset. Run the following commands from the DIGITS server:

$ wget https://rawgit.com/dusty-nv/jetson-inference/master/tools/imagenet-subset.sh
$ chmod +x imagenet-subset.sh
$ mkdir 12_classes
$ ./imagenet-subset.sh /opt/datasets/imagenet/ilsvrc12 12_classes

In this example the links are created in the 12_classes folder, with the first argument to the script being the path to
ILSVRC12 downloaded in the previous step.

Importing Classification Dataset into DIGITS


Navigate your browser to your DIGITS server instance and choose to create a new Classification Dataset from the drop-
down under the Datasets tab:

Set the Training Images path to the 12_classes folder from the previous step and make the following settings:

% for validation: 10
Group Name: ImageNet
Dataset Name: ImageNet-ILSVRC12-subset


Use the Create button at the bottom of the page to launch the dataset import job. The size of the data subset is around
20GB, so depending on server I/O performance it takes 10-15 minutes. Next we'll create the new model and begin training it.

Creating Image Classification Model with DIGITS


When the previous data import job is complete, return to the DIGITS home screen. Select the Models tab and choose to
create a new Classification Model from the drop-down:

Make the following settings in the form:

Select Dataset: ImageNet-ILSVRC12-subset


Subtract Mean: Pixel


Standard Networks: GoogleNet


Group Name: ImageNet
Model Name: GoogleNet-ILSVRC12-subset

After selecting a GPU to train on, click the Create button at the bottom to begin training.


Testing Classification Model in DIGITS


After the training job completes 30 epochs, the trained model should appear like so:


At this point, we can try testing our new model's inference on some example images in DIGITS. On the same page as the plot
above, scroll down under the Trained Models section. Under Test a Single Image , select an image to try (for example,
/ilsvrc12/n02127052/n02127052_1203.jpg ):

Press the Classify One button and you should see a page similar to:


The image is classified by the new GoogleNet-12 model as cat , while in the original GoogleNet-1000 it was classified as Lynx .
This indicates the new model is working correctly, because the Lynx category was grouped into GoogleNet-12's cat class.

Downloading Model Snapshot to Jetson


Now that we confirmed the trained model is working in DIGITS, let's download and extract the model snapshot to Jetson.
From the browser on your Jetson TX1/TX2, navigate to your DIGITS server and the GoogleNet-ILSVRC12-subset model.
Under the Trained Models section, select the desired snapshot from the drop-down (usually the one with the highest
epoch) and click the Download Model button.

Alternatively, if your Jetson and DIGITS server aren't accessible from the same network, you can use the step above to
download the snapshot to an intermediary machine and then use SCP or USB stick to copy it to Jetson.

Then extract the archive with a command similar to:

tar -xzvf 20170524-140310-8c0b_epoch_30.0.tar.gz

Next we will load our custom snapshot into TensorRT, running on the Jetson.

Loading Custom Models on Jetson


The imagenet-console and imagenet-camera programs that we used before also accept extended command line
parameters for loading a custom model snapshot. Set the $NET variable below to the path to your extracted snapshot:


$ NET=networks/GoogleNet-ILSVRC12-subset

$ ./imagenet-console bird_0.jpg output_0.jpg \


--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_184080.caffemodel \
--labels=$NET/labels.txt \
--input_blob=data \
--output_blob=softmax

As before, the classification and confidence will be overlayed to the output image. When compared to the output of the
original network, the re-trained GoogleNet-12 makes similar classifications to the original GoogleNet-1000:

The extended command line parameters above also load custom classification models with imagenet-camera .
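
In code, those command line parameters map onto the extended imageNet::Create() overload shown earlier in this guide. Below is a hedged sketch of loading the same snapshot programmatically; passing NULL for the mean binary is an assumption (the example above doesn't supply one), so verify the overload in imageNet.h before relying on it.

```cpp
// Sketch: loading the custom GoogleNet-ILSVRC12-subset snapshot programmatically,
// mirroring the --prototxt/--model/--labels/--input_blob/--output_blob arguments above.
// Passing NULL for the mean binary is an assumption -- check imageNet.h in your build.
#include "imageNet.h"

int main()
{
    imageNet* net = imageNet::Create(
        "networks/GoogleNet-ILSVRC12-subset/deploy.prototxt",
        "networks/GoogleNet-ILSVRC12-subset/snapshot_iter_184080.caffemodel",
        NULL,                                             // mean value binary proto (none used here)
        "networks/GoogleNet-ILSVRC12-subset/labels.txt",  // class name labels
        "data",                                           // input layer blob
        "softmax" );                                      // output layer blob

    if( !net )
        return 1;

    // ... Classify() images as in the earlier example ...
    delete net;
    return 0;
}
```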

Locating Object Coordinates using DetectNet


The previous image recognition examples output class probabilities representing the entire input image. The second deep
learning capability we're highlighting in this tutorial is detecting objects, and finding where in the video those objects are
located (i.e. extracting their bounding boxes). This is performed using a 'detectNet' - or object detection / localization network.

The detectNet object accepts as input the 2D image, and outputs a list of coordinates of the detected bounding boxes. To
train the object detection model, first a pretrained ImageNet recognition model (like Googlenet) is used with bounding
coordinate labels included in the training dataset in addition to the source imagery.
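
In code, detectNet follows the same pattern as imageNet: create the network once, then call Detect() per image to fill a caller-allocated array of box coordinates. The sketch below is modelled on what detectnet-console does, but the Detect() signature, the GetMaxBoundingBoxes()/GetNumClasses() accessors, and the cudaAllocMapped() helper are assumptions about this revision of the repo, so verify them against detectNet.h and the util/ headers before use.

```cpp
// Minimal sketch of the detectNet primitive, modelled on detectnet-console.
// The Detect() signature and cudaAllocMapped() helper are assumptions --
// check detectNet.h and cudaMappedMemory.h in your checkout.
#include "detectNet.h"
#include "loadImage.h"
#include "cudaMappedMemory.h"

#include <cstdio>

int main( int argc, char** argv )
{
    // load the multi-class pedestrian/luggage model bundled with the repo
    detectNet* net = detectNet::Create(detectNet::PEDNET_MULTI);

    if( !net )
        return 1;

    // load the input image into shared CPU/GPU memory
    float* imgCPU  = NULL;
    float* imgCUDA = NULL;
    int width = 0, height = 0;

    if( !loadImageRGBA("peds-007.png", (float4**)&imgCPU, (float4**)&imgCUDA, &width, &height) )
        return 1;

    // allocate output storage for the maximum number of boxes the network can return
    const uint32_t maxBoxes = net->GetMaxBoundingBoxes();
    const uint32_t classes  = net->GetNumClasses();

    float* bbCPU   = NULL;  float* bbCUDA   = NULL;
    float* confCPU = NULL;  float* confCUDA = NULL;

    cudaAllocMapped((void**)&bbCPU,   (void**)&bbCUDA,   maxBoxes * sizeof(float4));
    cudaAllocMapped((void**)&confCPU, (void**)&confCUDA, maxBoxes * classes * sizeof(float));

    // run detection; numBoxes is filled with the number of objects found
    int numBoxes = maxBoxes;

    if( net->Detect(imgCUDA, width, height, bbCPU, &numBoxes, confCPU) )
    {
        for( int n = 0; n < numBoxes; n++ )
        {
            const float* bb = bbCPU + (n * 4);   // left, top, right, bottom in pixels
            printf("box %i  (%.1f, %.1f)  (%.1f, %.1f)\n", n, bb[0], bb[1], bb[2], bb[3]);
        }
    }

    delete net;
    return 0;
}
```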

The following pretrained DetectNet models are included with the tutorial:

1. ped-100 (single-class pedestrian detector)


2. multiped-500 (multi-class pedestrian + baggage detector)
3. facenet-120 (single-class facial recognition detector)
4. coco-airplane (MS COCO airplane class)
5. coco-bottle (MS COCO bottle class)
6. coco-chair (MS COCO chair class)
7. coco-dog (MS COCO dog class)


As with the previous examples, provided are a console program and a camera streaming program for using detectNet.

Detection Data Formatting in DIGITS


Example object detection datasets include KITTI, MS-COCO, and others. To use the KITTI dataset, follow this DIGITS
object detection tutorial with KITTI.

Regardless of dataset, DIGITS uses KITTI metadata format for ingesting the detection bounding labels. These consist of text
files with frame numbers corresponding to image filenames, including contents such as:

dog 0 0 0 528.63 315.22 569.09 354.18 0 0 0 0 0 0 0


sheep 0 0 0 235.28 300.59 270.52 346.55 0 0 0 0 0 0 0

Read more about the folder structure and KITTI label format that DIGITS uses.
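
For reference, the 15 whitespace-separated fields in each KITTI label line break down as annotated below; for the detection datasets used in this tutorial only the class name and the 2D bounding box are populated, with the remaining fields left at zero. The struct is purely illustrative and not part of the repo.

```cpp
// Illustration of the 15 fields in a KITTI detection label line, e.g.:
//   dog 0 0 0 528.63 315.22 569.09 354.18 0 0 0 0 0 0 0
// Only the type and 2D bounding box are meaningful for the converted COCO data;
// the remaining fields are zeroed. (Not part of the repo -- for reference only.)
#include <string>

struct KittiLabel
{
    std::string type;                   // object class name, e.g. "dog" or "dontcare"
    float truncated;                    // 0..1, fraction of the object leaving image bounds
    int   occluded;                     // 0 = fully visible .. 3 = unknown
    float alpha;                        // observation angle (unused here)
    float left, top, right, bottom;     // 2D bounding box corners in pixels
    float height3d, width3d, length3d;  // 3D dimensions (zeroed)
    float x, y, z;                      // 3D location (zeroed)
    float rotation_y;                   // yaw angle (zeroed)
};
```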

Downloading the Detection Dataset


Let's explore using the MS-COCO dataset to train and deploy networks that detect the locations of everyday objects in
camera feeds. See the coco2kitty.py script for converting MS-COCO object classes to KITTI format. Once in DIGITS folder
structure, they can be imported as datasets into DIGITS. Some example classes from MS-COCO already preprocessed in
DIGITS/KITTI format are provided for convenience.

From a terminal on your DIGITS server, download and extract the sample MS-COCO classes already in DIGITS/KITTI
format:

$ wget --no-check-certificate https://nvidia.box.com/shared/static/tdrvaw3fd2cwst2zu2jsi0u43vzk8ecu.gz -O coco.tar.gz

HTTP request sent, awaiting response... 200 OK


Length: 5140413391 (4.5G) [application/octet-stream]
Saving to: coco.tar.gz

coco 100%[======================================>] 4.5G 3.33MB/s in 28m 22s

2017-04-17 10:41:19 (2.5 MB/s) - coco.tar.gz saved [5140413391/5140413391]

$ tar -xzvf coco.tar.gz

Included is the training data in DIGITS format for the airplane, bottle, chair, and dog classes. coco2kitty.py can be used to
convert other classes.

Importing the Detection Dataset into DIGITS


Navigate your browser to your DIGITS server instance and choose to create a new Detection Dataset from the drop-down
in the Datasets tab:

In the form fields, specify the following options and paths to the image and label folders under the location where you
extracted the COCO dataset:

Training image folder: coco/train/images/dog


Training label folder: coco/train/labels/dog


Validation image folder: coco/val/images/dog
Validation label folder: coco/val/labels/dog
Pad image (Width x Height): 640 x 640
Custom classes: dontcare, dog
Group Name: MS-COCO
Dataset Name: coco-dog


Name the dataset whatever you choose and click the Create button at the bottom of the page to launch the importing job.
Next we'll create the new detection model and begin training it.


Creating DetectNet Model with DIGITS


When the previous data import job is complete, return to the DIGITS home screen. Select the Models tab and choose to
create a new Detection Model from the drop-down:

Make the following settings in the form:

Select Dataset: coco-dog


Training epochs: 100
Subtract Mean: none
Solver Type: Adam
Base learning rate: 2.5e-05
Select Show advanced learning options
Policy: Exponential Decay
Gamma: 0.99

Selecting DetectNet Batch Size

DetectNet's network default batch size of 10 consumes up to 12GB GPU memory during training. However by using the
Batch Accumulation field, you can also train DetectNet on a GPU with less than 12GB memory. See the table below
depending on the amount of GPU memory available in your DIGITS server:

| GPU Memory     | Batch Size              | Batch Accumulation |
|----------------|-------------------------|--------------------|
| 4GB            | 2                       | 5                  |
| 8GB            | 5                       | 2                  |
| 12GB or larger | [network defaults] (10) | Leave blank (1)    |

If you're training on a card with 12GB of memory or more, leave the Batch Size as the default and leave the Batch
Accumulation blank. For GPUs with less memory, use the settings from above.

Specifying the DetectNet Prototxt

In the network area select the Custom Network tab and then copy/paste the contents of detectnet.prototxt


The DetectNet prototxt is located at data/networks/detectnet.prototxt in the repo.

Training the Model with Pretrained Googlenet

Since DetectNet is derived from Googlenet, it is strongly recommended to use pre-trained weights from Googlenet as this will
help speed up and stabilize training significantly. Download the Googlenet model from here or by running the following
command from your DIGITS server:

wget http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel

Then specify the path to your Googlenet under the Pretrained Model field.

Select a GPU to train on and set a name and group for the model:

Group Name MS-COCO


Model Name DetectNet-COCO-Dog

Finally, click the Create button at the bottom to begin training.


Testing DetectNet Model Inference in DIGITS


Leave the training job to run for a while, say 50 epochs, until the mAP ( Mean Average Precision ) plot begins to increase.
Note that due to the way mAP is calculated by the DetectNet loss function, the scale of mAP isn't necessarily 0-100, and
even an mAP between 5 and 10 may indicate the model is functional. With the size of the example COCO datasets we are
using, it should take a couple hours training on a recent GPU before all 100 epochs are complete.


At this point, we can try testing our new model's inference on some example images in DIGITS. On the same page as the plot
above, scroll down under the Trained Models section. Set the Visualization Model to Bounding Boxes and under Test
a Single Image , select an image to try (for example, /coco/val/images/dog/000074.png ):

Press the Test One button and you should see a page similar to:

Downloading the Model Snapshot to Jetson


Next, download and extract the trained model snapshot to Jetson. From the browser on your Jetson TX1/TX2, navigate to
your DIGITS server and the DetectNet-COCO-Dog model. Under the Trained Models section, select the desired snapshot
from the drop-down (usually the one with the highest epoch) and click the Download Model button.

Alternatively, if your Jetson and DIGITS server aren't accessible from the same network, you can use the step above to
download the snapshot to an intermediary machine and then use SCP or USB stick to copy it to Jetson.

Then extract the archive with a command similar to:

tar -xzvf 20170504-190602-879f_epoch_100.0.tar.gz

DetectNet Patches for TensorRT


The original DetectNet prototxt contains a Python clustering layer which isn't available in TensorRT and should be deleted
from the deploy.prototxt included in the snapshot. In this repo, the detectNet class handles the clustering as opposed to
Python.

At the end of deploy.prototxt , delete the layer named cluster :

layer {
  name: "cluster"
  type: "Python"
  bottom: "coverage"
  bottom: "bboxes"
  top: "bbox-list"
  python_param {
    module: "caffe.layers.detectnet.clustering"
    layer: "ClusterDetections"
    param_str: "640, 640, 16, 0.6, 2, 0.02, 22, 1"
  }
}

Without this Python layer, the snapshot can now be imported into TensorRT.

Processing Images from the Command Line on Jetson


To process test images with detectNet and TensorRT, use the detectnet-console program. detectnet-console accepts
command-line arguments representing the path to the input image and path to the output image (with the bounding box
overlays rendered). Some test images are also included with the repo.

To specify your model that you downloaded from DIGITS, use the syntax below with detectnet-console . First, for
convenience, set the path to your extracted snapshot into a $NET variable:

$ NET=20170504-190602-879f_epoch_100

$ ./detectnet-console dog_0.jpg output_0.jpg \


--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_38600.caffemodel \
--input_blob=data \
--output_cvg=coverage \
--output_bbox=bboxes

note: the input_blob , output_cvg , and output_bbox arguments may be omitted if your DetectNet layer names
match the defaults above (i.e. if you are using the prototxt from following this tutorial). These optional command line
parameters are provided if you are using a customized DetectNet with different layer names.

Launching With a Pretrained Model

Alternatively, to load one of the pretrained snapshots that comes with the repo, you can specify the pretrained model name as
the 3rd argument to detectnet-console :

$ ./detectnet-console dog_1.jpg output_1.jpg coco-dog

The above command will process dog_1.jpg, saving it to output_1.jpg, using the pretrained DetectNet-COCO-Dog model.
This is a shortcut of sorts so you don't need to wait for the model to complete training if you don't want to.

Pretrained DetectNet Models Available

Below is a table of the pretrained DetectNet snapshots downloaded with the repo (located in the data/networks directory
after running the cmake step) and the associated argument to detectnet-console used for loading the pretrained model:

| DIGITS model            | CLI argument  | classes              |
|-------------------------|---------------|----------------------|
| DetectNet-COCO-Airplane | coco-airplane | airplanes            |
| DetectNet-COCO-Bottle   | coco-bottle   | bottles              |
| DetectNet-COCO-Chair    | coco-chair    | chairs               |
| DetectNet-COCO-Dog      | coco-dog      | dogs                 |
| ped-100                 | pednet        | pedestrians          |
| multiped-500            | multiped      | pedestrians, luggage |
| facenet-120             | facenet       | faces                |


These all also have the python layer patch above already applied.

Running Other MS-COCO Models on Jetson

Let's try running some of the other COCO models. The training data for these are all included in the dataset downloaded
above. Although the DIGITS training example above was for the coco-dog model, the same procedure can be followed to
train DetectNet on the other classes included in the sample COCO dataset.

$ ./detectnet-console bottle_0.jpg output_2.jpg coco-bottle

$ ./detectnet-console airplane_0.jpg output_3.jpg coco-airplane

Running Pedestrian Models on Jetson

Included in the repo are also DetectNet models pretrained to detect humans. The pednet and multiped models
recognize pedestrians while facenet recognizes faces (from FDDB). Here's an example of detecting multiple humans
simultaneously in a crowded space:

$ ./detectnet-console peds-007.png output_7.png multiped

Multi-class Object Detection Models


When using the multiped model ( PEDNET_MULTI ), for images containing luggage or baggage in addition to pedestrians, the
2nd object class is rendered with a green overlay.

$ ./detectnet-console peds-008.png output_8.png multiped

Running the Live Camera Detection Demo on Jetson


Similar to the previous example, detectnet-camera runs the object detection networks on live video feed from the Jetson
onboard camera. Launch it from command line along with the type of desired network:

$ ./detectnet-camera coco-bottle # detect bottles/soda cans in the camera


$ ./detectnet-camera coco-dog # detect dogs in the camera
$ ./detectnet-camera multiped # run using multi-class pedestrian/luggage detector
$ ./detectnet-camera pednet # run using original single-class pedestrian detector
$ ./detectnet-camera facenet # run using facial recognition network
$ ./detectnet-camera # by default, program will run using multiped

note: to achieve maximum performance while running detectnet, increase the Jetson clock limits by running the script:
sudo ~/jetson_clocks.sh

note: by default, the Jetson's onboard CSI camera will be used as the video source. If you wish to use a USB webcam
instead, change the DEFAULT_CAMERA define at the top of detectnet-camera.cpp to reflect the /dev/video V4L2 device
of your USB camera and recompile. The webcam model it's tested with is Logitech C920.


Image Segmentation with SegNet


The third deep learning capability we're highlighting in this tutorial is image segmentation. Segmentation is based on image
recognition, except the classifications occur at the pixel level as opposed to classifying entire images as with image
recognition. This is accomplished by convolutionalizing a pre-trained imageNet recognition model (like Alexnet), which turns it
into a fully-convolutional segmentation model capable of per-pixel labelling. Useful for environmental sensing and collision
avoidance, segmentation yields dense per-pixel classification of many different potential objects per scene, including scene
foregrounds and backgrounds.

The segNet object accepts as input the 2D image, and outputs a second image with the per-pixel classification mask
overlay. Each pixel of the mask corresponds to the class of object that was classified.
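
In code, segNet follows the same Create-then-process pattern as the other primitives. The sketch below loads one of the bundled FCN-Alexnet models and writes out an overlay image; the FCN_ALEXNET_AERIAL_FPV_720p enum value, the Overlay() signature, and the saveImageRGBA() helper are assumptions about this revision of segNet.h and util/, so check the headers before relying on them.

```cpp
// Minimal sketch of the segNet primitive, modelled on segnet-console.
// The network enum, Overlay() signature, and saveImageRGBA() helper are
// assumptions -- verify segNet.h and loadImage.h in your checkout.
#include "segNet.h"
#include "loadImage.h"
#include "cudaMappedMemory.h"

int main( int argc, char** argv )
{
    // load one of the pretrained segmentation models bundled with the repo
    segNet* net = segNet::Create(segNet::FCN_ALEXNET_AERIAL_FPV_720p);

    if( !net )
        return 1;

    float* imgCPU  = NULL;
    float* imgCUDA = NULL;
    int width = 0, height = 0;

    if( !loadImageRGBA("drone_0428.png", (float4**)&imgCPU, (float4**)&imgCUDA, &width, &height) )
        return 1;

    // allocate an output image for the per-pixel classification overlay
    float* outCPU  = NULL;
    float* outCUDA = NULL;

    if( !cudaAllocMapped((void**)&outCPU, (void**)&outCUDA, width * height * sizeof(float) * 4) )
        return 1;

    // classify every pixel and blend the class colors over the input image
    if( net->Overlay(imgCUDA, outCUDA, width, height) )
        saveImageRGBA("output_0428.png", (float4*)outCPU, width, height);

    delete net;
    return 0;
}
```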

note: see the DIGITS semantic segmentation example for more background info on segmentation.

Downloading Aerial Drone Dataset


As an example of image segmentation, we'll work with an aerial drone dataset that separates ground terrain from the sky. The
dataset is in First Person View (FPV) to emulate the vantage point of a drone in flight and train a network that functions as an
autopilot guided by the terrain that it senses.

To download and extract the dataset, run the following commands from the host PC running the DIGITS server:

$ wget --no-check-certificate https://nvidia.box.com/shared/static/ft9cc5yjvrbhkh07wcivu5ji9zola6i1.gz -O NVIDIA-Aerial-Dr

HTTP request sent, awaiting response... 200 OK


Length: 7140413391 (6.6G) [application/octet-stream]
Saving to: NVIDIA-Aerial-Drone-Dataset.tar.gz

NVIDIA-Aerial-Drone-Datase 100%[======================================>] 6.65G 3.33MB/s in 44m 44s

2017-04-17 14:11:54 (2.54 MB/s) - NVIDIA-Aerial-Drone-Dataset.tar.gz saved [7140413391/7140413391]

$ tar -xzvf NVIDIA-Aerial-Drone-Dataset.tar.gz

The dataset includes various clips captured from flights of drone platforms, but the one we'll be focusing on in this tutorial is
under FPV/SFWA . Next we'll create the training database in DIGITS before training the model.

Importing the Aerial Dataset into DIGITS


First, navigate your browser to your DIGITS server instance and choose to create a new Segmentation Dataset from the
drop-down in the Datasets tab:

In the dataset creation form, specify the following options and paths to the image and label folders under the location where
you extracted the aerial dataset:

Feature image folder: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/images


Label image folder: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/labels
set % for validation to 1%
Class labels: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/fpv-labels.txt
Color map: From text file
Feature Encoding: None
Label Encoding: None


Name the dataset whatever you choose and click the Create button at the bottom of the page to launch the importing job.
Next we'll create the new segmentation model and begin training.

Generating Pretrained FCN-Alexnet


Fully Convolutional Network (FCN) Alexnet is the network topology that we'll use for segmentation models with DIGITS and
TensorRT. See this Parallel ForAll article about the convolutionalizing process. Support for segmentation datasets and training
segmentation models is a new feature in DIGITS 5. A script is included with the DIGITS semantic segmentation example which
converts the Alexnet model into FCN-Alexnet. This base model is then used as a pre-trained starting point for training future
FCN-Alexnet segmentation models on custom datasets.

To generate the pre-trained FCN-Alexnet model, open a terminal, navigate to the DIGITS semantic-segmentation example,
and run the net_surgery script:

$ cd DIGITS/examples/semantic-segmentation
$ ./net_surgery.py
Downloading files (this might take a few minutes)...
Downloading https://raw.githubusercontent.com/BVLC/caffe/rc3/models/bvlc_alexnet/deploy.prototxt...
Downloading http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel...
Loading Alexnet model...
...
Saving FCN-Alexnet model to fcn_alexnet.caffemodel

Training FCN-Alexnet with DIGITS


When the previous data import job is complete, return to the DIGITS home screen. Select the Models tab and choose to
create a new Segmentation Model from the drop-down:

In the model creation form, select the dataset you previously created. Set Subtract Mean to None and the Base Learning
Rate to 0.0001 . To set the network topology in DIGITS, select the Custom Network tab and make sure the Caffe sub-tab
is selected. Copy/paste the FCN-Alexnet prototxt into the text box. Finally, set the Pretrained Model to the output that the
net_surgery generated above: DIGITS/examples/semantic-segmentation/fcn_alexnet.caffemodel

Give your aerial model a name and click the Create button at the bottom of the page to start the training job. After about 5
epochs, the Accuracy plot (in orange) should ramp up and the model becomes usable:

At this point, we can try testing our new model's inference on some example images in DIGITS.

Testing Inference Model in DIGITS


Before transferring the trained model to Jetson, let's test it first in DIGITS. On the same page as the previous plot, scroll down
under the Trained Models section. Set the Visualization Model to Image Segmentation and under Test a Single
Image , select an image to try (for example /NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/images/0428.png ):

Press Test One and you should see a display similar to:

Next, download and extract the trained model snapshot to Jetson.

FCN-Alexnet Patches for TensorRT


A couple of non-essential layers included in the original FCN-Alexnet aren't supported in TensorRT and should be deleted
from the deploy.prototxt included in the snapshot.

At the end of deploy.prototxt , delete the deconv and crop layers:

layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0.0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 63
    group: 21
    stride: 32
    weight_filler {
      type: "bilinear"
    }
  }
}
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 18
  }
}

And on line 24 of deploy.prototxt , change pad: 100 to pad: 0 . Finally, copy the fpv-labels.txt and
fpv-deploy-colors.txt from the aerial dataset to your model snapshot folder on Jetson. Your FCN-Alexnet model snapshot is now
compatible with TensorRT. Now we can run it on Jetson and perform inference on images.

Running Segmentation Models on Jetson


To test a custom segmentation network model snapshot on the Jetson, use the segnet-console command-line
program.

First, for convenience, set the path to your extracted snapshot into a $NET variable:

$ NET=20170421-122956-f7c0_epoch_5.0

$ ./segnet-console drone_0428.png output_0428.png \


--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_22610.caffemodel \
--labels=$NET/fpv-labels.txt \
--colors=$NET/fpv-deploy-colors.txt \
--input_blob=data \
--output_blob=score_fr

This runs the specified segmentation model on a test image downloaded with the repo.

In addition to the pre-trained aerial model from this tutorial, the repo also includes pre-trained models on other segmentation
datasets, including Cityscapes, SYNTHIA, and Pascal-VOC.


Extra Resources
In this area, links and resources for deep learning developers are listed:

Appendix
NVIDIA Deep Learning Institute Introductory QwikLabs
Building nvcaffe
Other Examples
ros_deep_learning - TensorRT inference ROS nodes
