
Hierarchical Temporal Memory: Theory behind the Algorithm http://kaustabpal.com/hierarchical-temporal-memory/

Hierarchical Temporal Memory: Theory behind the Algorithm

By KAUSTAB PAL, June 2, 2017

Right from our birth, we humans are either learning about something or making predictions about what might come next. That is because our brain never sits idle. We are the most intelligent species on this planet, and trivial tasks like recognizing objects, understanding what someone is saying, or recalling the next line of our favorite songs are really easy for us to do.

Yet researchers have come up with only a few algorithms that can achieve human-like performance on a computer. To make intelligent systems, we first need to learn how intelligence works in the first place. Hierarchical Temporal Memory is a unique and new approach to Artificial Intelligence that starts from the neuroscience of the neocortex in our brains.

The neocortex is the wrinkled portion on the top of our brain. It is the portion where all our memories are stored, and it is the neocortex that gives every individual their unique identity. Hierarchical Temporal Memory (HTM) is the name used to describe the theory of how our neocortex functions. It is also the name of the technology used in machines that work on these neocortical principles.

The framework of HTM has three prominent features.

1. Hierarchy: The neocortex is divided into several regions, and the regions are logically linked together in a hierarchical structure. Data from our senses comes in through the lower regions of the hierarchy, gets processed, and then moves upwards in the hierarchy.

2. Temporal: The inputs and outputs of the neocortex are temporal patterns, i.e. they change several times every second. Each layer in the neocortex learns a time-based model of its input and then learns to predict the changing input stream. This type of learning is called online learning.

3. Memory: The neocortex is a memory-based system. Each layer in the neocortex learns the structure of the world from the inputs it receives.


The inputs to our neocortex are data from all our sense organs. However, it is believed that these inputs come in a sparse distributed form.

Sparse Distributed Representations:

A Sparse Distributed Representation (SDR) is a large vector of binary bits of which 98% are zeros and only 2% are ones. Unlike the bits of an ordinary binary number, each one in an SDR conveys some meaning about the data it represents.
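As a rough illustration of these properties, an SDR can be modeled as the set of indices of its active bits. This is only a sketch; the sizes used here (2048 bits, 40 of them active, i.e. ~2% sparsity) are illustrative choices, not values prescribed by the article:

```python
import random

SDR_SIZE = 2048      # total number of bits in the vector
ACTIVE_BITS = 40     # ~2% of the bits are ones

def make_sdr(seed):
    """Represent an SDR as the set of indices of its '1' bits."""
    rng = random.Random(seed)
    return set(rng.sample(range(SDR_SIZE), ACTIVE_BITS))

def overlap(x, y):
    """Similarity between two SDRs = number of shared active bits."""
    return len(x & y)

a = make_sdr(1)
b = make_sdr(2)
print(len(a))         # 40 active bits out of 2048 (~2% sparsity)
print(overlap(a, a))  # an SDR overlaps fully with itself: 40
```

Two SDRs that encode similar data share many active bits, so their overlap count acts as a cheap similarity measure.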

It was discovered that the neocortex uses a common set of algorithms to perform many different functions. We won't be discussing how these algorithms are implemented biologically in our brains, as that is beyond the scope of this article; instead, we will break down the working of these algorithms from a computer science perspective.


A typical HTM model consists of a sensory region, a spatial pooler, a temporal memory and a classifier. The fundamental units of the HTM model are cells, and the cells are arranged in columns. Each cell represents an individual neuron.

The sensory region takes inputs from the sensors and converts them into binary vectors. These binary vectors are then passed into the spatial pooler.
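How the sensory region turns a raw value into a binary vector is not detailed in the article. One common HTM-style approach is a scalar encoder, sketched below; the bit width, window size and value range are illustrative assumptions:

```python
def encode_scalar(value, v_min=0.0, v_max=100.0, n_bits=100, w=11):
    """Encode a number as a binary vector: a run of w consecutive ones
    whose position reflects the value. Nearby values produce vectors
    that share many bits, so similar inputs look similar downstream."""
    value = max(v_min, min(v_max, value))                 # clip to range
    start = int((value - v_min) / (v_max - v_min) * (n_bits - w))
    return [1 if start <= i < start + w else 0 for i in range(n_bits)]

v = encode_scalar(50.0)
print(sum(v))  # always w = 11 ones, regardless of the encoded value
```

Because the number of ones is fixed, every encoded value has the same "weight", which keeps the overlap comparisons in the spatial pooler meaningful.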

Spatial Pooler

The spatial pooler is one of the main components of an HTM system. Its job is to take the binary vectors as input and output a sparse distributed representation. The output of the spatial pooler is always of a fixed size, and similar inputs to a spatial pooler will always produce similar outputs.

Each column of the HTM is connected to some random subset of the input bits. The subset of the input bits that can activate a column is called the potential pool of that column.


In the above representation, we can see that the first column is connected to three bits of the input vector, so these three bits form the potential pool of the column. "Potential pool" means that if the same input comes in the future, these are the bits of that input vector that can activate the column.

Each connection between a column and an input bit is called a synapse. Each synapse has a value associated with it called the permanence, which ranges from 0.0 to 1.0. When the permanence of a synapse is below a certain threshold, the synapse is not considered connected and will not respond to its input bit.

For each column, we count the number of synapses that are connected (i.e. have a permanence above the threshold) and are attached to input bits that are ones. This number is the column's overlap score.
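The overlap computation just described can be sketched as follows. The function names, the example data and the 0.5 connection threshold are illustrative assumptions:

```python
CONNECTED_THRESHOLD = 0.5  # illustrative permanence threshold

def overlap_score(input_bits, potential_pool, permanences):
    """Count synapses that are both connected (permanence at or above
    the threshold) and attached to an input bit that is currently 1."""
    return sum(
        1
        for bit_index, perm in zip(potential_pool, permanences)
        if perm >= CONNECTED_THRESHOLD and input_bits[bit_index] == 1
    )

# A column with three potential synapses: two are connected
# (permanences 0.7 and 0.6), but only one sees an active input bit.
input_bits = [1, 0, 0, 1, 0, 1]
potential_pool = [0, 1, 4]        # input bits this column can see
permanences = [0.7, 0.6, 0.2]     # synapse strengths in [0.0, 1.0]
print(overlap_score(input_bits, potential_pool, permanences))  # 1
```

This mirrors the figure's example: two connected synapses, but an overlap score of one because only one of them is attached to an active bit.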


In the above example, for the given input vector, the first column has two connected synapses; however, only one of those connected synapses sees an input bit that is one, and therefore the overlap score of the first column is one.

To select the columns that will represent a given input vector, we follow this procedure:

1. Compute the overlap score of all the columns.

2. Since we want a sparse distributed representation, we take the two percent of the columns with the highest overlap scores.

3. The selected top two percent of the columns are called the active columns, and they are the spatial representation of the input. Similar inputs will have similar spatial representations.

4. For each of the active columns, we increment or decrement the permanences of its synapses. The figure below explains how the permanences get incremented and decremented.
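The four steps above can be sketched as a single function. This is a minimal sketch, not Numenta's implementation; the column count, the 2% sparsity, the 0.5 connection threshold and the 0.05 permanence increments are illustrative assumptions:

```python
import random

def spatial_pooler_step(input_bits, pools, perms,
                        threshold=0.5, sparsity=0.02,
                        perm_inc=0.05, perm_dec=0.05):
    """One spatial-pooling step over all columns:
    1. compute every column's overlap score,
    2. keep the top `sparsity` fraction as the active columns,
    3. nudge the active columns' synapses toward the input
       (increment permanences on active bits, decrement on inactive)."""
    overlaps = [
        sum(1 for b, p in zip(pool, ps)
            if p >= threshold and input_bits[b] == 1)
        for pool, ps in zip(pools, perms)
    ]
    n_active = max(1, int(len(pools) * sparsity))
    active = sorted(range(len(pools)),
                    key=lambda c: overlaps[c], reverse=True)[:n_active]
    for c in active:  # learning: adapt the winning columns only
        for i, b in enumerate(pools[c]):
            delta = perm_inc if input_bits[b] == 1 else -perm_dec
            perms[c][i] = min(1.0, max(0.0, perms[c][i] + delta))
    return sorted(active)

# Toy usage: 100 columns, each seeing 5 of 20 input bits.
rng = random.Random(0)
pools = [rng.sample(range(20), 5) for _ in range(100)]
perms = [[rng.random() for _ in pool] for pool in pools]
x = [rng.randint(0, 1) for _ in range(20)]
print(spatial_pooler_step(x, pools, perms))  # the 2 active columns
```

Note that the output size (2% of 100 columns) is fixed regardless of the input, which is exactly the fixed-size property described in the text.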


When learning is turned on, we want to adjust the active columns so that they better match the input. For each active column, we make its synapses better match the input by incrementing the permanences of synapses attached to active input bits and decrementing those attached to inactive ones.

Thus for a given binary input vector we get, as output, the columns of cells that represent that vector. The size of this output is always fixed, no matter what the size of the input may be. This is the spatial pooling aspect of HTM.

Temporal Memory

The Spatial Pooler gives us the active columns, i.e. the spatial representation of the input. The Temporal Memory, however, tells us which cells are active within the activated columns. The active cells in a column tell us the context of an input.

The output of the Spatial Pooler is generally passed as input to the Temporal Memory.

When an HTM model receives an input for the first time, all the cells in each activated column become active. This is called bursting. It happens because when an HTM model sees an input for the first time, it doesn't know the context of that input, so it activates all the cells in the activated column to represent the input, and then randomly assigns one cell to represent that input for future reference.


From the above figure, let's assume that in the given time step the two cells marked in black are active. In HTM, each cell has three different states:

1. Inactive: The cell is not selected.

2. Active: The cell is selected by the current input.

3. Predicted: The cell is expected to become active in the next time step.

When we receive an input and the respective columns become active, we check whether there are any predicted cells in these columns. If there are any predicted cells, we activate them.

The figure below shows what context means in the Temporal Memory.


In the above figure, the active cells in the B columns represent that this B is coming after an A. Had the B come after something other than A, different cells in the B columns would have become active. Similarly, if the previous input to C had been something other than B, different cells in the C columns would have become active. Thus the Temporal Memory represents the context of the input. This is what the Temporal Memory does in HTM.
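The two behaviors described in this section, activating predicted cells when the context is known and bursting when it is not, can be sketched as follows. The data layout (a dict mapping columns to their cell indices) and the 4-cells-per-column size are illustrative assumptions:

```python
import random

CELLS_PER_COLUMN = 4
rng = random.Random(0)

def temporal_memory_step(active_columns, predicted):
    """For each active column: activate its predicted cells if it has
    any (the input arrived in a known context); otherwise burst the
    whole column and pick one cell at random to represent this new
    context in the future."""
    active_cells = {}
    for col in active_columns:
        predicted_cells = predicted.get(col, [])
        if predicted_cells:
            active_cells[col] = predicted_cells            # known context
        else:
            active_cells[col] = list(range(CELLS_PER_COLUMN))  # burst
            # a randomly chosen cell will learn to represent this context
            _learning_cell = rng.randrange(CELLS_PER_COLUMN)
    return active_cells

# Column 3 was predicted (cell 1 expected this input); column 7 was not,
# so every one of its cells becomes active.
out = temporal_memory_step([3, 7], predicted={3: [1]})
print(out)  # {3: [1], 7: [0, 1, 2, 3]}
```

Because only the predicted cell fires in column 3, the resulting set of active cells encodes not just *which* input arrived but *in which sequence* it arrived.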

Classifier

The classifier takes its input from the Temporal Memory and makes predictions. Each cell in the HTM system keeps a lookup table, which is nothing but a list of values and their probabilities of occurring next.

For example, let's say we are making a one-step-ahead prediction and our values are A, B and C. When the HTM system receives B for the first time, we go back to the input that came one step before B, that is, A. Each of the active bits of A has a lookup table, and we increment the probability of the occurrence of B in each of these lookup tables. Thus the next time we receive the input A, we can consult these lookup tables and output B as the predicted value. In this way the HTM model learns the temporal sequences in the data.
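The lookup-table mechanism can be sketched with simple counters. The class name and the toy A, B, C sequence are illustrative, and the "active bits" here are just cell indices; a real classifier would normalize the counts into probabilities:

```python
from collections import Counter, defaultdict

class SimpleClassifier:
    """Sketch of a one-step-ahead classifier: every cell keeps a
    lookup table counting which value followed its activity."""

    def __init__(self):
        self.tables = defaultdict(Counter)  # cell -> next-value counts
        self.prev_cells = []

    def learn(self, active_cells, value):
        # Credit the cells that were active one step earlier
        # with the value that has now arrived.
        for cell in self.prev_cells:
            self.tables[cell][value] += 1
        self.prev_cells = active_cells

    def predict(self, active_cells):
        # Pool the lookup tables of the currently active cells
        # and return the value with the highest total count.
        votes = Counter()
        for cell in active_cells:
            votes.update(self.tables[cell])
        return votes.most_common(1)[0][0] if votes else None

clf = SimpleClassifier()
for _ in range(3):                      # train on the sequence A, B, C
    for cells, value in [([1, 2], "A"), ([3, 4], "B"), ([5, 6], "C")]:
        clf.learn(cells, value)
print(clf.predict([1, 2]))  # B, because B always followed A's cells
```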

Although the HTM model is based on the framework of our neocortex, to date only a small subset of that framework has been implemented. The HTM model was first described by Jeff Hawkins in his book On Intelligence, and it is being developed by the company Numenta. Although it is still in its early stages of development, we can already use HTM to solve prediction problems and detect anomalies.

Sources & References:

1. The HTM Whitepaper by Numenta
2. Jeff Hawkins' speech on The Principles of Hierarchical Temporal Memory
3. Explanation of the HTM algorithm
4. HTM School

If you are interested in knowing more about Hierarchical Temporal Memory and its implementations, subscribe to The A.I. Blog with your email address below.



© 2017 Kaustab Pal.
