
Novel Shannon Graph Entropy, Capacity: Spectral Graph Theory

by

Garimella Ramamurthy

Report No: IIIT/TR/2015/-1

Centre for Security, Theory and Algorithms


International Institute of Information Technology
Hyderabad - 500 032, INDIA
October 2015
NOVEL SHANNON GRAPH ENTROPY, CAPACITY:
SPECTRAL GRAPH THEORY

G. Rama Murthy
International Institute of Information Technology
Hyderabad, India
rammurthy@iiit.ac.in

ABSTRACT
In this research paper, an interesting probability mass function is associated with the vertices of
a graph. One among the various possible entropies ( Shannon, Rényi, Tsallis etc. ) is associated
with such a probability mass function. Specifically, Shannon entropy is utilized to define a
novel graph entropy. Characterization of minimum and maximum Shannon entropy graphs is
discussed. By associating a symmetric stochastic matrix with the graph, a novel Shannon Capacity
of a graph is defined. Several interesting results in spectral graph theory of structured graphs
are reported. New results related to sparsest and densest cut computation are reported ( without
invoking Cheeger's inequality ).
I. INTRODUCTION:

Directed / undirected, weighted / unweighted graphs naturally arise in various applications. Such graphs are associated with matrices such as the weight matrix, incidence matrix, adjacency matrix, Laplacian etc. Such matrices implicitly specify the number of vertices / edges, the adjacency information of vertices ( with edge connectivity ) and other related information ( such as edge weights ). In recent years, there is explosive interest in capturing networks arising in applications such as social networks, transportation networks and bio-informatics related networks ( e.g. gene regulatory networks ) using suitable graphs. Thus, NETWORK SCIENCE led to important problems such as community extraction, frequent sub-graph mining etc.

In research efforts related to large scale network science, various interesting scalar valued measures ( performance related ) are defined and utilized for making interesting inferences on large scale graphs. Specifically, graphs are associated with a relevant probability mass function and related measures are computed. This research paper is an effort in that direction.

This research paper is organized in the following manner. In Section 2, relevant research literature is reviewed. In Section 3, a novel graph entropy is defined by associating a graph with an interesting and natural probability mass function, and its properties are studied. In Section 4, several interesting results related to spectral graph theory of structured graphs are discussed. In Section 5, new results related to sparsest / densest cut computation are discussed. In Section 6, some interesting applications of such graph entropy are briefly specified. The research paper concludes in Section 7.

2. REVIEW OF RESEARCH LITERATURE:

In the research literature, there are already efforts to associate a probability mass function with a graph and compute one among various possible entropies ( such as Rényi entropy, Tsallis entropy etc. ) [9, Sim]. Such efforts resulted in interesting insights into network science. The essential utility of those definitions of graph entropy relies on how well the associated probability mass function captures the uncertainty related to the graph
structure ( relevant to the application of interest ).

The author innovated the concept of the vertex degree probability mass function by normalizing the associated vertex degree distribution by twice the total number of edges ( or, equivalently, by the sum of all the vertex degrees ). He then naturally proposed one among various possible graph entropies by computing the Shannon entropy of such a Probability Mass Function ( PMF ). It naturally follows that such a PMF can be utilized to compute other entropies such as the Rényi entropy ( leading to other definitions of graph entropy ). It is expected that the selected probability mass function captures interesting uncertainty associated with the structure of the graph. Efforts are underway to explore interesting ramifications of the graph entropy proposed by the author. In the following sections, we study various properties of the graph entropy proposed by the author.

3. SHANNON GRAPH ENTROPY, GRAPH CAPACITY : PROPERTIES:

We begin the discussion with a definition.

Definition: The vertex degree distribution of an undirected graph G = ( V, E ) on N vertices is the distribution { d_i : 1 <= i <= N } of edges of the graph at the various vertices, i.e. d_i is the number of edges of the graph that are incident at vertex 'i' for 1 <= i <= N. It is easy to see that

    d_1 + d_2 + ... + d_N = 2 |E|.

We conceive the innovative idea of normalizing the vertex degree distribution and arriving at a probability mass function. Formally, we have the following definition.

Definition: The Vertex Degree Probability Mass Function is the probability mass function { p_i : 1 <= i <= N }, where

    p_i = d_i / ( 2 |E| ) for 1 <= i <= N.

With such a probability mass function, one of the various possible entropies ( i.e. definitions such as Shannon, Tsallis, Rényi etc. ) can be associated and utilized as the definition of the corresponding graph entropy. Specifically, we have the following definition.

Definition: The Shannon entropy of an undirected graph G with Vertex Degree Probability Mass Function { p_i : 1 <= i <= N } is given by

    H(G) = - Sum_{i=1}^{N} p_i log2( p_i ).

Remark 1: By virtue of the above definition, the novel graph entropy is endowed with all the properties of Shannon entropy. Specifically, all the axioms satisfied by the Shannon entropy function are satisfied by such graph entropy.

Remark 2: In the literature, other graph entropies are shown to satisfy properties such as the SUB-ADDITIVITY property. All such known properties of Shannon entropy are satisfied by the Shannon Graph entropy.

In the following discussion, the term graph entropy means Shannon Graph Entropy.
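To make the above definitions concrete, the following short Python sketch ( our own helper names, not part of any library ) computes the vertex degree probability mass function and the Shannon graph entropy from an edge list:

    import math

    def degree_pmf(num_vertices, edges):
        # d_i = number of edges incident at vertex i; the degrees sum to 2|E|
        degree = [0] * num_vertices
        for i, j in edges:
            degree[i] += 1
            degree[j] += 1
        total = 2 * len(edges)
        return [d / total for d in degree]

    def shannon_graph_entropy(num_vertices, edges):
        # H(G) = - sum_i p_i log2( p_i ), with the convention 0 log 0 = 0
        pmf = degree_pmf(num_vertices, edges)
        return -sum(p * math.log2(p) for p in pmf if p > 0)

    # Example: the ring on 4 vertices has the uniform PMF ( 1/4, 1/4, 1/4, 1/4 )
    ring4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(shannon_graph_entropy(4, ring4))   # 2.0 = log2( 4 )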
The following interesting Lemmas specify the maximum and minimum entropy graphs ( i.e. graphs which will have maximum and minimum Shannon graph entropy ).

Lemma 1: An undirected graph has maximum Shannon entropy if and only if the vertex degree of all the vertices is equal.

Proof: From the basic properties of Shannon entropy, we necessarily have that the entropy will assume the maximum possible value when the associated probability mass function is uniform. Thus, by the definition of the vertex degree probability mass function, we require that

    p_i = d_i / ( 2 |E| ) = 1 / N for 1 <= i <= N.

This will happen if and only if all the vertex degrees are equal. Q.E.D.

Thus, the maximum Shannon graph entropy value associated with such a probability mass function is given by

    H(G) = log2( N ).

For instance, the ring connected graph and the clique ( fully connected graph ) are examples of undirected graphs having the maximum Shannon entropy. We now discuss when maximum entropy graphs exist on 'N' vertices.

Now we consider the characterization of minimum entropy graphs.

Lemma 2: An undirected graph has minimum Shannon entropy if and only if it is a Hub-and-Spoke graph, i.e. the vertex degree probability mass function is given by

    ( 1/2, 1/( 2(N-1) ), 1/( 2(N-1) ), ..., 1/( 2(N-1) ) )

upto relabeling of the vertices.

Proof: It is well known that the Shannon entropy of a probability mass function is minimum when it is degenerate, i.e. it is given by ( 1, 0, 0, ..., 0 ) upto permutation. But, trivially, such a vertex degree probability mass function corresponds to a disconnected graph. Thus, the next best minimum entropy graph is attained when one of the nodes has degree ( N-1 ) and all the other nodes have degree 'one', i.e. such a graph corresponds to a Hub-and-Spoke graph. Thus, the vertex degree probability mass function of such a Hub-and-Spoke graph is given by

    ( (N-1)/( 2(N-1) ), 1/( 2(N-1) ), ..., 1/( 2(N-1) ) ) = ( 1/2, 1/( 2(N-1) ), ..., 1/( 2(N-1) ) ).

Q.E.D.

We now compute the Shannon graph entropy, H(G), of a minimum entropy graph ( Hub-and-Spoke graph ):

    H(G) = (1/2) log2( 2 ) + ( N-1 ) ( 1/( 2(N-1) ) ) log2( 2(N-1) )
         = 1/2 + (1/2) ( 1 + log2( N-1 ) )
         = 1 + (1/2) log2( N-1 ).
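The statements of the Lemmas above, and the entropy value just computed, are easy to confirm numerically. A minimal check ( reusing the hypothetical degree_pmf / shannon_graph_entropy helpers from the earlier sketch ):

    import math

    N = 8
    ring   = [(i, (i + 1) % N) for i in range(N)]                  # maximum entropy
    clique = [(i, j) for i in range(N) for j in range(i + 1, N)]   # maximum entropy
    star   = [(0, i) for i in range(1, N)]                         # Hub-and-Spoke

    assert abs(shannon_graph_entropy(N, ring) - math.log2(N)) < 1e-12
    assert abs(shannon_graph_entropy(N, clique) - math.log2(N)) < 1e-12
    # minimum entropy ( connected ) graph: H(G) = 1 + (1/2) log2( N-1 )
    assert abs(shannon_graph_entropy(N, star) - (1 + 0.5 * math.log2(N - 1))) < 1e-12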

Novel Shannon Capacity of a Graph:

In the following discussion, we associate an undirected, unweighted graph with a stochastic matrix ( that could be considered as the channel matrix of a discrete memoryless channel ). Let A be the adjacency matrix of such a graph and let D be the associated diagonal matrix whose diagonal elements are the vertex degrees. It is immediate that the normalized matrix

    W = D^{-1} A

is a stochastic matrix ( and, for the regular graphs considered above, a symmetric stochastic matrix ). It could be considered as the channel matrix of a Discrete Memoryless Channel.
For such a Discrete Memoryless Channel, we can consider the Vertex Degree Probability Mass Function of the graph ( defined earlier ) as the input probability mass function and compute the Channel Capacity. We define such a quantity as a novel Shannon Capacity of a graph. It should be noted that Lovasz defined and studied another type of Shannon capacity of a graph [4, Lov].

From the above discussion, it should be clear that the above considered novel Shannon Capacity can also be defined for weighted graphs ( with all the weights being nonnegative ). In this case, there are two possible stochastic matrices: one based on the adjacency matrix and the other based on the non-negative weight matrix.
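The paper does not prescribe a particular algorithm for the capacity computation; a standard choice is the Blahut-Arimoto iteration. A sketch, under the assumption that the row-normalized matrix W = D^{-1} A serves as the channel matrix ( channel_capacity is our own name ):

    import numpy as np

    def channel_capacity(P, iters=2000):
        # Blahut-Arimoto iteration for the capacity ( in bits ) of a discrete
        # memoryless channel with row-stochastic transition matrix P
        n = P.shape[0]
        r = np.full(n, 1.0 / n)                    # input distribution
        for _ in range(iters):
            q = r[:, None] * P
            q /= q.sum(axis=0, keepdims=True)      # posterior q( x | y )
            log_r = np.sum(P * np.log(q + 1e-300), axis=1)
            r = np.exp(log_r - log_r.max())
            r /= r.sum()                           # r_x proportional to prod_y q(x|y)^P(y|x)
        q = r[:, None] * P
        q /= q.sum(axis=0, keepdims=True)
        C = np.sum(r[:, None] * P * (np.log(q + 1e-300) - np.log(r)[:, None]))
        return C / np.log(2)

    A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], float)
    W = A / A.sum(axis=1, keepdims=True)           # channel matrix W = D^{-1} A
    print(channel_capacity(W))

Note that the capacity maximizes the mutual information over all input distributions; evaluating the mutual information at the vertex degree PMF ( as suggested above ) gives a lower bound on this value.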
Also, utilizing the vertex degree probability mass functions of two graphs, a distance between them could be defined and studied in the following manner.

Distance between Graphs: K-L Divergence:

Consider two graphs whose vertex degree probability mass functions are denoted by P and Q. One possible distance measure between them is given by the associated Kullback-Leibler distance D( P || Q ) between them ( also known as the relative entropy ). It is given by

    D( P || Q ) = Sum_{i=1}^{N} P(i) log2( P(i) / Q(i) ).

Since D( P || Q ) is not symmetric, we utilize the Jeffrey divergence, D( P || Q ) + D( Q || P ), between them as a related distance measure.

An interesting distance between graphs can be computed by choosing one of the graphs to be a maximum / minimum entropy graph. In the case of trees ( graphs without a cycle ), one of the graphs can be chosen to be line / bus connected, and the K-L divergence / Jeffrey divergence can be computed.

Remark: The Shannon entropy and capacity of unweighted / weighted directed graphs are defined based on the above approach. Properties of such graph measures are studied in [7, Rama3].
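A small sketch of this distance computation ( reusing the hypothetical degree_pmf helper; note that D( P || Q ) is finite only when Q is strictly positive wherever P is, which holds for graphs without isolated vertices ):

    import math

    def kl_divergence(P, Q):
        # D( P || Q ) = sum_i P(i) log2( P(i) / Q(i) )
        return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

    def jeffrey_divergence(P, Q):
        # symmetrized measure: D( P || Q ) + D( Q || P )
        return kl_divergence(P, Q) + kl_divergence(Q, P)

    N = 6
    ring = [(i, (i + 1) % N) for i in range(N)]   # maximum entropy reference graph
    star = [(0, i) for i in range(1, N)]          # Hub-and-Spoke graph
    P = degree_pmf(N, ring)
    Q = degree_pmf(N, star)
    print(kl_divergence(P, Q), jeffrey_divergence(P, Q))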
4. SPECTRAL GRAPH THEORY: NOVEL GRAPH ENTROPY:

Spectral graph theory deals with the study of properties of a graph in relationship to the characteristic polynomial, eigenvalues and eigenvectors of matrices associated with the graph, such as its adjacency matrix or Laplacian matrix. We prove new results related to the spectral graph theory of structured graphs. An undirected graph has a symmetric adjacency matrix A, and hence all its eigenvalues are real. Furthermore, the eigenvectors can be chosen to be orthonormal. We have the following definition.

Definition: An undirected graph's SPECTRUM is the multiset of real eigenvalues of its adjacency matrix, A.

Fact: While the adjacency matrix depends on the vertex labeling, its spectrum is a graph invariant.

We provide an interesting proof of the above fact. We need the following well known theorem.

Rayleigh's Theorem: The local optima of the quadratic form X^T A X associated with a symmetric matrix A, evaluated on the unit Euclidean hypersphere ( i.e. { X : X^T X = 1 } ), occur at the eigenvectors, with the corresponding value of the quadratic form being the associated eigenvalue.

Lemma 3: The eigenvalues of the adjacency matrix A of an undirected graph are invariant under relabeling of the vertices.

Proof: By Rayleigh's theorem, the eigenvalues of A are the local optima of the associated quadratic form evaluated on the unit hypersphere. Thus, we need to reason that the quadratic form remains invariant under relabeling of the vertices. We have that

    X^T A X = x_1 ( x_{i_1} + x_{i_2} + ... + x_{i_k} ) + x_2 ( ... ) + ... + x_N ( ... ),

where, for instance, { i_1, i_2, ..., i_k } are the vertices connected to the vertex 1 ( one ) ( and similarly for the other vertices ). Now, from the above expression, it is clear that the quadratic form remains invariant under relabeling of the vertices: relabeling just reorders the terms. Thus, the eigenvalues of A remain invariant under relabeling of the vertices. Q.E.D.
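Lemma 3 can also be checked numerically: relabeling the vertices conjugates A by a permutation matrix and leaves the sorted spectrum unchanged. A minimal sketch:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 7
    A = np.triu(rng.integers(0, 2, (N, N)), k=1)   # random {0,1} upper triangle
    A = A + A.T                                    # symmetric adjacency, zero diagonal

    perm = rng.permutation(N)
    P = np.eye(N)[perm]                            # permutation matrix of the relabeling
    eigs_original  = np.sort(np.linalg.eigvalsh(A))
    eigs_relabeled = np.sort(np.linalg.eigvalsh(P @ A @ P.T))
    assert np.allclose(eigs_original, eigs_relabeled)   # the spectrum is an invariant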
We now consider the spectrum of maximum entropy graphs.

MAXIMUM ENTROPY GRAPHS:

By Lemma 1, the sum of all the rows of the adjacency matrix is a constant, say 'd'. Thus, using the well known results on the spectral radius of a non-negative matrix, the maximum eigenvalue of a maximum entropy graph is d. The associated eigenvector is the "all ones" vector. In the case of a clique, 'd' assumes the maximum possible value of 'N-1' if no self loops are allowed at the vertices. If self loops are allowed at the vertices, the value of 'd' ( for a clique ) assumes the maximum possible value of 'N'. Also, in the case of the ring connected graph ( which has maximum entropy ), the value of d is 2.

Suppose we consider a maximum entropy graph ( in which self loops are not allowed at any of the vertices ). The trace of the associated adjacency matrix is ZERO. Thus, the sum of all the eigenvalues other than the spectral radius is '-d'. Such an adjacency matrix is indefinite.

It is easy to show that the adjacency matrices of all maximum entropy graphs ( with a suitable labeling of the vertices ) are RIGHT CIRCULANT matrices. From the linear algebra research literature, there are well known results on the eigenvalues and eigenvectors of right circulant matrices. Those results are invoked in understanding the spectrum of maximum entropy graphs.

By Rayleigh's Theorem, the largest eigenvalue of the symmetric adjacency matrix is the maximum value of the quadratic form associated with A evaluated on the unit hypersphere ( it could be labeled as the "energy" ).
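For instance, the ring on N vertices has a circulant adjacency matrix, and the classical circulant eigenvalue formula gives its spectrum as 2 cos( 2 pi k / N ) for k = 0, 1, ..., N-1, with spectral radius d = 2. A numerical sketch of the preceding statements:

    import numpy as np

    N = 9
    A = np.zeros((N, N))
    for i in range(N):                      # ring: each vertex joined to two neighbours
        A[i, (i + 1) % N] = A[i, (i - 1) % N] = 1

    eigs = np.sort(np.linalg.eigvalsh(A))
    expected = np.sort([2 * np.cos(2 * np.pi * k / N) for k in range(N)])
    assert np.allclose(eigs, expected)
    assert np.isclose(eigs[-1], 2.0)        # spectral radius equals the common degree d
    assert np.isclose(np.trace(A), 0.0)     # so the remaining eigenvalues sum to -d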
Line / Bus Connected Graphs:

NOTE: It is clear that a ring connected graph has maximum Shannon graph entropy. A practical example could be ring connected molecules forming a chemical compound. Suppose we are interested in a graph with Shannon entropy being as large as possible, but the graph does not have a cycle in it. Then, the closest approximation is a LINE connected graph. It is immediate that the adjacency matrix of a LINE / BUS connected graph on N vertices is of the following tridiagonal form ( ones on the first super-diagonal and sub-diagonal, and zeroes elsewhere ):

    A(i,j) = 1 if |i - j| = 1, and A(i,j) = 0 otherwise.

Claim: The tree connected graph that has maximum Shannon entropy is a line / bus connected graph.

Goal: To compute the determinant of the 'N x N' adjacency matrix of a LINE / BUS connected graph when the dimension N is odd / even.

Lemma 4: The adjacency matrix A of a line connected graph is UNIMODULAR when the dimension N is even ( i.e. Det(A) = +1 or -1 ) and is singular ( i.e. Det(A) = 0 ) when the dimension N is odd.

Proof: Let A_j denote the adjacency matrix of a line connected graph on 'j' vertices. From the definition of the determinant of A, it is clear that, when N is even as well as odd,

    Det( A_N ) = - Det( A_{N-2} ).

Also, as the initial conditions, we have Det( A_2 ) = -1 and Det( A_3 ) = 0. Thus, we have the following:

Case I : N is odd. Recursively computing the determinants, we have that Det( A_N ) = 0.

Case II : N is even. Recursively computing the determinants, we have that Det( A_N ) = ( -1 )^{N/2}. Thus, in this case, the determinant value oscillates between the values +1 and -1, beginning with -1. Hence, when N is even, the adjacency matrix A is unimodular. Q.E.D.

We briefly consider the eigenvalues of line / bus connected graphs. For N = 2, the eigenvalues of the line connected graph are +1, -1. Also, when N = 3, the eigenvalues are 0, +sqrt(2), -sqrt(2).

There is an interesting recursive relationship between the characteristic polynomials of line connected graphs. Let CP( A_j ) denote the characteristic polynomial of the line connected graph on 'j' vertices, i.e. CP( A_j ) = Det( x I - A_j ). The following result holds when N is even as well as odd:

    CP( A_N ) = x CP( A_{N-1} ) - CP( A_{N-2} ).
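The determinant pattern of Lemma 4 and the characteristic polynomial recursion as reconstructed above are easy to verify numerically. A sketch ( using numpy's poly for the coefficients of Det( x I - A ) ):

    import numpy as np

    def line_adjacency(n):
        # line / bus graph: ones on the first super-diagonal and sub-diagonal
        A = np.zeros((n, n))
        for i in range(n - 1):
            A[i, i + 1] = A[i + 1, i] = 1
        return A

    # Lemma 4: Det( A_N ) = 0 for odd N and ( -1 )^{N/2} for even N
    for n in range(2, 12):
        d = np.linalg.det(line_adjacency(n))
        assert np.isclose(d, 0.0 if n % 2 == 1 else (-1) ** (n // 2))

    # recursion CP( A_N ) = x CP( A_{N-1} ) - CP( A_{N-2} )
    cp = [np.poly(line_adjacency(n)) for n in range(1, 8)]   # cp[k]: n = k+1 vertices
    for n in range(3, 8):
        rhs = np.polysub(np.polymul([1, 0], cp[n - 2]), cp[n - 3])
        assert np.allclose(cp[n - 1], rhs)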
MINIMUM ENTROPY GRAPHS:

Now, we consider minimum entropy graphs and study the spectrum of such graphs. From Lemma 2, we know that such a graph must be a Hub-and-Spoke graph. The adjacency matrix A of such a graph is of the following form ( upto relabeling of the vertices ):

    A = [ 0    e ]
        [ e^T  0 ],

where the upper-left block is the ( N-1 ) x ( N-1 ) zero matrix and e is an ( N-1 ) dimensional column vector of all ones. The following results can easily be verified. When N = 2, the eigenvalues of A are +1, -1. Trace( A ) = 0, and thus A is an indefinite matrix. Determinant( A ) = 0 when N > 2.

Now, we have the following goal.

Goal: To determine all the eigenvalues and eigenvectors of the adjacency matrix A associated with a Hub-and-Spoke graph when N ( the number of vertices ) is strictly greater than 2.

Lemma 5: The eigenvalues of the adjacency matrix associated with a minimum entropy graph on N vertices are

    +sqrt( N-1 ), -sqrt( N-1 ) and 0,

where the multiplicity of the eigenvalue at 0 is ( N-2 ). Equivalently, the characteristic polynomial of A is given by

    CP( A ) = x^{N-2} ( x^2 - ( N-1 ) ).

Proof: We prove the result by mathematical induction. Consider the case where N = 3. Then

    A = [ 0 0 1 ]
        [ 0 0 1 ]
        [ 1 1 0 ].

Thus, the eigenvalues are 0, +sqrt(2), -sqrt(2). Hence the result holds true for the case N = 3.

Now, the induction hypothesis for the case where the dimension of A is ( N-1 ) specifies that the characteristic polynomial is given by x^{N-3} ( x^2 - ( N-2 ) ). We now show that the above induction hypothesis gives the result for the case where the dimension of A is N. Expanding the determinant Det( x I - A_N ) along the first row, and using the induction hypothesis and the structure of the matrix on the RHS of the resulting expression, we have that

    Det( x I - A_N ) = x Det( x I - A_{N-1} ) - x^{N-2}
                     = x^{N-2} ( x^2 - ( N-2 ) ) - x^{N-2}.

The above result holds for the case when N is even as well as odd. Thus, the characteristic polynomial of A is given by

    CP( A_N ) = x^{N-2} ( x^2 - ( N-1 ) ).

Hence, the eigenvalues of A are given by +sqrt( N-1 ), -sqrt( N-1 ) and 0, where the multiplicity of the eigenvalue at 0 is ( N-2 ). Q.E.D.
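A direct numerical check of Lemma 5 on the Hub-and-Spoke ( star ) graph, including the eigenvector for the positive eigenvalue discussed below:

    import numpy as np

    N = 10                                   # spokes 0 .. N-2, hub N-1
    A = np.zeros((N, N))
    A[:-1, -1] = A[-1, :-1] = 1              # hub connected to every spoke

    eigs = np.sort(np.linalg.eigvalsh(A))
    s = np.sqrt(N - 1)
    assert np.isclose(eigs[0], -s)           # smallest eigenvalue -sqrt( N-1 )
    assert np.isclose(eigs[-1], s)           # largest eigenvalue +sqrt( N-1 )
    assert np.allclose(eigs[1:-1], 0.0)      # eigenvalue 0 with multiplicity N-2

    # eigenvector: first N-1 components +1, last component sqrt( N-1 )
    v = np.append(np.ones(N - 1), s)
    assert np.allclose(A @ v, s * v)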
Characterization of the Null Space of A :

Let N be odd. Consider the vectors in which ( N-1 )/2 components out of the first ( N-1 ) components are +1 and the other components are -1, and let the last component be ZERO. All such vectors are in the null space of the adjacency matrix of the minimum Shannon entropy graph when N is odd.

Now let N be even. Let any one component among the first ( N-1 ) components be zero. Also, consider the vectors in which ( N-2 )/2 components out of the remaining ( N-2 ) components are +1 and the other components are -1. Further, let the last component be ZERO. All such vectors are in the null space of the adjacency matrix of the minimum Shannon entropy graph when N is even.

Eigenvectors of A corresponding to Non-zero Eigenvalues:

By the definition of an eigenvector, it can be shown that the column vector whose first ( N-1 ) components are +1 and whose last component is +sqrt( N-1 ) is the eigenvector corresponding to the eigenvalue +sqrt( N-1 ). In the case of the eigenvalue -sqrt( N-1 ), the last component of the eigenvector is -sqrt( N-1 ).

Eigenvectors of A: Corners of the Unit Hypercube:

Suppose a column vector X is in the null space of A. Then we have that A X = 0, and thus Sign( A X ) = Sign( 0 ). Depending on whether Sign( 0 ) is taken as +1 or -1, the only vector in the null space of A that can also be a stable state is the "all ones vector" or the "all minus ones vector".

Consider a corner X of the hypercube with the last component being +1. Then we have that the first ( N-1 ) components of A X are all equal to 'one', so if X were an eigenvector, all of its first ( N-1 ) components would have to be equal; the last component of A X would then be equal to ( N-1 ) in magnitude, which is impossible for N > 2. Similarly, when the last component of X is -1, we reason that X cannot be an eigenvector.

Claim: None of the corners of the hypercube can be an eigenvector of A.

5. SPARSEST CUT COMPUTATION in DIRECTED / UNDIRECTED UNWEIGHTED GRAPHS : SPECTRAL GRAPH THEORY:

Traditionally, Cheeger's inequality is utilized to approximate the sparsest cut of a graph through the second eigenvalue of its Laplacian. It is considered to be the most important Theorem in spectral graph theory and one of the most useful facts in algorithmic applications. We now provide a different approach to the problem by invoking the known results on minimum cut computation in directed / undirected graphs associated with a Hopfield neural network. In the following, we provide a review of the relevant literature for the convenience of the reader [5, Rama1].

REVIEW OF RESEARCH LITERATURE: MIN / MAX CUT COMPUTATION : HOPFIELD NEURAL NETWORK

Contribution of Hopfield et al.:
A Hopfield neural network constitutes a discrete time nonlinear dynamical system. It is naturally associated with a weighted undirected graph G = ( V, E ), where V is the set of vertices and E is the set of edges. A weight value is attached to each edge and a threshold value is attached to each vertex / node / artificial neuron of the graph. The order of the network is the number of nodes / vertices in the associated graph. Thus, a discrete time Hopfield neural network of order 'N' is uniquely specified by [3, Hop]

(A) an N x N Symmetric Synaptic Weight Matrix M, where M(i,j) denotes the weight attached to the edge from node i to node j ( and from node j to node i ), and

(B) an N x 1 Threshold Vector T, where T(i) denotes the threshold attached to node 'i'.

Each neuron is in one of two states, i.e. +1 or -1. Thus, the state space of such a nonlinear dynamical system is the set of corners of the N-dimensional unit hypercube. For notational purposes, let V_i(t) denote the state of node / neuron 'i' at the discrete time index 't', and let the state of the Hopfield neural network at time 't' be denoted by the N x 1 vector V(t). The state at node 'i' is updated in the following manner ( i.e. computation of the next state of node 'i' ):

    V_i( t+1 ) = Sign( Sum_{j=1}^{N} M(i,j) V_j(t) - T(i) )   …..(5.1)

i.e. the next state at node 'i' is +1 if the term in the bracket is non-negative and -1 if it is negative. Depending on the set of nodes at which the state updation given in equation (5.1) is performed, the neural network operation is classified into the following modes:

Serial Mode: The set of nodes at which the state updation as in (5.1) is performed is exactly one, i.e. at time 't', the above state updation is performed at only one of the nodes / neurons.

Fully Parallel Mode: At time 't', the state updation as in (5.1) is performed simultaneously at all the nodes.
In the state space of the discrete time Hopfield neural network, there are certain distinguished states, called the STABLE STATES.

Definition: A state V(t) is called a "Stable State" if and only if

    V(t) = Sign( M V(t) - T ).   ….(5.2)

Thus, if the state dynamics of the network reaches a stable state at some time 't', it will remain there for ever, i.e. no change in the state of the network occurs regardless of the mode of operation of the network ( i.e. it is a fixed point of the state dynamics of the discrete time Hopfield neural network ).

The following Convergence Theorem summarizes the dynamics of the discrete time Hopfield neural network in the serial and fully parallel modes of operation. It characterizes the operation of the neural network as an associative memory.

Theorem 1: Let the pair Z = ( M, T ) specify a Hopfield neural network. Then the following hold true:

[1] Hopfield : If Z is operating in the serial mode and the elements of the diagonal of M are non-negative, the network will always converge to a stable state ( i.e. there are no cycles in the state space ).

[2] Goles : If Z is operating in the fully parallel mode, the network will always converge to a stable state or to a cycle of length 2 ( i.e. the cycles in the state space are of length at most 2 ).

The proof of the above Theorem is based on associating the state dynamics of the Hopfield Neural Network (HNN) with an energy function. It is reasoned that the energy function is non-decreasing when the state of the network is updated ( at successive time instants ). Since the energy function is bounded from above, the energy will converge to some value. The next step in the proof is to show that constant energy implies that a stable state is reached in the first case, and that at most a cycle of length 2 is reached in the second case. The so called energy function utilized to prove the above convergence Theorem is the following one:

    E(t) = V(t)^T M V(t) - 2 V(t)^T T.   ….(5.3)

Thus, the HNN, when operating in the serial mode, will always get to a stable state that corresponds to a local maximum of the energy function. Hence the Theorem suggests that the Hopfield Associative Memory (HAM) could be utilized as a device for performing local / global search to compute the maximum value of the energy function.
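A minimal sketch of the serial-mode dynamics (5.1) and the energy (5.3) ( our own helper names; Sign( 0 ) is resolved to +1, matching the convention above ):

    import numpy as np

    def sign(x):
        return 1 if x >= 0 else -1                # Sign( 0 ) = +1 as in (5.1)

    def energy(M, T, V):
        return V @ M @ V - 2 * (V @ T)            # E = V^T M V - 2 V^T T as in (5.3)

    def run_serial(M, T, V):
        # serial mode: update one neuron at a time until (5.2) holds
        V = V.copy()
        changed = True
        while changed:
            changed = False
            for i in range(len(V)):
                new = sign(M[i] @ V - T[i])
                if new != V[i]:
                    V[i] = new                    # each flip cannot decrease the energy
                    changed = True
        return V                                  # fixed point: V = Sign( M V - T )

    rng = np.random.default_rng(1)
    N = 6
    M = np.triu(rng.integers(0, 2, (N, N)), k=1).astype(float)
    M = M + M.T                                   # symmetric weights, zero diagonal
    T = np.zeros(N)
    V = rng.choice([-1.0, 1.0], N)
    S = run_serial(M, T, V)
    print(S, energy(M, T, S))

Since the diagonal of M is zero ( hence non-negative ), Theorem 1 guarantees that this loop terminates at a stable state.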
Contribution of Bruck et al.:

The above Theorem implies that all the optimization problems which involve the optimization of a quadratic form over the unit hypercube ( the constraint / feasible set ) can be mapped to an HNN which performs a search for its optimum. One such problem is the computation of a minimum cut in an undirected graph. For the sake of completeness, we now provide the definition of a "cut" in a graph.

Definition: Consider a weighted, undirected graph G with the set of edges E and the set of vertices V, i.e. G = ( V, E ). Consider a subset U of V and let W = V - U. The set of edges each of which is incident at a node in U and at a node in W is called a cut in G.

Definition: The sum of the edge weights of a cut is the weight of the cut. From among all possible cuts, the cut with minimum weight is called a Minimum Cut of the graph.

Note: Thus, a min cut specification is equivalent to specifying the subset of vertices on one side of the cut.

Minimum and Maximum Cut in Undirected Graphs: Stable / Anti-Stable States:

In the following Theorem, proved in [1, BrB], the equivalence between the minimum cut and the computation of the global optimum of the energy function of an HNN is summarized.

Theorem 2: Consider a Hopfield Neural Network (HNN) Z = ( M, T ) with the thresholds at all nodes being zero, i.e. T = 0. The problem of finding the global optimum stable state ( for which the energy is maximum ) is equivalent to finding a minimum cut in the graph corresponding to Z.

Corollary: Consider a Hopfield neural network Z = ( M, T ) with the thresholds at all neurons being zero, i.e. T = 0. The problem of finding a state V for which the energy is the global minimum is equivalent to finding a maximum cut in the graph corresponding to Z.

Proof: Follows from the argument as in [1, BrB]. We repeat the argument for clarity ( required for understanding the algorithms for MIN / MAX cut computation in DIRECTED / UNDIRECTED graphs ). It is proved in [8, RaN] that there is no loss of generality in assuming that the threshold at all the nodes / vertices of the Hopfield neural network is zero. Thus, the energy function is a pure quadratic form, i.e.

    E = V^T M V.

Let us denote the sum of the weights of the edges in the graph with both end points in the vertex set V_1 ( the nodes with state +1 ) as W_11, and let W_22, W_12 denote the corresponding sums for the other two cases. We readily have that

    E = 2 ( W_11 + W_22 ) - 2 W_12 = 2 ( W_11 + W_22 + W_12 ) - 4 W_12.

Since the first term in the last expression is constant ( it is twice the sum of the weights of all the edges ), it follows that the minimization of E is equivalent to the maximization of W_12 ( and the maximization of E is equivalent to the minimization of W_12 ). It is clear that W_12 is the weight of the cut in G, with V_1 being the nodes of Z with state equal to +1. Q.E.D.
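The identity underlying this proof, E = 2 ( total weight ) - 4 W_12, can be confirmed by brute force over all corners of the hypercube for a small graph ( a check of Theorem 2 and its Corollary, not an efficient algorithm ):

    import itertools
    import numpy as np

    rng = np.random.default_rng(2)
    N = 5
    M = np.triu(rng.integers(0, 4, (N, N)), k=1)
    M = M + M.T                                   # symmetric edge weights, zero diagonal
    total = M.sum() / 2                           # sum of all edge weights

    for V in itertools.product([-1, 1], repeat=N):
        V = np.array(V)
        cut = sum(M[i, j] for i in range(N) for j in range(i + 1, N) if V[i] != V[j])
        assert V @ M @ V == 2 * total - 4 * cut   # energy vs. cut-weight identity

    # hence: maximum energy <-> minimum cut, and minimum energy <-> maximum cut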
Thus, the operation of the Hopfield neural network in the serial mode is equivalent to conducting a local search algorithm for finding a minimum cut in the associated graph. State updation at a node in the network is equivalent to moving it from one side of the cut to the other side in the local search algorithm.

As in the case of stable states, we have the following definition.

Definition: The local minimum vectors of a quadratic form on the hypercube are called anti-stable states. This concept was first introduced in [8, RaN].

We have the following definition associated with the values of the quadratic form ( energy function ) at the stable states and the anti-stable states.

Definition: Consider a quadratic form optimized on the unit hypercube ( constraint set ). The local maximum values attained at the stable states are called the stable values. Also, the local minimum values attained at the anti-stable states are called the anti-stable values.

Remark 1 : In view of Theorem 2, minimum cut computation in an undirected graph reduces to determining the global optimum stable state S of the associated Hopfield neural network Z = ( M, T ) with T = 0. The { +1 } components of S determine the subset of vertices that are members of one side of the cut ( and the { -1 } components determine the vertices on the other side of the cut ). Similarly, maximum cut computation requires the determination of the global optimum anti-stable state.

Minimum and Maximum Cut in Directed Graphs: Stable / Anti-Stable States:

In view of the above Theorem, a natural question that arises is to see if a Hopfield neural network can be designed to perform a local search for a minimum cut in a directed graph. In that effort, we now provide a formal definition of a minimum cut in a directed graph.

Definition: Let a weighted, directed graph be denoted by G = ( V, E ). Every edge has a weight and a direction associated with it. The weights of the directed edges can be represented by an N x N matrix M in which M(i,j) represents the weight of the edge from vertex 'i' to vertex 'j'. Let a subset of the vertex set V be denoted by V_1 and let V_2 = V - V_1. The set of edges each of which has its tail at a vertex in V_1 and its head at a vertex in V_2 is called a DIRECTED CUT of G. The directed cut with minimum total weight is called a DIRECTED MINIMUM CUT (DMC) in G.
The following Theorem is in the same spirit as Theorem 2, for directed graphs.

Theorem 3: Let M be the matrix of edge weights ( M is not necessarily symmetric ) in a weighted directed graph denoted by G = ( V, E ). A Hopfield neural network with weight matrix and thresholds suitably derived from M ( as specified in [2, BrS] ) performs a local search for a DMC of G.

Proof: Refer to [2, BrS].

Thus, this Theorem shows that the computation of a minimum cut in a directed graph is equivalent to determining the global optimum stable state of a suitably chosen Hopfield neural network.

Note: Similarly as in Theorem 2, the computation of the directed maximum cut ( i.e. the maximum cut in a directed graph ) reduces to the computation of the global minimum anti-stable state ( the global minimum vector of the quadratic energy function ) of a suitably chosen Hopfield neural network.

Claim 1 : In summary, MIN / MAX cut computation in directed / undirected graphs is equivalent to the determination of the global optimum stable / anti-stable state of the associated Hopfield neural network.

Sparsest Cut Computation in Undirected, Unweighted Graphs:

In view of Theorem 2, we have the following claim.

Claim : Since the weight matrix of an undirected graph is a { 0, 1 } matrix, the computation of a local / global minimum cut in the associated graph is equivalent to the computation of a local / global optimum stable state in the associated Hopfield neural network. Specifically, the global minimum cut computation corresponds to determining the cut with the smallest number of edges, i.e. the sparsest cut.

Since the adjacency matrix of the undirected graph is a non-negative ( { 0, 1 } ) matrix, the global optimum stable state is the all-ones vector. It corresponds to the "empty cut". Thus, the computation of the sparsest cut in such a graph is equivalent to the determination of the second largest stable state.

From the proof of Theorem 2, the following equation readily follows in the case of an undirected, unweighted graph ( with the associated { 0, 1 } weight matrix ):

    Second Largest Stable Value ( SLSV ) = 2 ( Total number of edges ) - 4 ( weight of Sparsest Cut ).

Equivalently, we have that

    Number of Cut Edges in Sparsest Cut = ( 2 |E| - SLSV ) / 4.
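For a small unweighted graph, the stable states can be enumerated by brute force, and the above relation then recovers the sparsest cut. A sketch ( exponential in N, illustrative only; Sign( 0 ) = +1 as before ):

    import itertools
    import numpy as np

    def stable_states(A):
        # corners V with V = Sign( A V ), i.e. the stable states for T = 0
        for V in itertools.product([-1, 1], repeat=len(A)):
            V = np.array(V)
            if all((1 if x >= 0 else -1) == v for x, v in zip(A @ V, V)):
                yield V

    A = np.array([[0, 1, 1, 0, 0], [1, 0, 1, 0, 0], [1, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1], [0, 0, 0, 1, 0]], float)   # triangle plus a tail
    E = A.sum() / 2                                           # |E| = 5
    values = sorted({int(V @ A @ V) for V in stable_states(A)}, reverse=True)
    slsv = values[1]            # values[0] = 2|E| is the empty cut ( all-ones vector )
    print("edges in sparsest cut:", (2 * E - slsv) / 4)       # prints 1.0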
Thus, ideally we would like to compute the Second Largest Stable State and the Second Largest Stable Value (SLSV). We specifically consider the following case where such a thing is possible.

CASE I : The vertex degree of all the vertices is a constant, say 'd', i.e. the "all ones vector" is an eigenvector of the adjacency matrix A corresponding to the spectral radius 'd'.

In such a case, the "all ones" column vector, denoted by e, is the eigenvector corresponding to the largest eigenvalue 'd'. It also happens to be the largest stable state and corresponds to the empty cut ( by Theorem 2 ).

In this case, the second largest stable state corresponds to a vector on the hypercube with exactly one element being '-1'. There are N such vectors, and all of them attain the same stable value. The corresponding cut is that of a "Hub-and-Spoke graph" with any one of the vertices as the hub, i.e. it separates a single vertex from the others. Such a graph leads to a sparsest cut with 'd' cut edges.

CASE II : The vertex degree of all the vertices is NOT a constant, i.e. the "all ones vector" is NOT an eigenvector of the adjacency matrix A.

In this case, the following Lemmas, proved in [6, Rama2], provide an approach to bound the Second Largest Stable Value (SLSV).

Lemma 6: Let W be a symmetric non-negative matrix and let mu be the spectral radius of W. Then we have the following lower bound on mu:

    mu >= ( e^T W e ) / N,

where e is the all-ones column vector.

Proof: Refer to [6, Rama2]. The proof of the above Lemma is based on projecting the points X on the unit hypercube onto the unit hypersphere, i.e. the transformation

    Y = X / sqrt( N )

is utilized. Hence, we have that Y^T W Y = ( X^T W X ) / N. Using Rayleigh's Theorem, we have the following inequality satisfied by the eigenvalues of W:

    lambda_min( W ) <= ( X^T W X ) / N <= lambda_max( W ).

Hence, we have that

    N lambda_min( W ) <= X^T W X <= N lambda_max( W ) for every corner X of the hypercube.

Lemma 7: Let W be a symmetric, non-negative matrix and let e be the all-ones column vector. The following statements are true:

(i) e is the eigenvector corresponding to the spectral radius of W, or
(ii) a corner of the hypercube can be the eigenvector of W corresponding to the smallest eigenvalue only if e is not an eigenvector.

Proof: Refer to [6, Rama2].

Let R denote the number of cut edges in the sparsest cut, and note that

    e^T A e = 2 ( Total Number of Edges ).
Case ( ii ): e is not an eigenvector. Thus, by Rayleigh's theorem, all the stable values must lie between N times the smallest eigenvalue and N times the second largest eigenvalue. Hence, we have the following bounds involving the eigenvalues of A:

    N lambda_min( A ) <= X^T A X <= N lambda_2( A ),

where lambda_2( A ) denotes the second largest eigenvalue of A. Hence, we have the following lower and upper bounds on the Second Largest Stable Value (SLSV). By Rayleigh's Theorem, we have that

    N lambda_min( A ) <= SLSV <= N lambda_2( A ).

Using the earlier result relating the SLSV to the number of edges in the sparsest cut, denoted by R, we have that

    ( 2 |E| - N lambda_2( A ) ) / 4 <= R <= ( 2 |E| - N lambda_min( A ) ) / 4.

Densest Cut Computation in Undirected, Unweighted Graphs:

Using the Corollary of Theorem 2 on the computation of the maximum cut, we have the following result:

    Global Minimum Anti-Stable Value ( GMASV ) = 2 ( Total number of edges ) - 4 ( weight of Densest Cut ).

Equivalently, we have that

    Number of Cut Edges in Densest Cut = ( 2 |E| - GMASV ) / 4.
Remark: Using Theorem 3, similar results can easily be derived for the computation of the sparsest and densest cuts in directed graphs. Details are avoided for brevity.

6. APPLICATIONS of SHANNON GRAPH ENTROPY:

In network science ( applied to social network analysis, transportation networks etc. ), the untruncated / truncated, normalized vertex degree distribution ( truncation meaning that the vertices whose degree is less than some threshold are neglected ) is utilized to arrive at various scalar measures. The Shannon Graph entropy of the associated vertex degree probability mass function can be utilized to determine how "uniform" / "non-uniform" the vertex degree probability mass function is. Also, various moments associated with the vertex degree PMF can be computed. Other information theoretic measures can be associated with the vertex degree PMF of such a "pruned graph".

7. CONCLUSIONS :

In this research paper, a novel probability mass function is associated with the vertex degree distribution of an undirected graph. The Shannon entropy of a graph is defined to be the entropy associated with such a probability mass function. Characterization of undirected graphs with minimum and maximum Shannon entropy is discussed. Results in the spectral graph theory of structured graphs are discussed. New results on sparsest / densest cut computation in undirected / directed graphs are reported.

REFERENCES:

[1] [BrB] J. Bruck and M. Blaum, "Neural Networks, Error-Correcting Codes, and Polynomials over the Binary n-Cube," IEEE Transactions on Information Theory, Vol. 35, No. 5, September 1989.

[2] [BrS] J. Bruck and J. Sanz, "A Study on Neural Networks," International Journal of Intelligent Systems, Vol. 3, pp. 59-75, 1988.

[3] [Hop] J. J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proceedings of the National Academy of Sciences, USA, Vol. 79, pp. 2554-2558, 1982.

[4] [Lov] L. Lovasz, "On the Shannon Capacity of a Graph," IEEE Transactions on Information Theory, Vol. IT-25, January 1979.

[5] [Rama1] G. Rama Murthy, "Multi-dimensional Neural Networks: Unified Theory," research monograph, New Age International Publishers, New Delhi, 2007.

[6] [Rama2] G. Rama Murthy, "Bounding the Eigenvalues of a Matrix: Optimization of Quadratic Forms," IIIT Technical Report, 2015.

[7] [Rama3] G. Rama Murthy, "Novel Shannon Entropy, Capacity of a Graph: Applications," IIIT Technical Report, in preparation.

[8] [RaN] G. Rama Murthy and B. Nischal, "Hopfield-Amari Neural Network: Minimization of Quadratic Forms," 6th International Conference on Soft Computing and Intelligent Systems, Kobe Convention Center ( Kobe Portopia Hotel ), Kobe, Japan, November 20-24, 2012.

[9] [Sim] G. Simonyi, "Graph Entropy: A Survey," DIMACS Series in Discrete Mathematics and Theoretical Computer Science.
