Social Network Analysis (SNA)

Chapter-1
CONTENTS
Social Network perspectives
Fundamental concepts of network analysis
Motivation
Erdős Number Project
Centrality measures

Social Network perspectives
Social network analysis offers a distinct research perspective within the social and behavioral sciences.
It is based on the assumption that relationships among interacting units are important.
The perspective encompasses theories, models, and applications that are expressed in terms of relational concepts or processes.
Relations defined by linkages among units are a fundamental component of network theories.

Social Network perspectives
In addition to the use of relational concepts:
1. Actors and their actions are viewed as interdependent rather than independent, autonomous units.
2. Relational ties (linkages) between actors are channels for the transfer or "flow" of resources (either material or nonmaterial).

Social Network perspectives (cont.)
3. Network models focusing on individuals view the network structural environment as providing opportunities for or constraints on individual action.
4. Network models conceptualize structure (social, economic, political, and so forth) as lasting patterns of relations among actors.

Social Network perspectives
1. Formal Descriptions
Network analysis provides a vocabulary and a set of formal definitions for expressing theoretical concepts and properties.
2. Model and Theory Evaluation and Testing
Network models may be used to test theories about relational processes or structures. Such theories propose specific structural outcomes, which may then be evaluated against observed network data.

Summary
The key feature of social network theories or propositions is that they require concepts, definitions, and processes in which social units are linked to one another by various relations.
Both statistical and descriptive uses of network analysis are distinct from more standard social science analysis and require concepts and analytic procedures that differ from traditional statistics and data analysis.

Fundamental Concepts in Network Analysis
Several key concepts at the heart of network analysis are fundamental to the discussion of social networks.
These concepts are: actor, relational tie, dyad, triad, subgroup, group, relation, and network.

Fundamental Concepts in Network Analysis
Actor:
Actors are discrete individual, corporate, or collective social units.
Examples of actors are people in a group, departments within a corporation, public service agencies in a city, or nation-states in the world system.
One-mode networks: most social network applications focus on collections of actors that are all of the same type (for example, people in a work group).

Fundamental Concepts in Network Analysis
Relational Tie:
Actors are linked to one another by social ties.
Some of the more common examples of ties employed in network analysis are:
Evaluation of one person by another (for example, expressed friendship, liking, or respect)
Transfers of material resources (for example, business transactions, lending or borrowing things)
Association or affiliation (for example, jointly attending a social event, or belonging to the same social club)
Behavioral interaction (talking together, sending messages)
Movement between places or statuses (migration, social or physical mobility)
Physical connection (a road, river, or bridge connecting two points)
Formal relations (for example, authority)
Biological relationship (kinship or descent)

Fundamental Concepts in Network Analysis
Dyad:
At the most basic level, a linkage or relationship establishes a tie between two actors.
A dyad consists of a pair of actors and the (possible) tie(s) between them.
Dyadic analyses focus on the properties of pairwise relationships, such as whether ties are reciprocated or not, or whether specific types of multiple relationships tend to occur together.
Example: a conversation between two friends.

Fundamental Concepts in Network Analysis
Triad:
Relationships among larger subsets of actors may also be studied.
Many important social network methods and models focus on the triad: a subset of three actors and the (possible) tie(s) among them.
Example: balance theory (a theory of attitude change: the urge to maintain one's values and beliefs over time).

BALANCE THEORY
[Figure: a triad of actors A, B, and C whose three ties are labeled +VE, -VE, and -VE]

Fundamental Concepts in Network Analysis
Subgroup:
A subgroup of actors can be defined as any subset of actors, together with all ties among them.
Locating and studying subgroups using specific criteria has been an important concern in social network analysis.

Social Networking Basics: Cohesive Sub-group
Cohesive sub-group: a well-connected group with strong, direct, intense, frequent, positive ties.
[Figure: network of nodes A through I with the labels "Clique" and "Cluster" (e.g. A, B, D, and E)]

Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

Fundamental Concepts in Network Analysis
Group:
A group is the collection of all actors on which ties are to be measured.
A group, then, consists of a finite set of actors who for conceptual, theoretical, or empirical reasons are treated as a finite set of individuals on which network measurements are made.

Fundamental Concepts in Network Analysis
Relation:
The collection of ties of a specific kind among members of a group is called a relation.
For example, the set of friendships among pairs of children in a classroom, or the set of formal diplomatic ties maintained by pairs of nations in the world, are ties that define relations.

Fundamental Concepts in Network Analysis
Social Network:
A social network consists of a finite set or sets of actors and the relation or relations defined on them.

Summary
These terms provide a core working vocabulary for discussing social networks and social network data.
Social network analysis not only requires a specialized vocabulary, but also deals with conceptual entities that are quite difficult to pursue using a more traditional statistical and data analytic framework.

Motivation
Empirical Motivations
Theoretical Motivations
Mathematical Motivations

Motivation: Empirical Motivations
Invention of the term "sociogram"
A sociogram is a visual depiction of the relationships within a specific group.
Its purpose is to discover the underlying relationship between two persons.
Criterion (what you want to measure): a specific type of social interaction
1. Positive criterion: choosing something that you either enjoy or would love to participate in with others
2. Negative criterion: choosing something that you would not enjoy (resistance in interpersonal relationships)

Example: Positive criterion
1. Which three classmates would you most like to go on a vacation with?
2. Which three classmates do you like the most?

Example: Negative criterion
1. Which three classmates would you least enjoy going on a vacation with?
2. Which three classmates do you like to be around the least?
3. Which three classmates would you least like to be stranded on an island with?

Motivation: Theoretical Motivations
Theoretical concepts also drove the development of network methods.
Examples of such concepts:
social group, isolate, popularity, liaison, prestige, balance, transitivity, clique, subgroup, social cohesion, social position, social role, reciprocity, mutuality, exchange, influence, dominance, conformity

Motivation: Mathematical Motivations
In social network analysis, researchers found use for mathematical models.
The three major mathematical foundations of network methods are graph theory, statistical and probability theory, and algebraic models.

Erdős Number Project
The Erdős number describes the "collaborative distance" between mathematician Paul Erdős and another person, as measured by authorship of mathematical papers.
To be assigned an Erdős number, someone must be a coauthor of a research paper with another person who has a finite Erdős number. Paul Erdős has an Erdős number of zero. Anybody else's Erdős number is k + 1, where k is the lowest Erdős number of any coauthor.
If Alice collaborates with Paul Erdős on one paper, and with Bob on another, but Bob never collaborates with Erdős himself, then Alice is given an Erdős number of 1 and Bob is given an Erdős number of 2, as he is two steps from Erdős.
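The "k + 1" rule is a breadth-first search outward from Erdős over the coauthorship graph. The following is a minimal sketch of that idea; the function name `erdos_numbers`, the dictionary layout, and the extra author Carol are illustrative assumptions, not part of the Erdős Number Project itself.

```python
from collections import deque

def erdos_numbers(coauthors, root="Erdős"):
    """Breadth-first search over a coauthorship graph.

    coauthors maps each author to the set of people they have written a
    paper with. Returns author -> Erdős number; authors with no chain of
    coauthors leading to the root are left out (their number is not finite).
    """
    numbers = {root: 0}
    queue = deque([root])
    while queue:
        author = queue.popleft()
        for other in coauthors.get(author, ()):
            if other not in numbers:
                # First visit happens via a coauthor with the lowest number k.
                numbers[other] = numbers[author] + 1
                queue.append(other)
    return numbers

# Hypothetical data mirroring the Alice/Bob example; Carol has no coauthors.
coauthors = {
    "Erdős": {"Alice"},
    "Alice": {"Erdős", "Bob"},
    "Bob": {"Alice"},
    "Carol": set(),
}
numbers = erdos_numbers(coauthors)
print(numbers)
```

Because breadth-first search reaches every author first through a coauthor with the smallest possible number, the value stored is exactly "lowest k among coauthors, plus one".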
Centrality measures
Centrality measures address the question:
"Who is the most important or central person in this network?"
According to Scott Adams, the power a person
holds in the organization is inversely proportional to
the number of keys on his keyring.
A janitor has keys to every office, and no power.
The CEO does not need a key: people always open
the door for him.

Centrality measures
Centrality measures identify the most important vertices within a graph.
EXAMPLE:
Identifying the most influential person(s) in a social network
Key infrastructure nodes in the Internet or urban networks
Super-spreaders of disease

Centrality
• Finding out which node is the most central is important:
– It could help disseminate information in the network faster
– It could help stop epidemics
– It could help protect the network from breaking

Centrality measures
Degree centrality
Closeness centrality
Betweenness centrality
Eigenvector centrality
PageRank centrality

Centrality: visually
• Centrality can have various meanings:
[Figure: four example graphs with nodes marked X and Y, captioned indegree, outdegree, betweenness, and closeness]

Centrality measures: Degree centrality
Historically the first and conceptually the simplest measure.
Definition:
Degree centrality refers to the number of ties a node has to other nodes.
Actors who have more ties may have multiple alternative ways and resources to reach goals, and thus be relatively advantaged.

Centrality measures: Degree centrality
Undirected graph:
Degree centrality for an undirected graph is straightforward: if A is connected to B, then B is by definition connected to A.

Centrality measures : Degree centrality
Directed Graph:
In the case of a directed network (where ties
have direction), we usually define two
separate measures of degree centrality,
namely indegree and outdegree.

Centrality measures: Degree centrality
1. In-degree centrality:
Actors who receive many ties are characterized as prominent.
The basic idea is that many actors seek to direct ties to them, so this may be regarded as a measure of importance.

Centrality measures : Degree centrality

2. out-degree centrality:
 Actors who have high out-degree centrality
may be relatively able to exchange with
others, or disperse information quickly to
many others.
 So actors with high out-degree centrality are
often characterized as influential.

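[Figure: example network of six actors: Anna (A), Ben (B), Cara (C), Dana (D), Evan (E), and Frank (F)]

Node    Degree
Anna    2
Ben     1
Cara    3
Dana    3
Evan    3
Frank   2

Ranking: Cara, Dana, Evan (3); Anna, Frank (2); Ben (1)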
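The degree counts in this example can be reproduced with a few lines of Python. This is a minimal sketch: the edge list is an assumption, reconstructed from the degree counts above and from the paths and distances shown on the later closeness slides, since the original figure is not available here.

```python
# Assumed edge list for the six-actor example (A=Anna, B=Ben, C=Cara,
# D=Dana, E=Evan, F=Frank), reconstructed from the slides' numbers.
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

# Degree centrality: count the ties each node has.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

# Normalized degree: share of the n - 1 other nodes a node is tied to.
n = len(degree)
normalized = {node: d / (n - 1) for node, d in degree.items()}

# Rank by degree, highest first; ties are broken alphabetically.
ranking = sorted(degree, key=lambda node: (-degree[node], node))

for node in ranking:
    print(node, degree[node], round(normalized[node], 2))
```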
EXAMPLE: Freeman's approach

EXAMPLE
Consider the network: which nodes (actors) are more "central" than others?
Nodes 2, 5, and 7 appear relatively "central".
Node 7 has an absolute in-degree centrality of 9 (all 9 other nodes send ties to node 7).
The normalized value is 100 (every possible other node is connected to node 7).
Its out-degree centrality has an absolute value of 3 (node 7 is connected out to nodes 2, 4, and 5) and a normalized value of 33.33 (3 nodes is 33.33% of the 9 possible nodes to which node 7 could extend out).
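The normalization in this example can be written out as a short worked calculation. It assumes, as the numbers imply, a network of n = 10 nodes, so that each node has n - 1 = 9 possible partners; the symbols C'_in, C'_out, and d are notation introduced here for illustration, not taken from the slides.

```latex
C'_{\text{in}}(7)  = \frac{d_{\text{in}}(7)}{n - 1} \times 100 = \frac{9}{9} \times 100 = 100
\qquad
C'_{\text{out}}(7) = \frac{d_{\text{out}}(7)}{n - 1} \times 100 = \frac{3}{9} \times 100 \approx 33.33
```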
[Figure: two four-node example networks, NETWORK A and NETWORK B, each with nodes 1 to 4]

Degree centrality
Network A
Node 1 - centrality score 3
Node 2 - centrality score 1
Node 3 - centrality score 1
Node 4 - centrality score 1
The maximum score is 3.
The degree centrality score of Network A as a whole (its degree centralization) is 1.

Degree centrality
Network B
Node 1 - centrality score 3
Node 2 - centrality score 3
Node 3 - centrality score 3
Node 4 - centrality score 3
The maximum score is 3.
The degree centrality score of Network B as a whole (its degree centralization) is 0.
Thus, Network A is more centralized than Network B for degree centrality.
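The network-level scores of 1 and 0 are consistent with Freeman's degree centralization: sum each node's gap to the highest degree, then divide by the largest value that sum can take, (n - 1)(n - 2) for an undirected graph. The sketch below assumes that Network A is a star centered on node 1 and Network B is fully connected, which matches the node scores listed above; the function name is illustrative.

```python
def degree_centralization(adjacency):
    """Freeman degree centralization of an undirected graph.

    adjacency maps each node to the set of its neighbors.
    """
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    n = len(degrees)
    max_degree = max(degrees)
    # Total gap to the most central node, divided by the largest possible
    # total gap, which a star on n nodes attains: (n - 1) * (n - 2).
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))

# Assumed structures: star around node 1, and a complete graph.
network_a = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
network_b = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}}

cent_a = degree_centralization(network_a)
cent_b = degree_centralization(network_b)
print(cent_a, cent_b)
```

A score of 1 means one node holds all the ties (a star); a score of 0 means every node has the same degree.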

Centrality measures: Closeness centrality
Degree does not factor in distance.
Distance refers to the number of links on a path between nodes.
Path: a set of links connecting two people (not unique)
Shortest path: a path between two nodes with the shortest distance (not unique)
Diameter: the longest of the shortest paths, considering all pairs of nodes
Closeness centrality for a node:
1. Find the shortest path lengths to all other nodes
2. Take the average of these
3. Take the reciprocal of the average

Closeness centrality: $C(v) = \dfrac{n - 1}{\sum_{u \neq v} d(v, u)}$, where $d(v, u)$ is the shortest-path distance from $v$ to $u$ and $n$ is the number of nodes.

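[Figure: the example network labeled with paths starting from Ben (B): (B,A), (B,A,C,D,F), (B,A,C,E,F), (B,A,C,D,E,F), and (B,A,C,E,D,F)]
Diameter: 4 links (the shortest paths from B to F, (B,A,C,D,F) and (B,A,C,E,F), each pass through 5 nodes)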

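CLOSENESS CENTRALITY
Cara (C):
C-A = (C,A) = 1
C-B = (C,A,B) = 2
C-D = (C,D) = 1
C-E = (C,E) = 1
C-F = (C,D,F) = 2
Average distance for C: $(1 + 2 + 1 + 1 + 2)/5 = 7/5 = 1.4$
Closeness(C) $= 1/1.4 = 5/7 \approx 0.714$

CLOSENESS CENTRALITY
Dana (D):
D-A = (D,C,A) = 2
D-B = (D,C,A,B) = 3
D-C = (D,C) = 1
D-E = (D,E) = 1
D-F = (D,F) = 1
Average distance for D: $(2 + 3 + 1 + 1 + 1)/5 = 8/5 = 1.6$
Closeness(D) $= 1/1.6 = 5/8 = 0.625$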

[Figure: the example network annotated with each node's closeness value]

Node  Closeness value  Ranking
A     0.556            3rd
B     0.385            5th
C     0.714            1st
D     0.625            2nd
E     0.625            2nd
F     0.455            4th
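The closeness values in the table can be checked with a breadth-first search from each node. This is a minimal sketch under the same assumption as the worked distances: the edge list below is reconstructed from the paths shown in the slides, and closeness is taken as the reciprocal of the average shortest-path distance. The helper names are illustrative.

```python
from collections import deque

# Assumed edge list for the example network (A=Anna ... F=Frank).
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

def distances_from(graph, source):
    """Shortest-path length (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(graph, node):
    """Reciprocal of the average distance to all other nodes (connected graph)."""
    dist = distances_from(graph, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(graph) - 1) / total

scores = {node: round(closeness(graph, node), 3) for node in sorted(graph)}
print(scores)
```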

CLOSENESS CENTRALITY
Ranking: Cara, Dana/Evan, Anna, Frank, Ben
Clearly more reasonable than the degree centrality ranking
Cara and Anna are promoted
Dana and Evan do not hold the graph together
Why should Anna be less important?
Cara should be even more central than Dana and Evan
Cara is the most vital node for connectivity

Centrality measures: Closeness centrality
Closeness is a measure of the degree to which an individual is near all other individuals in a network.
It is the inverse of the sum of the shortest distances between a node and every other node in the network.
Closeness is the reciprocal of farness.

Centrality measures: Betweenness
Betweenness is a centrality measure of a vertex within a graph.
Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between two other nodes.
It was introduced by Linton Freeman as a measure for quantifying the control of a human on the communication between other humans in a social network.
A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that item transfer follows the shortest paths.

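Centrality measures: Betweenness
The betweenness of a vertex $v$ in a graph $G = (V, E)$ with vertices $V$ is computed as follows:
1. For each pair of vertices $(s, t)$, compute the shortest paths between them.
2. For each pair of vertices $(s, t)$, determine the fraction of shortest paths that pass through the vertex in question (here, vertex $v$).
3. Sum this fraction over all pairs of vertices $(s, t)$.

Betweenness of vertex $v$:
$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$ and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.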

Centrality measures: Betweenness
Cara (C):
For each pair, consider two questions:
1. How many shortest paths are there between the pair of people?
2. How many of these shortest paths contain Cara?
A to B = 0/1 = 0, A to D = 1/1 = 1, A to E = 1/1 = 1, A to F = 2/2 = 1
B to D = 1/1 = 1, B to E = 1/1 = 1, B to F = 2/2 = 1
D to E = 0/1 = 0, D to F = 0/1 = 0
E to F = 0/1 = 0
Betweenness(C) = 0 + 1 + 1 + 1 + 1 + 1 + 1 + 0 + 0 + 0 = 6

Centrality measures: Betweenness
Dana (D):
A to B = A to C = A to E = 0/1 = 0
A to F = 1/2 = 0.5
B to C = B to E = 0/1 = 0
B to F = 1/2 = 0.5
C to E = 0/1 = 0, C to F = 1/2 = 0.5
E to F = 0/1 = 0
Betweenness(D) = 0.5 + 0.5 + 0.5 = 1.5

[Figure: the example network annotated with each node's betweenness value]

Node  Betweenness value  Ranking
A     4                  2nd
B     0                  4th
C     6                  1st
D     1.5                3rd
E     1.5                3rd
F     0                  4th
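The pair-by-pair counting used for Cara and Dana can be automated. The sketch below is one straightforward way to do it, not an optimized algorithm: a breadth-first search from every node records distances and the number of shortest paths, and a vertex v lies on a shortest s-t path exactly when d(s,v) + d(v,t) = d(s,t). The edge list is an assumption reconstructed from the slides, and the helper names are illustrative.

```python
from collections import deque
from itertools import combinations

# Assumed edge list for the example network (A=Anna ... F=Frank).
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

def shortest_path_counts(graph, source):
    """Return (distance, number of shortest paths) from source to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                sigma[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                # Every shortest path to node extends to neighbor.
                sigma[neighbor] += sigma[node]
    return dist, sigma

# Precompute once per node; assumes the graph is connected.
info = {node: shortest_path_counts(graph, node) for node in graph}

def betweenness(graph, v):
    """Sum over pairs (s, t) of the fraction of shortest paths through v."""
    dist_v, sigma_v = info[v]
    total = 0.0
    for s, t in combinations(sorted(graph), 2):
        if v in (s, t):
            continue
        dist_s, sigma_s = info[s]
        if dist_s[v] + dist_v[t] == dist_s[t]:
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total

scores = {node: betweenness(graph, node) for node in sorted(graph)}
print(scores)
```

Each unordered pair is counted once, which matches the hand calculation for Cara and Dana.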

Centrality measures: Eigenvector Centrality
A natural extension of degree centrality is eigenvector centrality.
In-degree centrality awards one centrality point for every link a node receives.
But not all vertices are equivalent: some are more relevant than others, and, reasonably, endorsements from important nodes count more.
Eigenvector centrality differs from in-degree centrality: a node receiving many links does not necessarily have a high eigenvector centrality (it might be that all linkers have low or null eigenvector centrality).
Moreover, a node with high eigenvector centrality is not necessarily highly linked (the node might have few but important linkers).

Eigenvector Centrality
Having more friends does not by itself guarantee that someone is more important, but having more important friends provides a stronger signal.
Eigenvector centrality tries to generalize degree centrality by incorporating the importance of the neighbors (undirected).
For directed graphs, we can use incoming or outgoing edges.
$$C_e(v_i) = \frac{1}{\lambda} \sum_{j=1}^{n} A_{j,i}\, C_e(v_j)$$
$C_e(v_i)$: the eigenvector centrality of node $v_i$
$\lambda$: some fixed constant

Eigenvector centrality is a measure of the influence of a node in a network. It assigns relative scores to all the nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.
It is determined by performing a matrix calculation to find the principal eigenvector of the adjacency matrix.

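Using the adjacency matrix to find eigenvector centrality
For a given graph $G = (V, E)$ with $|V|$ vertices, let $A = (a_{v,t})$ be the adjacency matrix, i.e. $a_{v,t} = 1$ if there is a link between $v$ and $t$, and $a_{v,t} = 0$ otherwise.
The relative centrality score $x_v$ of vertex $v$ can be defined as:
$$x_v = \frac{1}{\lambda} \sum_{t \in M(v)} x_t = \frac{1}{\lambda} \sum_{t \in V} a_{v,t}\, x_t$$
where $M(v)$ is the set of neighbors of $v$ and $\lambda$ is a constant.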

Eigenvector Centrality, cont.
Let $C_e = (C_e(v_1), C_e(v_2), \ldots, C_e(v_n))^T$. Then
$$\lambda C_e = A^T C_e$$
This means that $C_e$ is an eigenvector of the adjacency matrix $A^T$ (equal to $A$ for an undirected graph) and $\lambda$ is the corresponding eigenvalue.
Which eigenvalue-eigenvector pair should we choose?

Consider the graph below and its 5x5 adjacency matrix, A.
[Figure: five-vertex example graph and its adjacency matrix A]
And then consider x, a 5x1 vector of values, one for each vertex in the graph. In this case, we've used the degree centrality of each vertex.

Now let's look at what happens when we multiply the vector x by the matrix A. The result, of course, is another 5x1 vector.
If we look closely at the first element of the resulting vector, we see that the 1s in the A matrix "pick up" the values of each vertex to which the first vertex is connected (in this case, the second, third, and fourth), and the resulting value is the sum of the values each of these vertices had.
In other words, what multiplication by the adjacency matrix does is reassign each vertex the sum of the values of its neighbor vertices.

This has, in effect, "spread out" the degree centrality. That this is
moving in the direction of a reasonable metric for centrality can be
seen better if we rearrange the graph a little bit:

Suppose we multiplied the resulting vector by A again.
We would be allowing this centrality value to once again "spread" across the edges of the graph.
The spread is in both directions (vertices both give to and get from their neighbors).
The process might eventually reach an equilibrium, when the amount coming into a given vertex would be in balance with the amount going out to its neighbors.
The numbers would keep getting bigger, but we could reach a point where the share of the total at each node would remain stable.
At that point we might imagine that all of the "centrality-ness" of the graph had equilibrated and the value of each node completely captured the centrality of all of its neighbors, all the way out to the edges of the graph.
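Repeatedly multiplying by A and rescaling until each node's share stops changing is known as power iteration. The sketch below illustrates it in plain Python. Since the five-vertex graph from the slides is not recoverable here, it uses the six-actor network from the earlier degree and closeness examples (an assumed edge list); the function name, tolerance, and iteration cap are illustrative choices.

```python
import math

# Assumed edge list for the six-actor example (A=Anna ... F=Frank).
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

def normalize(values):
    """Rescale a node -> value mapping to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in values.values()))
    return {node: x / norm for node, x in values.items()}, norm

def eigenvector_centrality(graph, max_iter=1000, tol=1e-10):
    # Start from degree centrality, as in the walk-through above.
    x, _ = normalize({node: float(len(graph[node])) for node in graph})
    eigenvalue = 0.0
    for _ in range(max_iter):
        # Multiply by A: each node receives the sum of its neighbors' values.
        spread = {node: sum(x[nb] for nb in graph[node]) for node in graph}
        # Rescale so the numbers do not grow; the scale factor estimates
        # the largest eigenvalue once the shares have stabilized.
        new_x, eigenvalue = normalize(spread)
        change = max(abs(new_x[node] - x[node]) for node in graph)
        x = new_x
        if change < tol:
            break
    return x, eigenvalue

scores, eigenvalue = eigenvector_centrality(graph)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
print("largest eigenvalue ~", round(eigenvalue, 3))
```

The rescaling step is what keeps "the share of the total at each node" comparable between iterations even though the raw sums keep growing.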

Eigenvector Centrality: Example
Eigenvalues: $\lambda = (2.68, -1.74, -1.27, 0.33, 0.00)$
Largest eigenvalue: $\lambda_{max} = 2.68$

Centrality measure: PageRank
PageRank is a variant of eigenvector centrality.
A potential problem with Katz centrality is the following: if a node with high centrality links to many others, then all those others get high centrality. In many cases, however, it means less if a node is only one among many to be linked.
Like eigenvector centrality, PageRank can help uncover influential or important nodes whose reach extends beyond just their direct connections.

Centrality measure: PageRank
PageRank is an adjustment of Katz centrality that takes this issue into consideration. Three distinct factors determine the PageRank of a node:
(i) the number of links it receives,
(ii) the link propensity of the linkers, and
(iii) the centrality of the linkers.
PageRank computes a ranking of the nodes in the graph G based on the structure of the incoming links.
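The three factors can be seen in a small iterative implementation. This is a minimal sketch of the commonly used damped formulation, not Google's production algorithm: the damping factor of 0.85, the iteration count, the function name, and the four-page link graph are all assumptions made for illustration, and the code assumes every node has at least one outgoing link.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Iterative PageRank for a directed graph.

    out_links maps each node to the list of nodes it links to. Every node
    must appear as a key and have at least one outgoing link.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Baseline share that every node receives regardless of links.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in out_links.items():
            # (iii) the linker's own rank, divided by (ii) how many links
            # it gives out, is passed along each outgoing link.
            share = damping * rank[node] / len(targets)
            for target in targets:
                # (i) a node accumulates one share per link it receives.
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: C is linked to by A, B, and D.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 4))
```

In this toy graph D receives no links, so it keeps only the baseline share, while C collects rank from three linkers.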

Centrality measure: PageRank
PageRank is an algorithm used by Google Search to rank websites in its search engine results.
"PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites."

Centrality measure: PageRank
The main difference from eigenvector centrality is that PageRank takes direction and weight into account.
Example applications:
Understanding citations (e.g. patent citations, academic citations)
Visualizing network activity / propagation of malware
Modeling the impact of SEO and link building activity (although PageRank is now just one of many ranking algorithms used by Google)

Centrality measure: PageRank
PageRank is a link analysis algorithm that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set.
