There is a long tradition of measuring the qualities of network locations from both egocentric and global perspectives. Most of this quantification work has been carried out in mathematical sociology under the theme of social network analysis (SNA) (Wasserman and Faust, 1994; Knoke and Yang, 2007; Golbeck, 2013; Borgatti et al., 2013). Several popular software toolkits, including UCINET and NodeXL, perform analysis and visualization of social networks (i.e., sociograms). Tom Snijders’ SIENA is a program for the statistical analysis of network data. Traces (Suthers, 2011) is an NSF-sponsored visualization project that traces out the movements, confluences, and transformations of people and ideas in online social networks.
The aim of this chapter is to review a selective subset of SNA measures that complement the algorithmic descriptions in the remainder of this book. For a glossary of SNA terms, readers may consult Golbeck (2013).
We will start with egocentric (i.e., node view) measures. A degree-1 network of a node is the node and its immediate neighbor nodes. A degree-1.5 network of a node is the node’s degree-1 network plus the links among its immediate neighbors (Golbeck, 2013). A degree-2 network of a node is the node’s degree-1 network plus all of its immediate neighbors’ connections (Golbeck, 2013). A degree-n network of a node is the degree-1 network of the node plus all the nodes and the corresponding links that are no more than n links away from the starting node.
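The egocentric networks defined above can be extracted with a breadth-first search. The following is a minimal sketch rather than a definitive implementation. It assumes an undirected graph stored as a dictionary of neighbor sets, and the function name `ego_network` and the flag `alter_links` are hypothetical names introduced only for this illustration.

```python
from collections import deque

def ego_network(adj, ego, n=1, alter_links=False):
    """Return (nodes, links) of the degree-n network around `ego`.

    adj maps each node to the set of its neighbors (undirected graph).
    With n=1 and alter_links=True the result is the degree-1.5 network.
    """
    # Breadth-first search outward from the ego, stopping after n links.
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        u = queue.popleft()
        if dist[u] == n:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    links = set()
    for u in dist:
        for v in adj[u]:
            if v not in dist:
                continue
            # Keep a link when one end is closer than n links to the ego,
            # or, if requested, when it joins two nodes on the outermost ring.
            if min(dist[u], dist[v]) < n or alter_links:
                links.add(tuple(sorted((u, v))))
    return set(dist), links

# Hypothetical toy graph used only for illustration.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
print(ego_network(graph, "a", 1))
print(ego_network(graph, "a", 1, alter_links=True))
print(ego_network(graph, "a", 2))
```

Under this reading, the degree-2 network of a node keeps the links among its immediate neighbors as well, because those links are also connections of the immediate neighbors.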
A path is a chain (i.e., succession) of nodes connected by links between pairs of nodes. Two nodes are connected if and only if (i.e., iff) there is a path between them. A connected component is a set of nodes in which every pair of nodes is joined by a path. A bridge is a link that connects two components that would otherwise be disconnected from each other. A hub is a node with many connections. Reachability is whether or not two nodes are connected by a direct or an indirect path of any length.
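These definitions translate directly into a few graph searches. The sketch below is one possible illustration, assuming an undirected graph stored as a dictionary of neighbor sets with sortable node labels. All function names are hypothetical, and the bridge test simply removes each link in turn, which is easy to read but not efficient on large networks.

```python
from collections import deque

def component_of(adj, start):
    """Set of nodes reachable from `start` (its connected component)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def connected_components(adj):
    """List of connected components, each as a set of nodes."""
    remaining = set(adj)
    components = []
    while remaining:
        comp = component_of(adj, next(iter(remaining)))
        components.append(comp)
        remaining -= comp
    return components

def is_reachable(adj, u, v):
    """True when a direct or indirect path joins u and v."""
    return v in component_of(adj, u)

def bridges(adj):
    """Links whose removal would separate their two end nodes."""
    found = []
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected link once
                pruned = {w: set(nb) for w, nb in adj.items()}
                pruned[u].discard(v)
                pruned[v].discard(u)
                if v not in component_of(pruned, u):
                    found.append((u, v))
    return sorted(found)

# Hypothetical toy graph: a triangle a-b-c with a tail c-d, plus a separate pair e-f.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c"}, "e": {"f"}, "f": {"e"},
}
print(len(connected_components(graph)))
print(is_reachable(graph, "a", "d"), is_reachable(graph, "a", "e"))
print(bridges(graph))
```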
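Geodesic distance, denoted by $distance_{ij}$, is the number of links in the shortest possible path from node i to node j. The diameter of a network is the largest geodesic distance in the connected network. Reverse distance, denoted by $RD_{ij}$, is $(1 + Diameter) - distance_{ij}$, so that nearer nodes receive larger values. The metrics in Equations 2.1 and 2.2 are adapted from Valente and Foreman (1998). Here $I(k)$ is the integration of node k, computed from distances toward k, $R(k)$ is its radiality, computed from distances outward from k, and n is the number of nodes:
$$I(k) = \frac{\sum_{j \neq k} RD_{jk}}{n - 1} \tag{2.1}$$
$$R(k) = \frac{\sum_{j \neq k} RD_{kj}}{n - 1} \tag{2.2}$$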
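Equations 2.1 and 2.2 can be sketched in code as follows. This is a hedged illustration, not the authors' implementation. It assumes the reverse distance is taken as (1 + diameter) minus the geodesic distance, as in Valente and Foreman's integration and radiality measures, and it assumes a directed network in which every node can reach every other node. The graph representation and all function names are hypothetical.

```python
from collections import deque

def geodesic_distances(succ):
    """All-pairs shortest path lengths; succ maps node -> set of successors."""
    dist = {}
    for s in succ:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[s] = d
    return dist

def diameter(dist):
    """Largest geodesic distance found among the reachable pairs."""
    return max(d for row in dist.values() for d in row.values())

def integration_and_radiality(succ):
    """Return two dicts, (integration, radiality), keyed by node.

    Assumes every node can reach every other node, so each dist[i][j] exists.
    """
    dist = geodesic_distances(succ)
    diam = diameter(dist)
    n = len(succ)

    def rd(i, j):
        # Reverse distance: near pairs score high, far pairs score low.
        return 1 + diam - dist[i][j]

    integration = {k: sum(rd(j, k) for j in succ if j != k) / (n - 1) for k in succ}
    radiality = {k: sum(rd(k, j) for j in succ if j != k) / (n - 1) for k in succ}
    return integration, radiality

# Hypothetical directed toy network.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"a"}}
integration, radiality = integration_and_radiality(graph)
print(integration)
print(radiality)
```

In an undirected network the two measures coincide, because the distance from j to k equals the distance from k to j.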
Structural centrality measures of a node are a family of measures reflecting the structural properties of the links surrounding a focal node. For example, degree centrality of a node is the number of edges incident on the node. Closeness centrality of a node is the average of the shortest path lengths from the node to all other nodes in the network. It is a rather small number in small-world networks (Watts and Strogatz, 1998).
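Degree and closeness centrality, as defined above, can be sketched as follows. The code assumes a connected, undirected graph with at least two nodes, stored as a dictionary of neighbor sets, and the function names are hypothetical. Note that this sketch follows the text in reporting the average shortest path length itself. Many toolkits report the reciprocal of that average instead, so that larger values mean more central nodes.

```python
from collections import deque

def degree_centrality(adj, node):
    """Number of edges incident on the node."""
    return len(adj[node])

def closeness(adj, node):
    """Average shortest path length from `node` to every other reachable node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for v, d in dist.items() if v != node]
    return sum(others) / len(others)

# Hypothetical toy graph: the path a - b - c.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(degree_centrality(graph, "b"), closeness(graph, "b"))
print(degree_centrality(graph, "a"), closeness(graph, "a"))
```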
Betweenness centrality of a node is a measure of the node’s importance (and possibly influence, as discussed in Chapter 7) and is computed using the algorithm shown in Figure 2.1.
Fig. 2.1 Betweenness value computation.
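As an illustration of how a betweenness value can be computed, the sketch below counts, for each node, the share of shortest paths between other pairs of nodes that pass through it. It uses Brandes' accumulation scheme, which is a swapped-in technique chosen for this example and may differ from the algorithm of Figure 2.1. It assumes an unweighted, undirected graph stored as a dictionary of neighbor sets, and the function name is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness of every node in an undirected, unweighted graph."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every pair was counted from both of its end nodes, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy graph: a star with center "hub" and three leaves.
graph = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
print(betweenness(graph))
```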
Eigenvector centrality measures the centrality of a node in terms of the centrality of its neighbor nodes and has been used as a measure of influence and power, which are discussed later in this book (Bonacich and Lu, 2012). Bonacich developed a beta centrality measure $C_{BC}$ with a parameter $\alpha$ for adjusting the importance of a node’s degree and a parameter $\beta$ for adjusting the importance of the neighbors’ centrality. This is shown in Equation 2.3, where $N(i)$ is the set of neighbors of node i and $\deg(i) = |N(i)|$ is its degree:
$$C_{BC}(i) = \sum_{j \in N(i)} \left( \alpha + [\beta \times C_{BC}(j)] \right) = \alpha \deg(i) + \beta \times \sum_{j \in N(i)} C_{BC}(j) \tag{2.3}$$
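Because Equation 2.3 defines each node's score in terms of its neighbors' scores, one way to evaluate it is by repeated substitution. The sketch below reads the equation as the degree of node i scaled by the first parameter, plus the second parameter times the sum of the neighbors' scores. It assumes an undirected graph stored as a dictionary of neighbor sets. The function name, the default parameter values, and the stopping rule are hypothetical choices, and the iteration only settles when the second parameter is small enough (below the reciprocal of the largest eigenvalue of the adjacency matrix).

```python
def beta_centrality(adj, alpha=1.0, beta=0.1, max_iter=1000, tol=1e-12):
    """Fixed-point iteration for C(i) = alpha*deg(i) + beta*sum of neighbors' C."""
    c = dict.fromkeys(adj, 0.0)
    for _ in range(max_iter):
        nxt = {
            i: alpha * len(adj[i]) + beta * sum(c[j] for j in adj[i])
            for i in adj
        }
        # Stop once no score moves by more than the tolerance.
        if max(abs(nxt[i] - c[i]) for i in adj) < tol:
            return nxt
        c = nxt
    raise ValueError("no convergence; beta is probably too large for this graph")

# Hypothetical toy graph: a triangle, where symmetry gives C = 2*alpha / (1 - 2*beta).
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
print(beta_centrality(graph, alpha=1.0, beta=0.25))
```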
Eigenvector centrality of a node at time t is computed with Equation 2.4, where $\vec{C}(t)$ is the vector of node centralities at time t, $\vec{C}(0)$ is the initial centrality vector, $A$ is the adjacency matrix, and $A^t$ is the result of t iterated multiplications of $A$:
$$\vec{C}(t) = A^t \, \vec{C}(0) \tag{2.4}$$
As $t$ approaches $\infty$, the dominant eigenvalue $\lambda_1$ will determine the centrality vector, which takes the value $\lambda_1^t \times \vec{V}_1$ up to a constant factor, where $\vec{V}_1$ is the eigenvector corresponding to the dominant eigenvalue $\lambda_1$ (Chiang, 2012).
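The iterated multiplication in Equation 2.4 is the idea behind power iteration, sketched below. Two details are assumptions added for this illustration. First, the vector is rescaled to unit length after every step so that it does not grow without bound. Second, each node also keeps its own previous score (a multiplication by the adjacency matrix plus the identity), which leaves the eigenvectors unchanged but avoids oscillation on bipartite graphs. The graph is assumed to be connected and undirected, and the function name is hypothetical.

```python
import math

def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Approximate the dominant eigenvector of the adjacency matrix."""
    nodes = list(adj)
    x = {v: 1.0 / math.sqrt(len(nodes)) for v in nodes}
    for _ in range(max_iter):
        # One step: every node adds its neighbors' scores to its own.
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = math.sqrt(sum(s * s for s in nxt.values()))
        nxt = {v: s / norm for v, s in nxt.items()}
        if max(abs(nxt[v] - x[v]) for v in nodes) < tol:
            return nxt
        x = nxt
    return x

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
scores = eigenvector_centrality(graph)
print(sorted(scores, key=scores.get, reverse=True))
```

In the toy graph, node c should rank first because it touches every other node, and the pendant node d should rank last.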
Let us consider a degree-1.5 network of a node and measure the ratio of the actual number of links in that network to the total number of possible links that could exist. This ratio is a measure called the local clustering coefficient (Golbeck, 2013).
Density of a network is the ratio of the actual number of links in that network to the total number of possible links that could exist. Cohesion is the minimum number of edges that have to be removed before the network is disconnected.
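Density and the local clustering coefficient are both ratios of actual links to possible links, so one helper can serve both. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets, and the function names are hypothetical. For the local clustering coefficient it follows the common convention of leaving the focal node out and scoring only the links among its neighbors. That convention is an assumption of this sketch, not something the text specifies.

```python
def density(adj, nodes=None):
    """Actual links divided by possible links among `nodes` (default: all nodes)."""
    nodes = set(adj) if nodes is None else set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each undirected link is seen from both of its ends, so halve the count.
    links = sum(1 for u in nodes for v in adj[u] if v in nodes) / 2
    return links / (n * (n - 1) / 2)

def local_clustering(adj, node):
    """Density of the links among the neighbors of `node`."""
    return density(adj, adj[node])

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(density(graph))
print(local_clustering(graph, "a"), local_clustering(graph, "c"))
```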
Let us consider a cluster, that is, a subset of nodes s. Each node in s is assigned a ratio r, which is the number of its neighbors that are in s divided by the total number of its neighbors. The minimum r value over the nodes in s, $r_{min}$, is called the density of the cluster (used in Chapter 7).
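The density of a cluster can be sketched as below, reading the ratio r of a node as the fraction of its neighbors that lie inside the cluster s. The code assumes an undirected graph stored as a dictionary of neighbor sets and a non-empty cluster. The function name is hypothetical, and giving a node with no neighbors a ratio of 0 is a convention chosen only for this illustration.

```python
def cluster_density(adj, s):
    """Smallest in-cluster neighbor fraction over the nodes of cluster s."""
    s = set(s)
    ratios = []
    for v in s:
        if adj[v]:
            # Fraction of v's neighbors that are also members of s.
            ratios.append(len(adj[v] & s) / len(adj[v]))
        else:
            ratios.append(0.0)  # isolated node: assumed convention
    return min(ratios)

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(cluster_density(graph, {"a", "b", "c"}))
```

In the toy graph, nodes a and b have all of their neighbors inside the cluster, while c has two of its three neighbors inside, so c sets the minimum.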
Whereas centrality is a microlevel measure, centralization is a macrolevel measure, which measures the variance in the distribution of centrality in a network. We show the most generic form of centralization in Figure 2.2.
Fig. 2.2 Centralization algorithm.
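For degree centrality, centralization takes the form of Equation 2.5, where $d_i$ is the degree of node i, $d_{max}$ is the largest degree in the network, and n is the number of nodes:
$$\text{Centralization} = \frac{\sum_{i=1}^{n} (d_{max} - d_i)}{(n - 2) \times (n - 1)} \tag{2.5}$$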
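Equation 2.5 can be evaluated directly from the node degrees. The sketch below reads the equation as the sum, over all nodes, of the gap between the largest degree and the node's own degree, divided by (n - 2) times (n - 1). It assumes an undirected graph stored as a dictionary of neighbor sets, and the function name is hypothetical.

```python
def degree_centralization(adj):
    """Sum of (d_max - d_i) over all nodes, divided by (n - 2) * (n - 1)."""
    n = len(adj)
    if n < 3:
        raise ValueError("the formula needs at least three nodes")
    degrees = [len(neighbors) for neighbors in adj.values()]
    d_max = max(degrees)
    return sum(d_max - d for d in degrees) / ((n - 2) * (n - 1))

# Hypothetical toy graphs: a star is maximally centralized, a cycle is not centralized at all.
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
print(degree_centralization(star), degree_centralization(cycle))
```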
$$\text{Transitivity} = \frac{6 \times \text{number of triangles}}{\text{number of paths of length two}} \tag{2.6}$$
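Equation 2.6 can be read as six times the number of triangles divided by the number of paths of length two. Under that reading, which is an assumption of this sketch, each path is counted once in each direction, so a single triangle contributes six such paths and the ratio equals 1 when every path of length two is closed. The code assumes an undirected graph stored as a dictionary of neighbor sets, and the function name is hypothetical.

```python
from itertools import combinations

def triangle_path_ratio(adj):
    """6 * (number of triangles) / (number of directed paths of length two)."""
    closed = 0
    for v in adj:
        # Count the pairs of v's neighbors that are themselves linked.
        for u, w in combinations(adj[v], 2):
            if w in adj[u]:
                closed += 1
    triangles = closed // 3  # each triangle was found once from each corner
    # A node of degree d is the middle of d * (d - 1) directed two-step paths.
    paths = sum(len(nb) * (len(nb) - 1) for nb in adj.values())
    return 6 * triangles / paths if paths else 0.0

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(triangle_path_ratio(graph))
```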
Diversity is a measure of the number of edges in a graph that are disjoint. End vertices of such edges are not adjacent (i.e., disjoint dipoles). Diversity is shown in Equation 2.7:
$$\text{Diversity} = \frac{\text{number of disjoint dipoles}}{\left[ (n/4) \times ((n/2) - 1) \right]^2} \tag{2.7}$$
Burt’s structural holes measure gaps among connected components and, as such, are another measure of diversity (Burt, 1995).
2.1. Conclusions and future work
Network analysis focuses on the quantification (and statistical analysis) of the qualities of nodes’ relative locations as well as of entire network properties. SNA has long been a staple tool of mathematical sociology (Borgatti et al., 2013). An active direction of interest has been intelligence analysis of human networks to understand, predict, and mitigate threats for law enforcement, as well as to understand geopolitical landscapes. The recent debate over surveillance and monitoring of electronic communication metadata by the National Security Agency (NSA) is indicative of this fervent interest.
A second direction of interest is marketing and branding on social media. The interest is to understand human propensity for influence from network connections. Marketers use these propensities to craft viral dissemination of consumption patterns and manipulation of economic activities. The documentary filmmaker, Morgan Spurlock, has publicly explored branding on social media. His mission is to...