5Network.01.Intro
AGENDA
I. Intro
II. History
III. Graph Theory
IV. Centrality
V. Statistical characterisation
VI. Random vs Scale Free
VII. Modern Networks
VIII. Barabasi Model
What is a network?
A network or graph is a set of vertices (nodes) joined by edges
abstract representation
very general
convenient to describe many different systems
http://www.technologyreview.com/Energy/12474/page2/
Sexual / Romantic partners network
Bearman, Moody, Stovel. Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks. AJS, 2004 Jefferson High, Columbus, Ohio
Metabolic network of E. Coli
Organisation chart
Some examples
Network    Vertices   Edges
Internet   Router     Cable
Internet   AS         Commercial agreements
Interdisciplinary science
Science of complex networks
-graph theory
-sociology
-communication science
-biology
-physics
-computer science
Interdisciplinary science
❖ Empirics
❖ Characterization
❖ Modeling
❖ Dynamical processes
History
Network Analysis
❖ Multidisciplinary
❖ social networks
❖ political networks
❖ electrical networks
❖ transportation networks
❖ biological networks
Network Analysis
❖ In this strategy, you can use your knowledge about the role and meaning of nodes and relations.
❖ Use strategic thinking. This kind of knowledge is commonly used without making a network analysis, but then people often rely only on intuition.
❖ Network analysis makes it possible to work more systematically.
❖ Examine which actors (groups, organizations, institutions and persons) play, or might play, a role in a given context.
Network Analysis
❖ By systematically analysing a network, you can (e.g.)
- identify collaborative/antagonist schemas
- obtain a more efficient flow of information
- optimize a strategy for attaining your goal
Network Analysis
❖ set the goal
❖ identify central actors and brainstorm about them
❖ selection of actors
❖ network mapping: significance, metrics, topology
❖ analysis
❖ identify and define: problems, goals, constraints
❖ build a strategy
Social Network Analysis
❖ Theodore Newcomb found that as Bennington College women were exposed to the relatively liberal referent group of fellow students and faculty, they became more liberal.
❖ “Becoming radical meant thinking for myself and, figuratively, thumbing my nose at my family. It also meant intellectual identification with the faculty and students that I most wanted to be like” (Newcomb, 1943, pp. 134, 131).
Bennington College Study
❖ Two follow-up studies indicated that the change was largely permanent: the women remained relatively liberal, likely in part because they picked new referent groups (spouses, friends, co-workers) that reinforced those attitudes.
❖ We often choose reference groups that reinforce our attitudes, but our attitudes are also changed by our reference groups.
Reminders
❖ A graph or system or network is a set of units that may be (but are not
necessarily) connected to each other.
❖ Degree: the degree $k_i$ of a vertex (node) i is the number of other nodes in its neighborhood.
❖ In a directed graph or network, the edges are not necessarily reciprocal. A may be connected to B, but B may not be connected to A (think of a graph with arrows indicating the direction of the edges).
Random Graphs
❖ Random graph: each pair of vertices i, j has a connecting edge with an independent probability p.
❖ This graph has 16 nodes, 120 possible connections, and 19 actual connections: 19/120 ≈ 0.16, so roughly a 1/6 probability that any two nodes will be connected to each other.
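The random graph described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `random_graph` and the `seed` parameter are made up here, and nodes are simply numbered 0 to n-1.

```python
import itertools
import random


def random_graph(n, p, seed=None):
    """Return the edge list of a random graph on nodes 0..n-1.

    Each of the n(n-1)/2 possible pairs gets an edge independently
    with probability p.
    """
    rng = random.Random(seed)
    return [
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if rng.random() < p
    ]


n = 16
possible = n * (n - 1) // 2          # 120 possible pairs for 16 nodes
edges = random_graph(n, 19 / 120, seed=0)
print(possible, len(edges))          # edge count varies around 19
```

With p = 19/120 the number of edges fluctuates from run to run; 19 is only the expected value.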
Regular Graphs
❖ A regular graph is a network where each node has the same number (k) of neighbors (that is, each node or vertex has degree k).
❖ A k-regular graph is seen at the left, with k = 3 (each node is connected to three other nodes).
❖ This is an undirected matrix: if A serves with B on a committee, then B serves with A on a committee.
    A  B  C  D
A   -  1  0  1
B   1  -  1  0
C   0  1  -  0
D   1  0  0  -
Graph Diameter
Small World
“diameter”
❖ In Milgram’s 1967 “small world experiment”, individuals were asked to reach a particular target individual by passing a message along a chain of acquaintances.
❖ For successful chains, the average number of intermediaries needed was 5 (that is, 6 steps).
❖ Note: most chains were not completed.
Small Worlds
Network Structure
❖ Commodity chains
❖ Trade systems, transport and communication
❖ Business networks
❖ City systems
❖ Interstate power
Commodity chains
City systems
❖ Treaty/alliance networks
❖ Exchange of recognition
❖ Co-membership in supra-national organizations
Structural Properties
Same scaling method as Smith and White 1992 that showed a virtually
linear core-periphery structure in the contemporary world-trade system
Dynamics
❖ How and why do world systems expand, contract, and change structure?
❖ Homophily
❖ Exchange
❖ Power-laws (degree preference)
❖ Cohesion and shortcuts
Dynamics: Homophily
Dynamics
Power law
[Figure: networks with fitted power-law exponents alpha. Labels recovered from the figure: Greek Gods (H&J Newman, B. Walters), alpha=3.…, no real organization; Biotech (Powell, White, Koput, Owen-Smith), alpha=2.0, cohesive organization; Proteome yeast (Amaral), alpha=2.…, hierarchical organization; pure 'scale free'.]
Personal Network
Social network
Personal Networks
• Unique
• Like snowflakes, no two personal networks are exactly alike
• Social contexts may share attributes, but the combinations of attributes are each different
• The differences across respondents influence attitudes, behaviors and conditions
Personal Networks
Many variables of interest to social scientists are thought to be influenced by social context
DATA
• Composition: metrics to summarize the attributes of alters in a network.
– Average age of alters.
– Proportion of alters who are women.
– Proportion of alters that provide emotional support.
• Structure: metrics to summarize structure.
– Number of components.
– Betweenness centralization.
– Subgroups.
• Composition and Structure: Variables that capture both.
– E-I index
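As a rough illustration of these metrics, the sketch below computes two composition measures and an E-I index for one respondent. The alter records and the field names (`age`, `woman`, `group`) are invented for the example. The E-I index is taken here as (E - I) / (E + I), where E counts alters outside the respondent's own group and I counts alters inside it; that definition is an assumption, since the slide does not spell it out.

```python
# Hypothetical alters of one respondent (ego); all values are made up.
alters = [
    {"age": 30, "woman": True, "group": "host"},
    {"age": 40, "woman": False, "group": "origin"},
    {"age": 50, "woman": True, "group": "origin"},
    {"age": 20, "woman": True, "group": "origin"},
]


def composition(alters):
    """Summarise attributes of the alters (composition metrics)."""
    n = len(alters)
    return {
        "mean_age": sum(a["age"] for a in alters) / n,
        "share_women": sum(a["woman"] for a in alters) / n,
    }


def ei_index(ego_group, alters):
    """(E - I) / (E + I): +1 means all ties external, -1 all internal."""
    external = sum(a["group"] != ego_group for a in alters)
    internal = len(alters) - external
    return (external - internal) / (external + internal)


summary = composition(alters)
ei = ei_index("origin", alters)
print(summary, ei)
```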
DATA: composition
Percent of alters from host country
DATA: structure
Core/periphery
Number of components
Number of isolates
Components
[Figure: two example networks, one with 1 component and one with 10 components]
VISUAL INSPECTION
REFERENCES
Graph Theory
Origin: Leonhard Euler (1736)
Undirected edge (i, j): bidirectional communication/interaction
Directed edge: from i to j
Maximum number of edges
❖ Undirected: n(n-1)/2
❖ Directed: n(n-1)
Adjacency matrix
n vertices, i = 1, …, n
$a_{ij} = 1$ if $(i,j) \in E$, $a_{ij} = 0$ if $(i,j) \notin E$
    0  1  2  3
0   0  1  1  1
1   1  0  1  1
2   1  1  0  1
3   1  1  1  0
Symmetric for undirected networks
Adjacency matrix
n vertices, i = 1, …, n
$a_{ij} = 1$ if $(i,j) \in E$, $a_{ij} = 0$ if $(i,j) \notin E$
    0  1  2  3
0   0  1  0  1
1   0  0  0  0
2   0  1  0  0
3   0  1  1  0
Non-symmetric for directed networks
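The two adjacency matrices can be rebuilt from edge lists, which makes the symmetry property easy to check. The helper names `adjacency_matrix` and `is_symmetric` are illustrative, and the edge lists are read off the two matrices on the slides.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix with a[i][j] = 1 when (i, j) is an edge."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1  # an undirected edge is stored in both directions
    return a


def is_symmetric(a):
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


# Undirected example: every pair of the four vertices is linked.
undirected = adjacency_matrix(
    4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
)
# Directed example: arcs read row by row from the second matrix.
directed = adjacency_matrix(
    4, [(0, 1), (0, 3), (2, 1), (3, 1), (3, 2)], directed=True
)

print(is_symmetric(undirected), is_symmetric(directed))
```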
Sparse graphs
Density of a graph: $D = \frac{\text{number of edges}}{\text{maximal number of edges}}$
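A small sketch of the density formula, assuming a simple graph without self-loops so that the maximal number of edges is n(n-1)/2 for undirected and n(n-1) for directed graphs. The function name `density` is illustrative.

```python
def density(n, m, directed=False):
    """Density D = (number of edges) / (maximal number of edges)."""
    if n < 2:
        raise ValueError("density needs at least two vertices")
    max_edges = n * (n - 1) if directed else n * (n - 1) // 2
    return m / max_edges


# 16 nodes with 19 undirected edges: 19 out of 120 possible pairs.
print(density(16, 19))
```

A graph is usually called sparse when D is much smaller than 1.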
Paths
[Figure: a path through vertices i0, i1, i2, i3, i4, i5 in G=(V,E)]
Cycle/loop = closed path (i0 = il) with all other vertices and edges distinct
G=(V,E) is connected if and only if there exists a path connecting any two vertices in G
[Figure: one example graph is connected; the other is not connected, is formed by two components, and is a forest]
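Connectedness can be checked by collecting components with a breadth-first search. The sketch below assumes an undirected simple graph (no repeated edges, no self-loops) on nodes 0..n-1; the function names are illustrative. The forest test uses the fact that such a graph has no cycles exactly when its number of edges equals the number of vertices minus the number of components.

```python
from collections import deque


def connected_components(n, edges):
    """Return the components of an undirected graph as sorted node lists."""
    neighbours = {v: set() for v in range(n)}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components


def is_connected(n, edges):
    return len(connected_components(n, edges)) == 1


def is_forest(n, edges):
    # Acyclic if and only if |E| = |V| - (number of components).
    return len(edges) == n - len(connected_components(n, edges))


forest_edges = [(0, 1), (1, 2), (3, 4)]
print(connected_components(5, forest_edges))
```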
Trees
A tree is a connected graph without loops/cycles
Existence of a giant component
Disconnected components
Shortest paths
Shortest path between i and j: a path with the minimum number of traversed edges
Distance l(i,j) = minimum number of edges traversed on a path between i and j
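For unweighted graphs, the distance l(i,j) can be computed with a breadth-first search. The sketch below is one possible way to do it; the adjacency dictionary `graph` is a made-up example, and `distance` returns `None` when no path exists.

```python
from collections import deque


def distance(neighbours, i, j):
    """Minimum number of edges on a path from i to j, or None if unreachable."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if v == j:
            return dist[v]
        for w in neighbours.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


# Hypothetical graph: a ring 0-1-2-3-4-0 plus an isolated node 5.
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [0, 3], 5: []}
print(distance(graph, 0, 3))
```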
Motifs
Paths
Stars
Cycles
Complete
Weighted networks
Edge (i, j) carries a weight $w_{ij}$
$w_{ij}$: continuous variable
Weights: examples
Scientific collaborations: number of common papers
Internet, emails: traffic, number of exchanged emails
Airports: number of passengers
Metabolic networks: fluxes
usually $w_{ii} = 0$
symmetric: $w_{ij} = w_{ji}$
Clustering
Node with 4 neighbours (N)
6 possible connections among the neighbours (N×(N-1)/2)
2 connections among the neighbours
Clustering coefficient of this node: 2/6 ≈ 0.33
Clustering coefficient: the average proportion of neighbours of a vertex that are themselves neighbours
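The worked numbers above (4 neighbours, 6 possible connections, 2 actual ones) can be reproduced with a short sketch. The graph `g` is a made-up example chosen to match those counts for node 0, and the function names are illustrative. Nodes with fewer than two neighbours are given a coefficient of 0 here, which is a convention assumed for the sketch.

```python
from itertools import combinations


def local_clustering(neighbours, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = neighbours[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
    return links / (k * (k - 1) / 2)


def average_clustering(neighbours):
    return sum(local_clustering(neighbours, v) for v in neighbours) / len(neighbours)


# Node 0 has neighbours 1, 2, 3, 4; only the pairs (1, 2) and (3, 4) are linked.
g = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
print(local_clustering(g, 0), average_clustering(g))
```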
❖ Degree centrality refers to the number of ties a node has to other nodes. Actors who have more ties may have multiple alternative ways and resources to reach goals, and thus be relatively advantaged.
❖ In directed networks there are two kinds of degree centrality. One is in-degree centrality: an actor who receives many ties is characterised as prominent. The basic idea is that many actors seek to direct ties to them, so this may be regarded as a measure of importance.
❖ The other is out-degree centrality. Actors who have high out-degree centrality may be relatively able to exchange with others, or disperse information quickly to many others. (Recall the strength of weak ties argument.) So actors with high out-degree centrality are often characterized as influential.
❖ Consider the network. Which nodes (actors) are more “central” than others?
❖ 2, 5, and 7 appear relatively “central”.
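In-degree and out-degree can be counted directly from a list of directed ties. The sketch below uses a small hypothetical arc list, not the network shown in the figure, and the function name `degree_centrality` is illustrative.

```python
def degree_centrality(n, arcs):
    """Return (in_degree, out_degree) lists for nodes 0..n-1."""
    in_degree = [0] * n
    out_degree = [0] * n
    for source, target in arcs:
        out_degree[source] += 1   # tie sent by source
        in_degree[target] += 1    # tie received by target
    return in_degree, out_degree


# Hypothetical directed ties (source, target).
arcs = [(0, 1), (0, 3), (2, 1), (3, 1), (3, 2)]
in_degree, out_degree = degree_centrality(4, arcs)
print(in_degree, out_degree)
```

In this toy example node 1 receives the most ties (prominent), while nodes 0 and 3 send the most (influential in the sense used above).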
Star Network
Degree Centrality: Bonacich
❖ Another measure of degree centrality takes into account that the power and centrality of each node (actor) depend on the power and centrality of the others.
❖ Bonacich used an iterative estimation approach which weights each node’s centrality by the centrality of the other nodes to which it is connected.
❖ So, node 1’s centrality depends not only on how many connections it has, but also on how many connections its neighbors have (and on how many connections its neighbors’ neighbors have, and so on).
❖ When calculating the Bonacich power measures, the “attenuation factor” represents this weight: an attenuation factor that is positive (between 0 and 1) means that one’s power is enhanced by being connected to well-connected neighbors.
❖ Recall the graph presented above, in which actors #5 and #2 were the most central. Calculating the Bonacich measures suggests that actors #8 and #10 are also central: they don’t have many connections, but they have the “right” connections.
Degree Centrality
❖ As with all quantitative methods, it’s important to think about what you as
a researcher are trying to measure before using the methods.
Centrality: Closeness Centrality
❖ Another way to think of closeness is to move away from thinking just about the geodesic or most efficient (shortest) path from one node to another, and to also think about all connections of one node with respect to all the others.
Closeness: Influence Measures
❖ There are several such measures: Hubbell, Katz, Taylor, Stephenson, and Zelen
❖ Hubbell and Katz methods count the total number of connections between nodes
(and do not distinguish between directed and non-directed data), but use an
attenuation factor to discount longer paths. The two measures are very similar; the
Katz measure uses an identity matrix (each node is connected to itself) while the
Hubbell measure does not.
❖ The Taylor measure also uses an attenuation factor, but is more useful for
measuring the balance of in- versus out-ties in directed graphs. Positive
values of closeness indicate relatively more out-ties than in-ties.
Centrality: Node Betweenness
❖ Edge betweenness is the degree to which an edge makes other connections possible.
❖ Recall the example we used earlier, and look at the edge from 3 to 6.
❖ That edge from 3 to 6 makes many other connections possible: without that edge, 6 would be relatively isolated.
Centrality: Levels of Hierarchy
❖ We can continue to remove nodes, and measure how many levels of hierarchy exist in the network or system.
❖ What if two nodes want to have a relationship, but the path between them is blocked by a reluctant intermediary?
❖ Another pathway allows for an alternative resource.
❖ The flow approach to centrality assumes that nodes will use all the pathways that connect them.
❖ For each node, the measure reflects the number of times the node is in a flow (any flow) between all other pairs of nodes (generally, as a ratio of the total flow betweenness that does not involve the node).
Centrality measures
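Betweenness centrality
Path-based quantity
[Figure: node i has large betweenness $b_i$; node j has small betweenness $b_j$]
Eigenvector centrality
[Figure: node i with centrality $x_i$ and neighbours with centralities $x_1, \dots, x_5$]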
Histogram: $n_k$ = number of vertices with degree k
Distribution: $P(k) = n_k/n$ = probability that a randomly chosen vertex has degree k
Cumulative distribution: $P_>(k)$ = probability that a randomly chosen vertex has degree at least k
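The histogram, the distribution P(k) and the cumulative distribution can be computed from a plain list of degrees. The degree list below is a made-up example and the function name `degree_distribution` is illustrative; the cumulative value follows the definition above (degree at least k).

```python
from collections import Counter


def degree_distribution(degrees):
    """Return (P, P_cumulative) as dicts keyed by degree k."""
    n = len(degrees)
    counts = Counter(degrees)                 # n_k: vertices with degree k
    p = {k: counts[k] / n for k in sorted(counts)}
    cumulative = {
        k: sum(c for kk, c in counts.items() if kk >= k) / n
        for k in sorted(counts)
    }
    return p, cumulative


# Hypothetical degree sequence of a six-node network.
p, cumulative = degree_distribution([1, 1, 2, 3, 3, 3])
print(p)
print(cumulative)
```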
Statistical characterization
Cumulative degree distribution
Simplest case: $P(k'|k)$, the conditional probability that a vertex of degree k is connected to a vertex of degree k'
often inconvenient (statistical fluctuations)
Statistical characterization
Multipoint degree correlations
Practical measure of correlations: average degree of nearest neighbors
Example: $k_i = 4$, $k_{nn,i} = (3+4+4+7)/4 = 4.5$
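The example value 4.5 is just the mean of the neighbours' degrees. A minimal sketch, where the node labels "a" to "d" and the helper name `nearest_neighbour_degree` are hypothetical:

```python
def nearest_neighbour_degree(neighbours_of_i, degree):
    """Average degree of the nearest neighbours of one node."""
    return sum(degree[j] for j in neighbours_of_i) / len(neighbours_of_i)


# Node i has four neighbours with degrees 3, 4, 4 and 7.
degree = {"a": 3, "b": 4, "c": 4, "d": 7}
knn_i = nearest_neighbour_degree(["a", "b", "c", "d"], degree)
print(knn_i)  # 4.5
```

Averaging this quantity over all nodes that share the same degree k gives the curve knn(k) used to read off correlations.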
Typical correlations
❖ Assortative behaviour: growing $k_{nn}(k)$
❖ Disassortative behaviour: decreasing $k_{nn}(k)$
Example: internet
● Small-world phenomenon
● Preferential attachment
probability
⦿ P(rank=k) =
❖ Tree-like!
γ = 2.1…4
❖ very low connection numbers are possible
ER model VS. Scale-free network
❖ ER: same average # of connections per node, tree-like
❖ SF: hubs, few nodes with large # of connections, hierarchy!
❖ Preferential attachment
❖ Probability of a new node to attach to an existing one rises with the rank of the node
❖ P(attach to node V) ~ rank(V)
Barabasi Model
❖ Produces scale-free networks
❖ Scale-free distribution is time-invariant: it stays the same as more nodes are added
ER Vs. Barabasi
❖ Graph diameter:
❖ the average length of the shortest distance between any two vertices
❖ For the same number of connections and nodes, ER has a larger diameter than a scale-free network
❖ No small-world in ER!
In economics this is called the Matthew effect: the rich get richer. This explains the Pareto distribution of wealth.
How can a scale-free network emerge?
Attack = removal of highly-connected nodes
❖ “Small-world” phenomenon, or
❖ “6 degrees of separation”
❖ Stanley Milgram, 1967, Psychology Today
Small-world experiment
● Kahenman, Auman:
● Average: ~3
6
Random Networks
❖ Random networks have randomly connected edges
❖ If the number of edges is M, each node
for large n
❖ Such that
❖ The network exhibits the small world
as regular lattice
❖ Most real networks, instead, follow a “power law” distribution for the node connectivity
❖ In general terms, a probability distribution is “power law” if
❖ The probability P(k) that a given variable k has a specific value
❖ Decreases proportionally to k to the power -γ, where γ is a constant value
$P(k) = \alpha k^{-\gamma}$
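Taking logarithms of the power law shows why such distributions are usually inspected on log-log axes: the curve becomes a straight line with slope -γ. The step below uses only the symbols already defined (P(k), α, γ) and assumes α > 0 and k > 0.

```latex
\log P(k) = \log\left(\alpha k^{-\gamma}\right) = \log \alpha - \gamma \log k
```

An exponential distribution, by contrast, gives a straight line only when the vertical axis alone is logarithmic, so the two are easy to tell apart on a log-log plot.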
[Figure: P(k) versus k for k from 1 to 30, linear axes. The exponential distribution decays exponentially; the power law distribution decays polynomially. Is there really a substantial difference? Let’s see the same distributions on a log-log figure…]
average number of links per page would be in the order of 1-20
❖ which contradicts the fact that we know a lot of “well linked” sites…
proteins
❖ The power of earthquakes: statistical data tell us that the power of earthquakes follows a power-law distribution
❖ The size of rivers: the size of rivers in the world follows a power law
❖ The size of industries, i.e., their overall income
❖ The wealth of people
❖ In these examples, the exponent of the power law distribution is always around 2.
❖ The power law distribution is the “normal” distribution for complex systems (i.e., systems of interacting autonomous components)
❖ We see later how it can be derived…
❖ Examples
❖ 20% of the Web sites get 80% of the visits (actual data: 15%-85%)
❖ 20% of the Internet routers handle 80% of the total Internet traffic
❖ 20% of world industries hold 80% of the world’s income
❖ 20% of the world population consumes 80% of the world’s resources
❖ 20% of the Italian population holds 80% of the lands (that was true before the
of the network
❖ This may have notable implications, as detailed below
❖ The overall properties of the network are preserved independently of the scale
❖ In particular
❖ If we cut off the details of a network – skipping all nodes with a limited
[Figure panels: Web Cache Network; Protein Network; The Internet Routers]
❖ The fractal term derives from the fact that they have a non-integer dimension
❖ 2-d objects have a “size” (i.e., a surface) that scales with the square of the linear size: $A = kL^2$
❖ 3-d objects have a “size” (i.e., a volume) that scales with the cube of the linear size: $V = kL^3$
❖ Fractal objects have a “size” that scales with some fractional power of the linear size: $S = kL^{a/b}$
Examples of Fractals
❖ Yes, in fact
❖ They are the same at whatever dimension we observe them
❖ Also, the fact that they grow according to a power law can be considered as a sort of
Barabasi Model
Summarizing
❖ The Barabasi-Albert model is very powerful to explain the structure of modern networks, but has some limitations
❖ With the proper extensions (re-wiring, node aging and link costs, fitness)
❖ It can capture the structure of modern networks
❖ The “rich get richer” phenomenon
❖ As well as “the winner takes it all” phenomena
❖ In the extreme case, when fitness and node re-wiring are allowed, it may happen that the network degenerates with a single node that attracts all links (monopolistic networks)
Growing Networks
❖ The probability that a new node connects to another existing node may depend on the characteristics of the existing node
❖ This is not simply a random process of independent node addition
❖ But there could be “preferences” in adding an edge to a node
❖ E.g., Google, a well known and reliable Internet router, a cool guy who knows many
Evolving Networks
❖ More in general
❖ Networks grow AND
❖ Networks evolve
❖ Connection satisfaction
❖ What matters is that connections can change during the life of the network
❖ Not necessarily in a random way
❖ But following characteristics of the network
Preferential Attachment
❖ Start with m0 nodes. At each time step, add a new node that has m edges (m ≤ m0) that link to m existing nodes in the system
❖ When choosing the nodes to which to attach, assume a probability Π for a node i proportional to the number $k_i$ of links already attached to i:
$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
❖ After t time steps, the network will have n = t + m0 nodes and M = mt edges
❖ This leads to a power law network!
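The growth rule can be simulated with a short sketch. Two choices here are assumptions made only so the code runs, not part of the slides: the m0 initial nodes start without edges, so the very first new node picks its m targets uniformly at random, and a new node never links twice to the same target. With these choices the counts n = t + m0 and M = mt hold exactly. The function name `barabasi_albert` is illustrative.

```python
import random


def barabasi_albert(m0, m, t, seed=None):
    """Grow a network by preferential attachment; return its edge list."""
    if not 1 <= m <= m0:
        raise ValueError("need 1 <= m <= m0")
    rng = random.Random(seed)
    edges = []
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks node i with probability k_i / sum_j k_j.
    endpoints = []
    for step in range(t):
        new = m0 + step
        chosen = set()
        while len(chosen) < m:
            if endpoints:
                chosen.add(rng.choice(endpoints))
            else:
                # Assumption: no edges exist yet, so choose uniformly.
                chosen.add(rng.randrange(m0))
        for target in chosen:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


edges = barabasi_albert(m0=3, m=2, t=50, seed=1)
print(len(edges))  # M = m * t = 100
```

Counting how often each node appears in the edge list gives the degree sequence, which can then be inspected on log-log axes.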
❖ In a random network model, each new node that attaches to the network attaches its edges independently of the current situation
❖ Thus, all the events are independent
❖ The probability for a node to have a certain number of edges attached is thus a “normal”, exponential, distribution
$P(k) = \frac{1}{m} e^{-k/m}$
❖ In its simplicity, the BA model captures the essential characteristics of a number of phenomena
❖ In which events determining the “size” of the individuals in a network
❖ Are not independent from each other
❖ Leading to a power law distribution
❖ So, it can somewhat explain why the power law distribution is as ubiquitous as the normal Gaussian distribution
❖ Examples
❖ Gnutella: a peer which has been there for a long time has already collected a strong list of acquaintances, so that any new node has a higher probability of becoming aware of it
❖ Rivers: the older and bigger a river, the higher the probability that it breaks the path of a new river and gets its water, thus becoming even bigger
❖ Industries: the bigger an industry, the greater its capability to attract clients and thus become even bigger
❖ Earthquakes: big stresses in the earth’s plates can absorb the effects of small earthquakes, thus increasing the stress further, a stress that will eventually end up in a dramatic earthquake
❖ Richness: the richer I am, the more I can exploit my money to make new money: “RICH GET RICHER”
❖ Clustering
❖ There are no analytical results available
❖ Simulations show that in scale-free networks the
Problems of the
Barabasi Albert Model (1)
Problems of the
Barabasi Albert Model (2)
❖ In general
❖ The distribution still has a “heavy tail” if compared to the standard exponential distribution
❖ However, such a tail is not infinite
is often limited
❖ So, there can be no individual that can sustain an arbitrarily large number of resources
❖ Vice versa, there could be a minimal amount of resources a node can have
of TCP connections
❖ However, it can be shown that these models destroy the power-law nature of the network
❖ The problems of the BA model may depend on the fact that networks not only grow but also evolve
❖ The BA model does not account for evolutions following the growth
evolution
❖ And obtain a bit more realistic model
infinity
❖ This enables explaining the various exponents that are measured in real networks
number of links
❖ E.g., for a Web site this implies adding more
❖ Not all nodes are equal, but some nodes “fit” better specific network characteristics
❖ E.g. Google has a more effective algorithm for page indexing and ranking
❖ A new scientific paper may indeed be a breakthrough
❖ It can be shown that the fitness model for preferential attachment enables even very young nodes to attract a lot of links
$\Pi(k_i) = \frac{\mu_i k_i}{\sum_j \mu_j k_j}$
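The effect of fitness on the attachment probability can be seen with a few numbers. The degrees and fitness values below are invented for illustration, and the function name `attachment_probabilities` is hypothetical; with all fitness values equal the formula reduces to plain preferential attachment.

```python
def attachment_probabilities(degrees, fitness=None):
    """Pi(k_i) = mu_i * k_i / sum_j(mu_j * k_j) for every node i."""
    if fitness is None:
        fitness = [1.0] * len(degrees)   # equal fitness: plain BA rule
    weights = [mu * k for mu, k in zip(fitness, degrees)]
    total = sum(weights)
    return [w / total for w in weights]


degrees = [4, 2, 1, 1]                   # node 2 is young: only one link
plain = attachment_probabilities(degrees)
fit = attachment_probabilities(degrees, fitness=[1, 1, 6, 1])
print(plain)
print(fit)
```

In the second call the young node 2, despite having a single link, gets the largest share of new links because of its high fitness.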
such errors
❖ At most only a few small clusters of nodes will disconnect from the network
❖ The average path length remains the same
[Figure: characteristic path length]
Attack Tolerance
❖ Scale free networks are very sensitive to targeted attacks
❖ If the most connected nodes get deliberately chosen as targets of attacks
❖ The average path length of the network grows very soon
❖ It is very likely that the network will soon break into disconnected clusters
❖ Although these independent clusters still preserve some internal connections
[Figure: characteristic path length]
connected cluster
❖ Or, the (1-pc) percentage of nodes that must be disconnected to have the network break into disconnected clusters
“giant” one
❖ In fact
❖ If the percentage (1-pc) of immune nodes is able to block the spreading of an infection
❖ This implies that if these nodes were disconnected from the network, they would significantly break the
❖ This understood, what can be said about epidemics in scale free networks?
slides before
❖ The structure of the Internet is very robust in the presence of router faults
❖ Several routers can fail, and they do every day, without causing significant partitionings of the network
disconnected
❖ E.g., the destroying of World-Trade-Center routers – acting as main hubs for Europe-
❖ This will increase the probability of it becoming more and more visible
❖ However, we must always consider that random processes still play an important role
❖ Recently, a few innovative industries have tried to study the structure of social networks
❖ And have understood that to launch a new product it is important to identify the “hubs” of the social network
❖ And have these hubs act as the engine for the launch of the product
women,
❖ After which, paying such identified hubs to support the product (e.g., wearing a new pair of shoes
❖ However, both systems capture some interesting properties of real systems