5Network.01.Intro

The document provides an introduction to graph theory and network analysis, highlighting the significance of networks in various fields such as sociology, biology, and computer science. It discusses key concepts such as nodes, edges, centrality, and the dynamics of networks, including the influence of social context on individual behaviors. Additionally, it emphasizes the importance of systematic network analysis for strategy development and understanding complex relationships within different types of networks.

Introduction to Graphs

AGENDA

I. Intro
II. History
III. Graph Theory
IV. Centrality
V. Statistical characterisation
VI. Random vs Scale Free
VII. Modern Networks
VIII. Barabasi Model

What is a network?

A network, or graph, is a set of vertices (nodes) joined by edges

- abstract representation
- very general
- convenient to describe many different systems

Internet routing map, 1999


http://www.cheswick.com/ches/map/
Power grid, USA, 2001

http://www.technologyreview.com/Energy/12474/page2/
Sexual / Romantic partners network

Bearman, Moody, Stovel. Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks. AJS, 2004 Jefferson High, Columbus, Ohio
Metabolic network of E. Coli
Organisation chart
Some examples

Network                        Vertices      Edges
Social networks                Individuals   Social relations
Internet                       Routers       Cables
Internet                       AS            Commercial agreements
WWW                            Webpages      Hyperlinks
Protein interaction networks   Proteins      Chemical reactions

and many more (email, P2P, transport, …)

Interdisciplinary science
Science of complex networks

- graph theory
- sociology
- communication science
- biology
- physics
- computer science

Interdisciplinary science

Science of complex networks

❖ Empirics
❖ Characterization
❖ Modeling
❖ Dynamical processes

History
Network Analysis
Network Analysis

❖ Multidisciplinary
❖ social networks
❖ political networks
❖ electrical networks
❖ transportation networks
❖ biological networks

Network Analysis

❖ Used for developing a strategy or plan, or for extracting knowledge
❖ First of all: start with a clear idea about the scope
❖ Analyze the network, and based on this analysis build a strategy

Network Analysis

❖ In this strategy, you can use your knowledge about the role and meaning of nodes and relations
❖ Use strategic thinking: this kind of knowledge is commonly applied without a network analysis, but then people often rely only on intuition.
❖ Network analysis makes it possible to work more systematically.
❖ Examine which actors (groups, organizations, institutions and persons) play, or might play, a role in a given context

Network Analysis
❖ By systematically analysing a network, you can (e.g.)
- identify collaborative/antagonistic schemas
- obtain a more efficient flow of information
- optimize a strategy for attaining your goal

This picture is usually only a snapshot

❖ A network is in fact dynamic: new actors appear, old ones disappear, their relations may change
❖ The shape of the network may change

Network Analysis
❖ set the goal
❖ identify central actors and brainstorm about them
❖ selection of actors
❖ network mapping: significance, metrics, topology
❖ analysis
❖ Identify and define: problems, goals, constraints
❖ build a strategy

Network Analysis

❖ Network analysis is a method by which one can analyze the connections across actors to examine how they are interrelated.
History
History

❖ Much early research in network analysis is found in educational psychology and studies of child development. Network analysis also developed in fields such as sociology and anthropology.
Social Network Analysis

❖ Social network analysis, unlike many other sociological methods, focuses on interaction (rather than on individual behavior).

❖ The topology, environment and configuration of the network itself influence system-wide interaction among actors
Bennington College Study
(1935-1939)

❖ Theodore Newcomb found that as Bennington College women were exposed to the relatively liberal referent group of fellow students and faculty, they became more liberal.

❖ “Becoming radical meant thinking for myself and, figuratively, thumbing my nose at my family. It also meant intellectual identification with the faculty and students that I most wanted to be like” (Newcomb, 1943, pp. 134, 131)
Bennington College Study

❖ Two follow-up studies indicated that the change was largely permanent: the women remained relatively liberal, likely in part because they picked new referent groups (spouses, friends, co-workers) that reinforced those attitudes.

❖ In other words, attitudes have a “social-adjustment” function.

❖ We often choose reference groups that reinforce attitudes, but our attitudes are also changed by our reference groups.

Reminders

❖ A node or vertex is an individual unit in the graph or system. (If it is a network of legislators, then each node represents a legislator.)

❖ A graph or system or network is a set of units that may be (but are not
necessarily) connected to each other.
Reminders

❖ An “edge” is a connection or tie between two nodes

❖ A neighborhood N for a vertex or node is the set of its immediately connected nodes.

❖ Degree: the degree k_i of a vertex or node i is the number of other nodes in its neighborhood.
Reminders

❖ In an undirected graph or network, the edges are reciprocal. If A is connected to B, B is by definition connected to A.

❖ In a directed graph or network, the edges are not necessarily reciprocal. A may be connected to B, but B may not be connected to A (think of a graph with arrows indicating the direction of the edges).
Reminders
Random Graphs

❖ In a random graph, each pair of vertices i, j has a connecting edge with an independent probability p.

❖ This graph has 16 nodes, 120 possible connections, and 19 actual connections: a probability of about 0.16 (19/120, roughly 1 in 6) that any two nodes will be connected to each other.

❖ In a random graph, the presence of a connection between A and B as well as a connection between B and C will not influence the probability of a connection between A and C.
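A minimal Python sketch of the random-graph construction described on this slide: each of the n(n-1)/2 possible pairs receives an edge independently with probability p. The function name `random_graph` and the seed value are illustrative assumptions, not part of the original slides.

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Return the edge list of a random graph on vertices 0..n-1."""
    rng = random.Random(seed)  # the seed only makes the sketch reproducible
    # Each unordered pair (i, j) is linked independently with probability p.
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]


n = 16
possible = n * (n - 1) // 2   # 120 possible connections for 16 nodes
p = 19 / possible             # edge probability implied by the slide's counts
edges = random_graph(n, p, seed=1)
print(possible, len(edges))
```

Because every pair is drawn independently, the number of edges varies from run to run around its expected value of 19.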

Reminders
Regular Graphs

❖ A regular graph is a network where each node has the same number (k) of neighbors (that
is, each node or vertex has degree k)

❖ A regular graph with k = 3 is seen at the left (each node is connected to three other nodes).

A Very Simple Example

❖ Four persons: whether they serve on at least one committee together

     A  B  C  D
A    -  1  0  1
B    1  -  1  0
C    0  1  -  0
D    1  0  0  -

❖ This is an undirected matrix: if A serves with B on a committee, then B serves with A on a committee.
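As a small illustration, the committee matrix of this slide can be written as a list of rows and checked for symmetry. This is only a sketch: the diagonal, which the slide leaves empty, is set to 0 here as an assumption, and the helper name `is_symmetric` is made up for the example.

```python
names = ["A", "B", "C", "D"]

# Rows and columns follow the order A, B, C, D; the diagonal is assumed to be 0.
M = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]


def is_symmetric(matrix):
    """True if matrix[i][j] == matrix[j][i] for every pair (i, j)."""
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(size) for j in range(size))


# In an undirected network the degree of a person is the sum of their row.
degree = {names[i]: sum(row) for i, row in enumerate(M)}
print(is_symmetric(M), degree)
```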

Graph Diameter

❖ Diameter is the “longest shortest path” between two nodes

❖ The graphs above have diameters of 3, 4, 5, and …

❖ The graph on the right has a larger diameter: it takes (at most) 7 edges to travel between two nodes
❖ the two nodes at the bottom are loosely connected

Small World

❖ Another concept that has emerged in network analysis: Small World

❖ Some reminders first:
❖ types of graphs (“random graphs” and “regular graphs”), concepts of “clustering” and “diameter”

❖ A “small world” can be thought of as in-between a random and a regular graph.

It’s a Small World, After All

❖ Six degrees of separation:
❖ the number of “links” needed to connect any one arbitrarily chosen individual to any other is low
❖ networks have lower diameters than expected

❖ In Milgram’s 1967 “small world experiment”, individuals were asked to reach a particular target individual by passing a message along a chain of acquaintances.
❖ For successful chains, the average number of intermediaries needed was 5 (that is, 6 steps)
❖ note: most chains were not completed.

Small Worlds

❖ Graph is highly clustered:
❖ a higher proportion (than random) of each node’s neighbors are actually connected to each other.

❖ Small diameter, relative to the number of nodes.
REAL WORLD NETWORK
Network Structure

❖ Nodes can be individuals, organizations, locations, or analytical aggregates
❖ Relations can be material exchange, information flow, or shared status
❖ What is fundamental are the ties or absence of ties between actors, in addition to the attributes of the actors

Network Structure

❖ Commodity chains
❖ Trade systems, transport and communication
❖ Business networks
❖ City systems
❖ Interstate power
Commodity chains

White’s analysis of the input-output matrix of the Danish economy, seen as a network, scaled by equivalence of position (available for the U.S., U.K., Holland, Italy, France, Australia).
Trade network (13th century)
Business networks
❖ Corporate interlocks
❖ Market exchange
❖ Shared technology (e.g. licensing)
❖ Shared niche space
❖ Business groups

Figure: Evolution of the interorganization contracts network in biotech, R&D and VC links for 1989-1999 (Powell, White, Koput and Owen-Smith, forthcoming, AJS)

City systems

Settlement systems have been seen as systems that evolve toward hierarchical networks.

Networks like this may have an exponential degree distribution.
Interstate power

❖ Treaty/alliance network
❖ Exchange of recognition
❖ Co-membership in supra-national organizations

Structural Properties

❖ Density, degree, reach

❖ Centrality and power

❖ Cohesion and sub-groups

❖ Positions and roles

Density, degree, reach

❖ How much connection is there?

❖ Which nodes have how much connection (social capital)?

❖ Which actors are closest to, or most influenced by, which others?

Centrality and power

❖ Which actors have most ties?

❖ Which actors are closest to most others?

❖ Which actors are “between” others?

Genoa betweenness in trading paths during XIV century


Cohesion and sub-groups
❖ Are there blocs or clusters or sub-groups?
❖ Which actors are connected, how tightly, to which groups?
❖ What roles do actors have with respect to relations between groups?
❖ Level of cohesive membership as a predictive variable (Predictive Structural Cohesion theory)

Roles and positions


❖ Can actors be classified according to which other actors they have ties to?
❖ Can actors be classified according to which other kinds of actors they have ties to?
❖ Actors’ “roles” in the structure (e.g. “core nation”)

Figure: Regular equivalence of positions in the 13th century main European banking/trading network. Same scaling method as Smith and White 1992, which showed a virtually linear core-periphery structure in the contemporary world-trade system.
Dynamics

❖ Actors make relations
❖ Relations condition actors
❖ Micro→macro links between probabilistic attachment bias and network topologies
❖ Macro→micro effects of network topologies on actor activities and behaviors

Dynamics

❖ How and why do world systems expand, contract, and change structure?
❖ Homophily
❖ Exchange
❖ Power-laws (degree preference)
❖ Cohesion and shortcuts
Dynamics: Homophily

❖ Forming (or breaking) ties is not random
❖ Actors may have preferences to form (or sustain) ties with “similar” others
❖ The macro-result is local clustering and formation of factions

Dynamics

❖ Ties may be formed (or dissolved) in proportion to the costs/benefits to actors, and
❖ Constraints due to the presence of relations and existing embedding (alternatives available to each actor)
❖ The macro-result may tend to “structural holes” and extended networks
Power law

❖ Actors with ties exploit their ties to make more ties
❖ Actors with few ties may establish ties with actors with more ties
❖ Both tendencies have the macro-result of exponential distributions of ties
❖ Exponential networks create relatively short average path-lengths (shortcuts) unless the hub distributions are too extreme

Power law
[Figure: three networks with fitted power-law exponents alpha]
- Greek Gods: alpha=3.…, no real organization (H&J Newman, B. Walters)
- Biotech: alpha=2.0, cohesive organization (Powell, White, Koput, Owen-Smith)
- Proteome yeast: alpha=2.…, hierarchical organization (Amaral)

Personal Network
Social network

Personal (Egocentric) Network
• Effects of social context on individual attitudes, behaviors and conditions
• Collect data from respondent (ego) about interactions with network members (alters) in all social settings.

Whole (Sociocentric) Network
• Interaction within a socially or geographically bounded group
• Collect data from group members about their ties to other group members in a selected social setting.

Personal Networks
• Unique
• Like snowflakes, no two personal networks are exactly alike
• Social contexts may share attributes, but the combinations of attributes are each different
• The differences across respondents influence attitudes, behaviors and conditions

Personal Networks

Ascribed characteristics    Chosen characteristics
Sex                         Income
Age                         Occupation
Race                        Hobbies
Place of birth              Religion
Family ties                 Location of home
Genetic attributes          Amount of travel

Personal Networks
Many variables of interest to social scientists are thought to be influenced by social context

Social outcomes     Health outcomes
Personality         Smoking
Acculturation       Depression
Well-being          Fertility
Social capital      Obesity
Social support
DATA
• Composition: metrics to summarize the attributes of alters in a network.
– Average age of alters.
– Proportion of alters who are women.
– Proportion of alters that provide emotional support.
• Structure: metrics to summarize structure.
– Number of components.
– Betweenness centralization.
– Subgroups.
• Composition and Structure: Variables that capture both.
– E-I index
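The composition metrics listed above reduce to simple averages and proportions over the alters. The sketch below uses made-up alters and assumes the usual definition of the E-I index, (E - I) / (E + I), where E counts ties to alters outside ego's own group and I counts ties to alters inside it. The field names and the helper `ei_index` are hypothetical.

```python
# Hypothetical alters of one respondent (ego); all values are made up.
alters = [
    {"age": 20, "woman": True,  "same_group_as_ego": True},
    {"age": 30, "woman": False, "same_group_as_ego": True},
    {"age": 40, "woman": True,  "same_group_as_ego": False},
    {"age": 50, "woman": True,  "same_group_as_ego": True},
    {"age": 60, "woman": False, "same_group_as_ego": False},
]

# Composition: summaries of alter attributes.
average_age = sum(a["age"] for a in alters) / len(alters)
proportion_women = sum(a["woman"] for a in alters) / len(alters)


def ei_index(external, internal):
    """E-I index: -1 means all ties are internal, +1 means all are external."""
    return (external - internal) / (external + internal)


external = sum(not a["same_group_as_ego"] for a in alters)
internal = sum(a["same_group_as_ego"] for a in alters)
ei = ei_index(external, internal)
print(average_age, proportion_women, ei)
```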
DATA: composition

• Proportion of personal network that are women

• Average age of network alters

• Proportion of strong ties

• Average number of years knowing alters

DATA: composition
Percent of alters from host country

36 Percent Host Country 44 Percent Host Country

Captures composition, does not capture structure

DATA: structure

Average degree centrality (density)

Average closeness centrality

Average betweenness centrality

Core/periphery

Number of components

Number of isolates

Components

Components: 1        Components: 10

The number of components captures network structure (separate groups)
Does not capture composition (type of groups)

Average Betweenness Centrality

Average Betweenness 12.7 Average Betweenness 14.6


SD 26. SD 40.

Betweenness centrality (BC) captures bridging between groups
Does not capture the composition of bridged groups
5

VISUAL INSPECTION
REFERENCES

❖ Steve Borgatti’s site
❖ Note the “Networks for Newbies” presentation (Wellman) on the website
❖ From Sociology 712 (Moody) at Duke
❖ From Friedkin’s “Intro to Social Network Methods” (UCSB)
❖ From Martin and Montgomery’s “New Methods of Social Network Analysis”
❖ Andrej Mrvar’s site

Graph Theory
Graph Theory
Origin: Leonhard Euler (1736)

L. Euler, Solutio problematis ad geometriam situs pertinentis, Comment. Academiae Sci. J. Petropolitanae 8, 128-140 (1736)
Graph theory: basics
Graph G=(V,E)
❖ V = set of nodes/vertices i=1,…,n
❖ E = set of links/edges (i,j)

Undirected edge i - j: bidirectional communication/interaction
Directed edge i → j
Graph theory: basics

Maximum number of edges

❖ Undirected: n(n-1)/2

❖ Directed: n(n-1)

Complete graph: every pair of vertices is connected (all to all interaction/communication)
Adjacency matrix
n vertices i=1,…,n

a_ij = 1 if (i,j) ∈ E
a_ij = 0 if (i,j) ∉ E

    0 1 2 3
0   0 1 1 1
1   1 0 1 1
2   1 1 0 1
3   1 1 1 0

Symmetric for undirected networks

Adjacency matrix
n vertices i=1,…,n

a_ij = 1 if (i,j) ∈ E
a_ij = 0 if (i,j) ∉ E

    0 1 2 3
0   0 1 0 1
1   0 0 0 0
2   0 1 0 0
3   0 1 1 0

Non-symmetric for directed networks
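The two adjacency matrices shown on these slides can be checked mechanically. The sketch below stores them as nested lists and tests the symmetry property; the helper names `is_symmetric` and `edge_list` are illustrative assumptions.

```python
# Adjacency matrices copied from the two slides (vertices labelled 0..3).
undirected = [
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
directed = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]


def is_symmetric(a):
    """An undirected network has a_ij == a_ji for all i, j."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


def edge_list(a):
    """List the pairs (i, j) with a_ij == 1, read as an edge from i to j."""
    n = len(a)
    return [(i, j) for i in range(n) for j in range(n) if a[i][j]]


print(is_symmetric(undirected), is_symmetric(directed))
print(edge_list(directed))
```

In the undirected matrix each edge appears twice, once as (i, j) and once as (j, i), so the number of edges is half the number of ones.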

Sparse graphs
Density of a graph

D = (number of edges) / (maximal number of edges)

Sparse graph: D << 1

Sparse adjacency matrix
Representation: lists of neighbours of each vertex
l(i, V(i)), where V(i) = neighbourhood of i
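A short sketch of the density formula and of the neighbour-list representation for sparse graphs. The function names `density` and `adjacency_list`, and the ring example, are assumptions chosen for illustration.

```python
def density(n, m, directed=False):
    """Number of edges m divided by the maximal number of edges on n vertices."""
    max_edges = n * (n - 1) if directed else n * (n - 1) // 2
    return m / max_edges


def adjacency_list(n, edges):
    """Store, for each vertex i, only the set V(i) of its neighbours."""
    neighbours = {i: set() for i in range(n)}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)  # undirected: the edge is stored in both directions
    return neighbours


# Example: a ring of 1000 vertices is very sparse (D << 1).
n = 1000
ring = [(i, (i + 1) % n) for i in range(n)]
print(density(n, len(ring)))
print(adjacency_list(n, ring)[0])
```

For the ring, the neighbour lists hold 2 entries per vertex, whereas the full adjacency matrix would hold 1000 entries per vertex, almost all of them zero.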

Paths
G=(V,E)

[Figure: a path through the vertices i0, i1, i2, i3, i4, i5]

Path of length l = ordered collection of

❖ l+1 vertices i0, i1, …, il ∈ V

❖ l edges (i0,i1), (i1,i2), …, (il-1,il) ∈ E

Cycle/loop = closed path (i0=il) with all other vertices and edges distinct
Paths and connectedness

G=(V,E) is connected if and only if there exists a path connecting any two vertices in G

[Figure: one graph that is connected; one graph that is not connected, is formed by two components, and is a forest]
Trees
A tree is a connected graph without loops/cycles

❖ n vertices, n-1 edges

❖ Maximal loopless graph

❖ Minimal connected graph

Paths and connectedness


Giant component = component whose size scales with the number of vertices n

G=(V,E) => distribution of components’ sizes

Existence of a giant component: a macroscopic fraction of the graph is connected
Paths and connectedness: directed graphs
Paths are directed

[Figure: components of a directed graph: giant SCC (strongly connected component), giant IN component, giant OUT component, disconnected components]
Shortest paths
Shortest path between i and j: path with the minimum number of traversed edges

distance l(i,j) = minimum number of edges traversed on a path between i and j

Diameter of the graph = max[l(i,j)]
Average shortest path = ∑ij l(i,j) / (n(n-1)/2)
Complete graph: l(i,j)=1 for all i,j
“Small-world” ➔ “small” diameter
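The distances l(i,j), the diameter and the average shortest path can be computed with one breadth-first search per vertex. This sketch assumes an unweighted, undirected and connected graph; the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Distance l(source, j) in edges to every vertex j reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_average(adj):
    """Return (max l(i,j), average l(i,j)) over the n(n-1)/2 unordered pairs."""
    dist = {s: bfs_distances(adj, s) for s in adj}
    lengths = [dist[i][j] for i, j in combinations(adj, 2)]
    return max(lengths), sum(lengths) / len(lengths)


# Example: the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(diameter_and_average(path))
```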

Motifs

Motifs: subgraphs occurring more often than on random versions of the graph

Significance of motifs: Z-score!
Motifs

Paths

Stars

Cycles

Complete
Motifs

Sub-graphs more represented than expected

209 bi-fan motifs found in the E.coli regulatory network
Weighted networks
General description: weights

Real world networks: edges
- carry traffic (transport networks, Internet…)
- have different intensities (social networks…)

i - j with weight wij
wij: continuous variable
Weights: examples
Scientific collaborations: number of common papers
Internet, emails: traffic, number of exchanged emails
Airports: number of passengers
Metabolic networks: fluxes

usually wii=0
symmetric: wij=wji
Weighted networks

Weights: on the edges

Strength of a vertex:
si = ∑j∈V(i) wij

=> Naturally generalizes the degree to weighted networks

=> Quantifies for example the total traffic at a vertex
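A small sketch of the vertex strength for a symmetric weighted network. The weights below are made up, and storing each undirected edge once as a key (i, j) is an assumption of this example.

```python
from collections import defaultdict

# Hypothetical symmetric weights w_ij, one entry per undirected edge.
weights = {(0, 1): 2.0, (0, 2): 3.0, (1, 2): 1.5}


def strengths(weights):
    """Strength s_i = sum of the weights of all edges attached to vertex i."""
    s = defaultdict(float)
    for (i, j), w in weights.items():
        s[i] += w
        s[j] += w
    return dict(s)


print(strengths(weights))
```

With all weights equal to 1 the strength of a vertex equals its degree.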

Clustering
[Figure: a node with 4 neighbours (N = 4)]
- 6 possible connections among the neighbours (N×(N-1)/2)
- 2 connections among the neighbours

Clustering coefficient: the average proportion of neighbours of a vertex that are themselves neighbours

Clustering for the node = 2/6

Clustering coefficient: average over all the nodes
Clustering
Clustering coefficient of a vertex i with k neighbours:

C(i) = (number of links between the k neighbours of i) / (k(k-1)/2)

Clustering: my friends will know each other with high probability! (typical example: social networks)

Average clustering coefficient of a graph: C = ∑i C(i)/n
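The clustering coefficient can be sketched in a few lines. The example graph is constructed to match the figure described earlier (a node with 4 neighbours and 2 links among them, giving 2/6). Treating vertices with fewer than two neighbours as having C(i) = 0 is a convention assumed here.

```python
from itertools import combinations


def local_clustering(adj, i):
    """C(i) = links among the k neighbours of i, divided by k(k-1)/2."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention when no pair of neighbours exists
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """C = sum of C(i) over all vertices, divided by n."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


# Node 0 has 4 neighbours; only the pairs (1, 2) and (3, 4) are linked.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
print(local_clustering(adj, 0), average_clustering(adj))
```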
Centrality
Basic Concepts: Centrality

❖ Centrality is a measure of how many connections one node has to other nodes.

❖ Degree centrality refers to the number of ties a node has to other nodes. Actors who have more ties may have multiple alternative ways and resources to reach goals, and thus be relatively advantaged.

Basic Concepts: Degree Centrality

❖ Degree centrality for an undirected graph is straightforward: if A is connected to B, then B is by definition connected to A.
❖ Degree centrality for a directed graph or network has one of two forms.

Degree Centrality—Directed Networks

❖ One is in-degree centrality: an actor who receives many ties is characterised as prominent. The basic idea is that many actors seek to direct ties to them, and so this may be regarded as a measure of importance.

❖ The other is out-degree centrality. Actors who have high out-degree centrality may
be relatively able to exchange with others, or disperse information quickly to many
others. (Recall the strength of weak ties argument.) So actors with high out-degree
centrality are often characterized as influential.

Degree Centrality: Individual and Network

❖ Consider the network. Which nodes (actors) are more “central” than others?
❖ 2, 5, and 7 appear relatively “central”.

Degree Centrality (directed networks)


❖ So, node 7 has an in-degree centrality absolute value of 9 (there are 9 other nodes connected to node 7). The normalized value is 100 (all possible other nodes are connected to node 7). The out-degree centrality has an absolute value of 3 (node 7 is connected out to nodes 2, 4, and 5), and a normalized value of 33.33 (3 nodes is 33.33% of the possible 9 nodes to which node 7 could extend out).

❖ The average outdegree is 4.9 (which means that each node has, on average, connections out to 4.9 other nodes); the average indegree is also 4.9. Normalized, both measures are 54.44 (that is, 4.9 / 9).
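The arithmetic on this slide can be reproduced with a short sketch: out-degree is a row sum of the adjacency matrix, in-degree a column sum, and the normalized value divides by the n-1 other nodes. The 10-node network of the slide is not available as data, so the example matrix below is a small made-up one, and the function names are assumptions.

```python
def out_degrees(a):
    """Out-degree of node i = number of ones in row i."""
    return [sum(row) for row in a]


def in_degrees(a):
    """In-degree of node j = number of ones in column j."""
    return [sum(row[j] for row in a) for j in range(len(a))]


def normalized(degree, n):
    """Degree as a percentage of the n - 1 other nodes."""
    return 100 * degree / (n - 1)


# Hypothetical directed network on 4 nodes (a[i][j] == 1 means i points to j).
a = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
print(out_degrees(a), in_degrees(a))

# The figures quoted on the slide, for a network of 10 nodes:
print(round(normalized(9, 10), 2))    # in-degree 9
print(round(normalized(3, 10), 2))    # out-degree 3
print(round(normalized(4.9, 10), 2))  # average degree 4.9
```

Every edge leaves one node and enters another, which is why the average in-degree and the average out-degree always coincide.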

Centrality: Network Degree Centralization


❖ One can also calculate network indegree and outdegree centralization. These network measures represent the degree of inequality or variance in our network as a percentage of that in a perfect “star network”, the most unequal type of network.

❖ A depiction of a star network is on the next slide. Note that only one node is connected to any of the others, and that node is connected to all of the others.
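One common way to express this comparison with the star network is Freeman's degree centralization. The sketch below assumes the undirected version, in which the summed differences from the largest degree are divided by (n-1)(n-2), the value that sum takes in a star. The directed in- and out-degree variants mentioned on the slide use a different denominator and are not covered here.

```python
def degree_centralization(degrees):
    """Freeman degree centralization of an undirected network (0 to 1)."""
    n = len(degrees)
    largest = max(degrees)
    # In a star with n nodes the numerator equals (n - 1) * (n - 2).
    return sum(largest - d for d in degrees) / ((n - 1) * (n - 2))


star = [4, 1, 1, 1, 1]   # one hub connected to four leaves
ring = [2, 2, 2, 2, 2]   # every node has the same degree
print(degree_centralization(star), degree_centralization(ring))
```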

Star Network
Degree Centrality: Bonacich

❖ Another measure of degree centrality takes into account the problem that the power and centrality of
each node (actor) depends on the power and centrality of the others.

❖ Bonacich used an iterative estimation approach which weights each node’s centrality by the centrality
of the other nodes to which it is connected

❖ So, node 1’s centrality depends not only on how many connections it has, but also on how many connections its neighbors have (and on how many connections its neighbors’ neighbors have, and so on).

Degree Centrality: Bonacich

❖ When calculating the Bonacich power measures, the “attenuation factor” represents the weight. An attenuation factor that is positive (between 0 and 1) means that one’s power is enhanced by being connected to well-connected neighbors.

❖ Alternatively, one could argue that actors who are well-connected to individuals who are not well-connected themselves are powerful, because others are “dependent” on them.
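A rough sketch of the iterative idea, assuming one common formulation of the Bonacich measure in which each node's score is alpha times its degree plus beta times the sum of its neighbours' scores. Here beta plays the role of the attenuation factor; the iteration only settles when beta is small relative to the network's largest eigenvalue. The function name, the default alpha = 1 and the example graph are assumptions, not the software output discussed on the slides.

```python
def bonacich_power(adj, beta, alpha=1.0, iterations=200):
    """Iterate c_i = alpha * degree_i + beta * sum of c_j over neighbours j."""
    degree = {v: len(adj[v]) for v in adj}
    c = {v: alpha * degree[v] for v in adj}
    for _ in range(iterations):
        c = {v: alpha * degree[v] + beta * sum(c[u] for u in adj[v])
             for v in adj}
    return c


# Hypothetical three-actor chain A - B - C.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

positive = bonacich_power(chain, beta=0.2)    # well-connected neighbours help
negative = bonacich_power(chain, beta=-0.2)   # well-connected neighbours hurt
print(positive)
print(negative)
```

With a positive beta the end actors gain from being tied to the well-connected middle actor, while with a negative beta the same tie lowers their score.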

Degree Centrality: Bonacich


Degree Centrality: Bonacich

❖ Recall the graph presented above, in which actors #5 and #2 were the most central. Calculating the Bonacich measures suggests that actors #8 and #10 are also central: they don’t have many connections, but they have the “right” connections.

❖ However, taking another approach (using a negative attenuation factor) identifies actors 3, 7, and 9 as being strong, because they have weak neighbors (who are “dependent” on them).
Degree Centrality

❖ As with all quantitative methods, it’s important to think about what you as
a researcher are trying to measure before using the methods.
Centrality: Closeness Centrality

❖ Closeness is a measure of the degree to which an individual is near all other individuals in a network. It is the inverse of the sum of the shortest distances between each node and every other node in the network.

❖ Closeness is the reciprocal of farness.

❖ Nearness can also be standardised by norming it against the minimum possible nearness for a graph of the same size and connection.

Centrality: Closeness Centrality

❖ Closeness can also be calculated as a measure of inequality in the distribution of distances.

❖ An actor can be very close to a relatively closed subset of a network, or moderately close to every actor in a large network, and receive the same closeness score. In reality, the two are very different.

Closeness: Influence Measures

❖ Another way to think of closeness is to move away from thinking just about the geodesic, or most efficient (shortest), path from one node to another, and to also think about all connections of one node with respect to all the others.
Closeness: Influence Measures

❖ There are several such measures: Hubbell, Katz, Taylor, Stephenson, and Zelen

❖ Hubbell and Katz methods count the total number of connections between nodes
(and do not distinguish between directed and non-directed data), but use an
attenuation factor to discount longer paths. The two measures are very similar; the
Katz measure uses an identity matrix (each node is connected to itself) while the
Hubbell measure does not.

Closeness: Influence Measures

❖ The Taylor measure also uses an attenuation factor, but is more useful for
measuring the balance of in- versus out-ties in directed graphs. Positive
values of closeness indicate relatively more out-ties than in-ties.
Centrality: Node Betweenness

❖ Betweenness is a measure of the extent to which a node is connected to other nodes that are not connected to each other. It’s a measure of the degree to which a node serves as a bridge.

❖ This measure can be calculated in absolute value, as well as in terms of a normed percentage of the maximum possible betweenness that a node could have had.

Centrality: Edge Betweenness

❖ In addition to calculating betweenness measures for nodes, we can also calculate betweenness measures for edges.

❖ Edge betweenness is the degree to which an edge makes other connections possible.

❖ Recall the example we used earlier, and look at the edge from 3 to 6.

Centrality: Edge Betweenness

❖ That edge from 3 to 6 makes many other edges possible—without that edge,
6 would be relatively isolated.
Centrality: Levels of Hierarchy

❖ One can also identify levels of hierarchy.


❖ By eliminating all the nodes with no betweenness (that is, the “subordinates”), some of the remaining nodes will then have 0 betweenness:
❖ they are at the second level of the hierarchy.

❖ We can continue to remove nodes, and measure the number of levels of hierarchy that exist in the network or system.

Centrality: Flow Betweenness

❖ What if two nodes want to have a relationship, but the path between them is
blocked by a reluctant intermediary?
❖ Another pathway allows for an alternative resource
❖ The flow approach to centrality assumes that nodes will use all the pathways that connect them.
❖ For each node, the measure reflects the number of times the node is in a flow (any flow) between all other pairs of nodes (generally, as a ratio of the total flow betweenness that does not involve the node).
Centrality measures

How to quantify the importance of a vertex?

❖ Degree = number of neighbours = ∑j aij (in the figure, ki=5)

For directed graphs: kin, kout

❖ Closeness centrality: gi = 1 / ∑j l(i,j)
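A minimal sketch of the closeness formula gi = 1 / ∑j l(i,j), with the distances obtained by breadth-first search. It assumes an unweighted, undirected and connected graph; the function names and the star example are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, i):
    """g_i = 1 / (sum of the distances from i to all other vertices)."""
    dist = bfs_distances(adj, i)
    return 1 / sum(d for j, d in dist.items() if j != i)


# Star: vertex 0 is linked to the leaves 1, 2 and 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(closeness(star, 0), closeness(star, 1))
```

The hub is at distance 1 from everyone, so its closeness is higher than that of a leaf, which needs two steps to reach the other leaves.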

Betweenness centrality
Path-based quantity
[Figure: vertex i, for which bi is large; vertex j, for which bj is small]

for each pair of vertices (l,m) in the graph, there are
- σlm shortest paths between l and m
- σilm shortest paths going through i

bi is the sum of σilm / σlm over all pairs (l,m)

NB: generalization to edge betweenness centrality
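The definition of bi can be turned into a direct, unoptimized sketch. A breadth-first search from each vertex yields distances and numbers of shortest paths; a vertex i lies on a shortest path between l and m exactly when l(l,i) + l(i,m) = l(l,m), and the number of such paths is the product of the two path counts. The graph is assumed undirected, pairs are counted once, and the function names are assumptions. Faster algorithms exist but are not shown.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (distance, number of shortest paths) from source to each vertex."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness(adj):
    """b_i = sum over unordered pairs (l, m) of sigma^i_lm / sigma_lm."""
    info = {s: bfs_counts(adj, s) for s in adj}
    b = {v: 0.0 for v in adj}
    for l, m in combinations(adj, 2):
        dist_l, sigma_l = info[l]
        dist_m, sigma_m = info[m]
        if m not in dist_l:
            continue  # l and m are in different components
        for i in adj:
            if i in (l, m) or i not in dist_l or i not in dist_m:
                continue
            if dist_l[i] + dist_m[i] == dist_l[m]:
                b[i] += sigma_l[i] * sigma_m[i] / sigma_l[m]
    return b


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(betweenness(path))
print(betweenness(square))
```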

Eigenvector centrality
[Figure: vertex i with centrality xi, and its neighbours with centralities x1, x2, x3, x4, x5]

Basic principle = the importance of a vertex is proportional to the sum of the importances of its neighbors

Solution: eigenvectors of adjacency matrix!


Eigenvector centrality

Not all eigenvectors are good solutions!

Requirement: the values of the centrality measure have to be positive

Because of the Perron-Frobenius theorem, only the eigenvector with the largest eigenvalue (principal eigenvector) is a good solution!

The principal eigenvector can be quickly computed with the power method!
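A minimal sketch of the power method for a connected, undirected graph. One deliberate deviation is labelled in the code: each step multiplies by A + I rather than A. Both matrices have the same eigenvectors, and the shift avoids the oscillation that plain iteration can show on bipartite graphs. The function name, the iteration count and the example graph are assumptions.

```python
def eigenvector_centrality(adj, iterations=200):
    """Approximate the principal eigenvector of the adjacency matrix."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Multiply by (A + I): own value plus the sum over the neighbours.
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(value * value for value in new.values()) ** 0.5
        x = {v: value / norm for v, value in new.items()}
    return x


# Hypothetical graph: a triangle 0-1-2 with an extra vertex 3 attached to 0.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
x = eigenvector_centrality(adj)
print(x)
```

Vertex 0 receives the largest value, and vertex 3 the smallest, although it is attached to the most central vertex.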
Statistical characterisation
Statistical characterization
Degree distribution
• List of degrees k1,k2,…,kn: not very useful!

Histogram:
nk = number of vertices with degree k
Distribution:
P(k) = nk/n = probability that a randomly chosen vertex has degree k
Cumulative distribution:
P>(k) = probability that a randomly chosen vertex has degree at least k
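The three quantities on this slide follow directly from the list of degrees. A short sketch with a made-up degree list; the function name `degree_distribution` is an assumption.

```python
from collections import Counter


def degree_distribution(degrees):
    """Return P(k) and the cumulative P>(k) (degree at least k) as dicts."""
    n = len(degrees)
    counts = Counter(degrees)                      # n_k for every degree k
    pk = {k: counts[k] / n for k in sorted(counts)}
    cumulative = {k: sum(p for kk, p in pk.items() if kk >= k) for k in pk}
    return pk, cumulative


degrees = [1, 1, 1, 2, 2, 3]   # hypothetical list k1, ..., kn
pk, cumulative = degree_distribution(degrees)
print(pk)
print(cumulative)
```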

Statistical characterization
Cumulative degree distribution

Conclusion: power laws and exponentials can be easily identified
Statistical characterization
Multipoint degree correlations
P(k): not enough to characterize a network

Large degree vertices tend to connect to large degree vertices
Ex: social networks

Large degree vertices tend to connect to small degree vertices
Ex: technological networks
Statistical characterization
Multipoint degree correlations
Measure of correlations:
P(k’,k’’,…k(n)|k): conditional probability that a vertex of degree k is connected to vertices of degree k’, k’’,…

Simplest case:
P(k’|k): conditional probability that a vertex of degree k is connected to a vertex of degree k’
often inconvenient (statistical fluctuations)
Statistical characterization
Multipoint degree correlations
Practical measure of correlations:
Average degree of nearest neighbors

ki=4
knn,i=(3+4+4+7)/4=4.5
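The average degree of the nearest neighbours is a plain mean over the degrees of a vertex's neighbours. The sketch below first repeats the slide's arithmetic and then applies the same idea to a small made-up graph; the function name `knn` is an assumption.

```python
def knn(adj, i):
    """Average degree of the nearest neighbours of vertex i."""
    return sum(len(adj[j]) for j in adj[i]) / len(adj[i])


# The slide's example: a vertex whose 4 neighbours have degrees 3, 4, 4 and 7.
print(sum([3, 4, 4, 7]) / 4)

# Hypothetical graph: hub 0 linked to 1, 2 and 3, plus the edge 1-2.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(knn(adj, 0), knn(adj, 3))
```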
Typical correlations
❖ Assortative behaviour: growing knn(k)
Example: social networks
Large sites are connected with large sites

❖ Disassortative behaviour: decreasing knn(k)
Example: internet
Large sites connected with small sites, hierarchical structure

Random vs. ScaleFree


Summary
⦿ Many real-life, large-scale networks exhibit a scale-free distribution of connectivity
⦿ Distribution is power-law
● Similar powers for networks of different types
● Small-world phenomenon
⦿ Key features to enable the scale-free property:
● Addition of new nodes
● Preferential attachment

Large-scale, “natural” networks

How “random” are “natural” networks? (WWW, internet, gene regulation, …)

“natural” ~ no a priori structure defined

What are the key characteristics of natural networks?

What is “Random Network”?

❖ Random network: ensemble of many possible networks:
❖ Fixed or unfixed number of vertices (dots)
❖ Fixed or unfixed number of edges (lines)
❖ Any two vertices have some probability of being connected

❖ Key notion: node connectivity
❖ connectivity = number of connections

❖ First model: Erdős & Rényi, 1959 (ER)

ER random network model


⦿ Network model: a random network between n nodes
● Fix the number of vertices to n
● For each possible connection between vertices v and u, connect with probability p
⦿ P(rank=k) = C(n-1,k) p^k (1-p)^(n-1-k) ≈ e^(-λ) λ^k / k!, with λ = p(n-1)
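In the ER model each vertex can be linked to each of the other n-1 vertices with probability p, so its number of connections follows a binomial distribution, which for large n and small p is close to a Poisson distribution with mean λ = p(n-1). The sketch below compares the two; the function names and the chosen values of n and p are assumptions for illustration.

```python
from math import comb, exp, factorial


def binomial_degree_pmf(n, p, k):
    """Probability that a vertex of an ER graph has exactly k connections."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pmf(lam, k):
    """Poisson approximation with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


n, p = 1000, 0.005
lam = p * (n - 1)
for k in range(0, 11, 2):
    print(k, binomial_degree_pmf(n, p, k), poisson_pmf(lam, k))
```

Both distributions are sharply peaked around λ, which is the sense in which every node of an ER network has approximately the same number of connections.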

ER random network model


❖ Features:
❖ Every node has approximately the same number of connections
❖ connectivity is scale-dependent! λ=λ(Ν)
❖ Tree-like!

ER model and real life


⦿ Real-life networks are scale-free:
● Connectivity follows power-law: P(k) ~ k^(-γ), γ = 2.1…4
❖ very low connection numbers are possible
ER model VS. Scale-free network
❖ ER: same average # of connections per node, tree-like
❖ SF: hubs, few nodes with large # of connections, hierarchy!

ER model VS. Scale-free network


❖ Adj. matrix of ER: ~ uniform distribution of 1’s
❖ Adj. matrix of SF: 1’s lumped in columns & rows for few nodes

[Figure: adjacency matrices, ER vs. SF]

ER model VS. Scale-free network

Random changes in edges

OR

Addition of random links


Barabasi model

❖ Goal: generation of a random network with the “scale-free” property
❖ Number of edges: not fixed
❖ Continuous growth
❖ Preferential attachment:
❖ The probability of a new node attaching to an existing one rises with the rank of that node
❖ P(attach to node V) ~ rank(V)
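A minimal sketch of growth with preferential attachment, reading “rank” as the number of connections of a node. Several details are assumptions of this example rather than statements from the slides: the network starts from a small fully connected core of m+1 nodes, every new node brings exactly m edges, and the function name `barabasi_albert` is made up.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; each new node attaches to m existing nodes."""
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on the first m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in 'stubs' once per edge end, so a uniform draw from
    # 'stubs' picks a node with probability proportional to its degree.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


edges = barabasi_albert(200, 2, seed=0)
degree = Counter(v for edge in edges for v in edge)
print(len(edges), min(degree.values()), max(degree.values()))
```

Early nodes accumulate many more connections than late ones, which is how hubs appear in this kind of model.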

Barabasi Model

❖ Produces scale-free networks
❖ Scale-free distribution is time-invariant: it stays the same as more nodes are added

Barabasi Model

❖ Removal of either assumption destroys the scale-free property:

● Without node addition with time → fully connected network after enough time

● Without preferential attachment → exponential connectivity

ER Vs. Barabasi

❖ Graph diameter:
❖ the average length of the shortest path between any two vertices

❖ For the same number of connections and nodes, ER has a larger diameter than a scale-free network

❖ No small-world in ER!

Scale-free Network features


Networks with power-law degree distribution
are highly non-uniform.

Most of the nodes have only a few links.

A few nodes with a very large number of links, which are often called hubs, hold these nodes together.
Networks with a power-law degree distribution are called scale-free.

The same distribution describes wealth under Pareto’s 20-80 law:
A few people (20%) possess most of the wealth (80%), while most of the people (80%) possess the rest (20%).

How can a scale-free network emerge?

Network growth models: start with one vertex.


How can a scale-free network emerge?
Network growth models: a new vertex attaches to existing vertices by preferential attachment: it tends to choose an existing vertex according to that vertex’s degree

In economics this is called the Matthew effect: the rich get richer. This explains the Pareto distribution of wealth.
How can a scale-free network emerge?

Network growth models: hubs emerge


(WWW: new pages link to existing, well linked pages)
Scale-free Network features
Failure = removal of a random node
Attack = removal of a highly connected node

• Robustness to random failure


• Susceptibility to deliberate attack
Scale-free Network features

❖ “Small-world” phenomenon, or
❖ “6 degrees of separation”
❖ Stanley Milgram, 1967, Psychology Today

Small-world experiment

❖ Experiment: send a package from Nebraska and Kansas (central US) to Boston, to a person the sender doesn’t know
❖ Motivation: great distance – social and geographical
❖ Only 64 of 296 packages were delivered
❖ Delivered packages: average path length ~ 6

Small-world experiment

A small number of random shortcuts can decrease the path length while still maintaining high clustering: this model “explains” the six degrees of separation in human friendship networks
Google search
❖ Brin & Page, 1998; Kleinberg, 1999

❖ Pages are ranked according to incoming links


❖ An incoming link from a high-score page is more valuable
❖ Meaning: after random clicks, a user is likely to be on a high-ranked page

❖ Prefers old, well-connected pages
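The "random clicks" reading of the ranking can be sketched as a power iteration. This is a simplified illustration, not Google's actual implementation: the function name `pagerank`, the damping value 0.85, the fixed iteration count and the handling of pages without outgoing links are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iters=100):
    """Approximate random-surfer scores by power iteration.

    links maps each page to the list of pages it links to.
    With probability `damping` the surfer follows a random outgoing link,
    otherwise jumps to a page chosen uniformly at random (assumed model).
    """
    pages = set(links) | {q for outs in links.values() for q in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # A page passes its score on in equal shares to its targets.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Page without outgoing links: spread its score over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank


if __name__ == "__main__":
    web = {"a": ["c"], "b": ["c"], "c": ["a"]}  # toy three-page web
    for page, score in sorted(pagerank(web).items()):
        print(page, round(score, 3))
```

In the toy web, page "c" collects two incoming links and ranks highest, and "a" outranks "b" because its single incoming link comes from the high-score page "c".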

Erdos & Bacon Number

⦿ Erdos number: “collaborative distance” of a mathematician from Paul Erdos
● Average: ~
● Kahneman, Aumann:
⦿ Bacon Number: “collaborative distance” of an actor from Kevin Bacon


● http://oracleofbacon.org

● Average: ~3
6

Small World Networks


Examples: Kevin Bacon Graph (KBG), Power Grid (Western US), C. elegans Worm, Infectious Disease Spreading
The Kevin Bacon Graph is the most popular example.

Kevin Bacon Graph


Validated using IMDB (www.us.imdb.com)
(150k films, 300k combined actors)
‘Nodes’ represent actors who have appeared in one or more films
An ‘edge’ connects two actors if they have appeared together in at least 1 feature film
90% of actors are part of a single ‘connected’ component
KBG* (225K actors in 110K films)

Small World Networks


Example: Spread of Infectious Disease

Type of Distributed Dynamic System


Disease spreads from a small set of initiators to a much larger
population

At time t = 0, a single infective is introduced into a healthy population

After 1 unit of time, the infective is “removed” (dies or becomes immune), but in that interval it can infect (with some probability) each of its neighbours
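The spreading rule described above can be simulated directly. The sketch below is an illustration under stated assumptions: the function name `spread`, the adjacency-dictionary representation and the ring used in the demo are not from the slides.

```python
import random


def spread(adj, p, start, seed=None):
    """Simulate the one-step-infective disease model on a network.

    adj maps each node to a list of its neighbours. Each infective node
    gets one time unit to infect each healthy neighbour with probability p,
    then it is removed. Returns the set of nodes that were ever infected.
    """
    rng = random.Random(seed)
    infective = {start}
    removed = set()
    while infective:
        newly_infected = set()
        for u in infective:
            for v in adj[u]:
                healthy = (v not in infective and v not in removed
                           and v not in newly_infected)
                if healthy and rng.random() < p:
                    newly_infected.add(v)
        removed |= infective      # infectives die or become immune
        infective = newly_infected
    return removed


if __name__ == "__main__":
    n = 100
    ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # toy network
    for p in (0.2, 0.6, 1.0):
        sizes = [len(spread(ring, p, 0, seed=s)) for s in range(200)]
        print("p =", p, "mean outbreak size:", sum(sizes) / len(sizes))
```

Swapping the ring for a network with a few random shortcuts is a simple way to explore how structure changes the outbreak size.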
Small World Networks
Example: Spread of Infectious Disease

• Three distinct regimes of behavior:
• Low-infectivity diseases
• Infect a small part of the population, then die out
• High-infectivity diseases
• Infect the entire population, as a function of ‘L’
• Mid-infectivity diseases
• Complicated relationship between structure and dynamics, not completely characterized
Modern Networks
Outline
❖ Characteristics of Modern Networks
❖ Small World & Clustering
❖ Power law Distribution
❖ Ubiquity of the Power Law
❖ Deriving the Power Law
❖ How does a network grow?
❖ The theory of preferential attachment
❖ Variations on the theme
❖ Properties of Scale Free Networks
❖ Error, attack tolerance, and epidemics
❖ Implications for modern distributed systems
❖ Implications for everyday systems
❖ Conclusions and Open Issues

Characteristics of Modern Networks


❖ Most networks
❖ Social
❖ Technological
❖ Ecological
❖ Are characterized by being
❖ Small world
❖ Clustered
❖ And SCALE FREE (Power law distribution)
❖ We now have to understand
❖ What is the power law distribution
❖ And how we can model it in networks

Regular Lattice Networks


❖ Nodes are connected in a regular neighborhood
❖ They are usually k-regular, with a fixed number k of edges per node
❖ They do not exhibit the small world characteristic
❖ The average distance between nodes grows with the d-th root of n, where n is the number of nodes and d the dimension of the lattice
❖ They may exhibit clustering
❖ Depending on the lattice and on the k factor, neighbor nodes are also somehow connected with each other

Random Networks
❖ Random networks have randomly connected edges
❖ If the number of edges is M, each node has an average of k=2M/n edges, where n is the number of nodes
❖ They can exhibit the small world characteristic
❖ The average distance between nodes grows as log(n), where n is the number of nodes
❖ They do not exhibit clustering
❖ The clustering factor is about C=k/n for large n

Small World Networks


❖ Watts and Strogatz (1999) propose a model for networks “between order and chaos”
❖ Such that
❖ The network exhibits the small world characteristic, as random networks do
❖ And at the same time exhibits relevant clustering, as regular lattices do
❖ The model is built by simply
❖ Re-wiring at random a small percentage of the regular edges
❖ This is enough to dramatically shorten the average path length, without destroying clustering
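The re-wiring idea can be tried out with a short sketch. It is a simplified variant written for illustration, not the authors' exact procedure: the function names, the ring lattice as the starting point and the parameter `beta` (re-wiring probability) are assumptions.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular ring on nodes 0..n-1: each node linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, beta, seed=None):
    """Move one endpoint of each lattice edge to a random node with probability beta."""
    rng = random.Random(seed)
    n = len(adj)
    new = {u: set(vs) for u, vs in adj.items()}
    for u in range(n):
        for v in sorted(adj[u]):
            if u < v and rng.random() < beta:
                candidates = [w for w in range(n) if w != u and w not in new[u]]
                if candidates:
                    w = rng.choice(candidates)
                    new[u].discard(v)
                    new[v].discard(u)
                    new[u].add(w)
                    new[w].add(u)
    return new


def clustering(adj):
    """Average fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        nb = sorted(nbrs)
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for i in range(k) for j in range(i + 1, k) if nb[j] in adj[nb[i]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def avg_path_length(adj):
    """Average shortest-path length over all reachable ordered pairs (BFS from every node)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    lattice = ring_lattice(200, 4)
    for beta in (0.0, 0.05, 1.0):
        g = rewire(lattice, beta, seed=1)
        print("beta =", beta, "C =", round(clustering(g), 3),
              "L =", round(avg_path_length(g), 2))
```

Printing the clustering C and path length L for a few values of `beta` makes it easy to compare the regular, small-world and fully random regimes.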

The Degree Distribution


❖ What is the degree distribution?
❖ It is the way the various edges of the network are “distributed” across the vertices
❖ How many edges connect the various vertices of the network

For the previous types of networks

❖ In k-regular lattices, the degree distribution is constant
❖ $P(k_r)=1$ for all nodes (all nodes have the same fixed number $k_r$ of edges)
❖ In random networks, the distribution can be either constant or exponential
❖ $P(k_r)=1$ for all nodes (if the random network has been constructed as a k-regular network)
❖ $P(k)=\alpha e^{-\beta k}$, that is the “normal”, exponentially decaying distribution, as derived from the fact that edges are independently added at random

The Power Law Distribution

❖ Most real networks, instead, follow a “power law” distribution for the node connectivity
❖ In general terms, a probability distribution is “power law” if
❖ The probability P(k) that a given variable k has a specific value
❖ Decreases proportionally to k to the power -γ, where γ is a constant value
❖ For networks, this implies that
❖ The probability for a node to have k edges connected
❖ Is proportional to $k^{-\gamma}$

$P(k) = \alpha k^{-\gamma}$

Power vs. Exponential Distribution


[Figure: P(k) versus k on linear axes, k from 1 to 30. The exponential distribution decays exponentially; the power law distribution decays as a polynomial.]
Is there really a substantial difference? Let’s see the same distributions on a log-log figure…

Power vs. Exponential Distribution


The Heavy Tail
❖ The power law distribution implies an “infinite variance” (for γ ≤ 3)
❖ The “area” of “big ks” in an exponential distribution tends to zero very quickly as k grows!
❖ This is not true for the power law distribution, implying an infinite variance
❖ The tail of the distribution counts!!
❖ In other words, the power law implies that
❖ The probability to have elements very far from the average is not negligible
❖ The big numbers count
❖ Using an exponential distribution
❖ The probability for a Web page to have more than 100 incoming links, considering the average number of links per page, would be on the order of $10^{-20}$
❖ which contradicts the fact that we know a lot of “well linked” sites…
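The weight of the tail can be checked numerically. The sketch below uses continuous approximations of both distributions; the function names, the mean of 2.2 links per page and the exponent 2.1 are hypothetical values picked only to illustrate the orders of magnitude.

```python
import math


def exponential_tail(x, mean):
    """P(K > x) for a continuous exponential distribution with the given mean."""
    return math.exp(-x / mean)


def power_law_tail(x, gamma, k_min=1.0):
    """P(K > x) for a continuous power law p(k) ~ k**(-gamma), k >= k_min, gamma > 1."""
    return (x / k_min) ** (-(gamma - 1.0))


if __name__ == "__main__":
    mean_links = 2.2   # hypothetical average number of links per page
    gamma = 2.1        # hypothetical power law exponent
    for x in (10, 100, 1000):
        print("k >", x,
              "exponential:", exponential_tail(x, mean_links),
              "power law:", power_law_tail(x, gamma))
```

With these assumed values the exponential tail beyond 100 links is vanishingly small, while the power law still leaves a visible share of pages there.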

The Power Law in Real Networks


[Table: real networks with their average k and power law exponents]
The Ubiquity of the Power Law
❖ The previous table includes not only technological networks
❖ Most real systems and events have a probability distribution that
❖ Does not follow the “normal” distribution
❖ And obeys a power law distribution
❖ Examples, in addition to technological and social networks
❖ The distribution of the size of files in file systems
❖ The distribution of network latency in the Internet
❖ The networks of protein interactions (a few proteins exist that interact with a large number of other proteins)
❖ The power of earthquakes: statistical data tell us that the power of earthquakes follows a power-law distribution
❖ The size of rivers: the size of rivers in the world follows a power law
❖ The size of industries, i.e., their overall income
❖ The wealth of people
❖ In these examples, the exponent of the power law distribution is always around 2.
❖ The power law distribution is the “normal” distribution for complex systems (i.e., systems of interacting autonomous components)
❖ We will see later how it can be derived…

The 20-80 Rule


❖ It’s a common saying
❖ But it has scientific foundations
❖ For all those systems that follow a power law distribution
❖ Examples
❖ 20% of the Web sites get 80% of the visits (actual data: 15%-85%)
❖ 20% of the Internet routers handle 80% of the total Internet traffic
❖ 20% of world industries hold 80% of the world’s income
❖ 20% of the world population consumes 80% of the world’s resources
❖ 20% of the Italian population held 80% of the land (that was true before the Mussolini fascist regime, when land re-distribution occurred)
❖ 20% of the earthquakes cause 80% of the victims
❖ 20% of the rivers in the world carry 80% of the total fresh water
❖ 20% of the proteins handle 80% of the most critical metabolic processes
❖ Does this derive from the power law distribution? YES!
The 20-80 Rule Unfolded


❖ The 20% of the population
❖ Remember the area represents the amount of population in the distribution
❖ Gets the 80% of the resources
❖ In fact, it can be found that the “amount of resources” (i.e., the amount of links in the network) is the integral of P(k)*k, which is nearly linear
❖ I know you have paid attention and would say the “25-75” rule, but remember there are bold approximations…
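How the split depends on the exponent can be worked out in closed form for an idealized case. The sketch assumes a continuous power law $p(k) \propto k^{-\gamma}$ with $\gamma > 2$ and no upper cut-off: the top fraction q of nodes then holds a fraction $q^{(\gamma-2)/(\gamma-1)}$ of all links. The function name `top_share` and the exponents tried are illustrative choices.

```python
def top_share(q, gamma):
    """Fraction of all links held by the top fraction q of nodes.

    Assumes a continuous power law p(k) ~ k**(-gamma) for k >= k_min,
    with gamma > 2 so that the average number of links is finite.
    """
    if gamma <= 2.0:
        raise ValueError("gamma must be > 2 for a finite mean")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    return q ** ((gamma - 2.0) / (gamma - 1.0))


if __name__ == "__main__":
    for gamma in (2.1, 2.16, 2.5, 3.0):  # illustrative exponents
        share = top_share(0.2, gamma)
        print("gamma =", gamma, "-> top 20% of nodes hold",
              round(100 * share, 1), "% of the links")
```

Under these assumptions the split is close to 20-80 only for exponents slightly above 2; larger exponents give a less extreme concentration, which is one of the "bold approximations" behind the rule.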

Hubs and Connectors

❖ Scale free networks exhibit the presence of nodes that
❖ Act as hubs, i.e., as points to which most of the other nodes connect
❖ Act as connectors, i.e., nodes that make a great contribution in getting great portions of the network together
❖ “Smaller nodes” exist that act as hubs or connectors for local portions of the network
❖ This may have notable implications, as detailed below

Why “Scale-Free” Networks


❖ Why are networks following a power law distribution for links called “scale free”?
❖ Whatever the scale at which we observe the network
❖ The network looks the same, i.e., it looks similar to itself
❖ The overall properties of the network are preserved independently of the scale
❖ In particular
❖ If we cut off the details of a network – skipping all nodes with a limited number of links – the network will preserve its power-law structure
❖ If we consider a sub-portion of any network, it will have the same overall structure as the whole network

What do Scale Free Networks Look Like?

Web Cache Network

What do Scale Free Networks Look Like?

Protein Network

What do Scale Free Networks Look Like?

The Internet Routers

Fractals and Scale Free Networks


❖ Nature is mostly made up of “fractal objects”
❖ The term “fractal” derives from the fact that they have a non-integer dimension
❖ 2-d objects have a “size” (i.e., a surface) that scales with the square of the linear size: $A = kL^2$
❖ 3-d objects have a “size” (i.e., a volume) that scales with the cube of the linear size: $V = kL^3$
❖ Fractal objects have a “size” that scales with some fractional power of the linear size: $S = kL^{a/b}$
❖ Fractal objects have the property of being “self-similar” or “scale-free”
❖ Their “appearance” is independent of the scale of observation
❖ They are similar to themselves independently of whether you look at them from near or from far
❖ That is, they are scale-free

Examples of Fractals

❖ The Koch snowflake
❖ Coastal Regions & River systems
❖ Lymphatic systems
❖ The distribution of masses in the universe

Scale Free Networks are Fractals?

❖ Yes, in fact:
❖ They are the same at whatever dimension we observe them
❖ Also, the fact that they grow according to a power law can be considered as a sort of fractal dimension of the network
❖ Having a look at the figures clarifies the analogy

Barabasi Model
Summarizing

❖ The Barabasi-Albert model is very powerful in explaining the structure of modern networks, but has some limitations
❖ With the proper extensions (re-wiring, node aging and link costs, fitness)
❖ It can capture the structure of modern networks
❖ The “rich get richer” phenomenon
❖ As well as “the winner takes it all” phenomena
❖ In the extreme case, when fitness and node re-wiring are allowed, it may happen that the network degenerates with a single node that attracts all links (monopolistic networks)
❖ Still, a proper unifying and sound model is missing

Growing Networks

❖ In general, networks are not static entities
❖ They grow, with the continuous addition of new nodes
❖ The Web, the Internet, acquaintances, the scientific literature, etc.
❖ Thus, edges are added in a network with time
❖ The probability that a new node connects to another existing node may depend on the characteristics of the existing node
❖ This is not simply a random process of independent node addition
❖ But there could be “preferences” in adding an edge to a node
❖ E.g., Google, a well known and reliable Internet router, a cool guy who knows many girls, a famous scientist,
❖ All of these could attract more links…

Evolving Networks
❖ More in general
❖ Networks grow AND
❖ Networks evolve
❖ The evolution may be driven by various forces
❖ Connection age
❖ Connection satisfaction
❖ What matters is that connections can change during the life of the network
❖ Not necessarily in a random way
❖ But following characteristics of the network
❖ Let’s start with the growing process…

Preferential Attachment

❖ Barabasi and Albert show that
❖ Making a network grow with new nodes that
❖ Enter the network in successive times
❖ Attach preferentially to nodes that already have many links
❖ Leads to a network structure that is
❖ Small world
❖ Clustered
❖ And Power-law: the distribution of links on the network nodes obeys the power law distribution
❖ Let’s call this the “BA model”

The Preferential Attachment Algorithm


❖ Start with a limited number of initial nodes $m_0$
❖ At each time step, add a new node that has $m$ edges ($m \le m_0$) that link to $m$ existing nodes in the system
❖ When choosing the nodes to which to attach, assume a probability $\Pi$ for a node $i$ proportional to the number $k_i$ of links already attached to $i$: $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
❖ After $t$ time steps, the network will have $n = t + m_0$ nodes and $M = mt$ edges

This leads to a power law network!
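The algorithm can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `ba_graph` is made up, the very first node picks its targets uniformly (all initial nodes still have zero links, so the probability is undefined there), and the m targets of a new node are forced to be distinct, which slightly perturbs the exact proportionality.

```python
import random


def ba_graph(t, m, m0=None, seed=None):
    """Grow a network for t time steps with preferential attachment.

    Starts from m0 isolated nodes (default m0 = m) and returns the edge
    list; the result has t + m0 nodes and m * t edges.
    """
    rng = random.Random(seed)
    m0 = m if m0 is None else m0
    if not 1 <= m <= m0:
        raise ValueError("need 1 <= m <= m0")
    edges = []
    # Node i appears k_i times in `stubs`, so a uniform draw from the list
    # selects node i with probability k_i / sum_j k_j.
    stubs = []
    for step in range(t):
        new = m0 + step
        if not stubs:
            # Assumption: with no links yet, choose the first targets uniformly.
            targets = set(rng.sample(range(m0), m))
        else:
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(stubs))
        for v in sorted(targets):
            edges.append((new, v))
            stubs.extend((new, v))
    return edges


if __name__ == "__main__":
    edges = ba_graph(t=5000, m=3, seed=1)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    print("nodes:", len(degree), "edges:", len(edges))
    print("largest hubs:", sorted(degree.values(), reverse=True)[:5])
```

Keeping a list of edge endpoints is a common trick: it avoids recomputing the sum of all degrees at every step.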

Probability Density for a Random Network

❖ In a random network model, each new node that attaches to the network attaches its edges independently of the current situation
❖ Thus, all the events are independent
❖ The probability for a node to have a certain number of edges attached is thus a “normal”, exponential, distribution
❖ It can be easily found, using standard statistical methods, that:

$P(k) = \frac{1}{m} e^{-k/m}$

Barabasi-Albert Model vs. Random Network Model


❖ See the difference for the evolution of the Barabasi-Albert model vs. the Random Network model (from Barabasi and Albert 2002)

[Figure, left panel: Barabasi-Albert model, n=80000. Simulations performed with various values of m]
[Figure, right panel: Random network model for n=10000. The degree distribution gradually becomes a normal one with passing time]
0

Generality of the Barabasi-Albert Model

❖ In its simplicity, the BA model captures the essential characteristics of a number of phenomena
❖ In which events determining the “size” of the individuals in a network
❖ Are not independent from each other
❖ Leading to a power law distribution
❖ So, it can somewhat explain why the power law distribution is as ubiquitous as the normal Gaussian distribution
❖ Examples
❖ Gnutella: a peer which has been there for a long time has already collected a strong list of acquaintances, so that any new node has a higher probability of becoming aware of it
❖ Rivers: the older and bigger a river is, the more likely it is to break the path of a new river and get its water, thus becoming even bigger
❖ Industries: the bigger an industry is, the greater its capability to attract clients and thus become even bigger
❖ Earthquakes: big stresses in the earth’s plates can absorb the effects of small earthquakes, thus increasing the stress further. A stress that will eventually end up in a dramatic earthquake
❖ Richness: the richer I am, the more I can exploit my money to make new money → “RICH GET RICHER”

Additional Properties of the


Barabasi-Albert Model

❖ Characteristic Path Length
❖ It can be shown (but it is difficult) that the BA model has a length proportional to log(n)/log(log(n))
❖ Which is even shorter than in random networks
❖ And which is often in accord with – but sometimes underestimates – experimental data
❖ Clustering
❖ There are no analytical results available
❖ Simulations show that in scale-free networks the clustering decreases as the network order increases
❖ As in random graphs, although a bit less
❖ This is not in accord with experimental data!

Problems of the
Barabasi Albert Model (1)

❖ The BA model is a nice one, but is not fully satisfactory

❖ The BA model does not give satisfactory answers with regard to clustering
❖ While the small world model of Watts and Strogatz does
❖ So, there must be something wrong with the model.
❖ The BA model predicts a fixed exponent of 3 for the power law
❖ However, real networks show exponents between 1 and
❖ So, there must be something wrong with the model.

Problems of the
Barabasi Albert Model (2)

❖ An additional problem is that real networks are not “completely” power law
❖ They exhibit a so-called exponential cut-off
❖ After having obeyed the power law for a large range of k
❖ For very large k, the distribution suddenly becomes exponential
❖ The same sometimes happens for
❖ In general
❖ The distribution is still “heavy tailed” if compared to a standard exponential distribution
❖ However, such a tail is not infinite
❖ This can be explained because
❖ The number of resources (i.e., of links) that an individual (i.e., a node) can sustain (i.e., can properly handle) is often limited
❖ So, there can be no individual that can sustain an arbitrarily large number of resources
❖ Vice versa, there could be a minimal amount of resources a node can have
❖ The Barabasi-Albert model does not predict this

Exponential Cut-offs in Gnutella

❖ Gnutella is a network with exponential cut-offs
❖ That can be easily explained
❖ A node cannot connect to the network without having a minimal number of connections
❖ A node cannot sustain an excessive number of TCP connections

Variations on the Barabasi-Albert Model: Non-linear Preferential Attachments

❖ One can consider non-linear models for preferential attachment
❖ E.g. $\Pi(k) \propto k^{\alpha}$

❖ However, it can be shown that these models destroy the power-law nature of the
network

Variations on the Barabasi-Albert Model: Evolving Networks

❖ The problems of the BA model may depend on the fact that networks not only grow but also evolve
❖ The BA model does not account for evolutions following the growth
❖ Which may indeed be frequent in real networks, otherwise
❖ Google would have never replaced Altavista
❖ All new routers in the Internet would be unimportant ones
❖ A scientist would never have the chance of becoming a highly-cited one
❖ A sound theory of evolving networks is still missing
❖ Still, we can start from the BA model and adapt it to somehow account for network evolution
❖ And obtain a slightly more realistic model

Variations on the Barabasi-Albert Model: Edges Re-Wiring

❖ By coupling the model for node additions
❖ Adding new nodes at new time intervals
❖ One can consider also mechanisms for edge re-wiring
❖ E.g., adding some edges at each time interval
❖ Some of these can be added randomly
❖ Some of these can be added based on preferential attachment
❖ Then, it is possible to show (Albert and Barabasi, 2000)
❖ That the network evolves as a power law with an exponent that can vary between 2 and infinity
❖ This enables explaining the various exponents that are measured in real networks

Variations on the Barabasi-Albert Model: Aging and Cost


❖ One can consider that, in real networks (Amaral et al., 2000)
❖ Link cost
❖ The cost of hosting new links increases with the number of links
❖ E.g., for a Web site this implies adding more computational power, for a router this means buying a new powerful router
❖ Node Aging
❖ The possibility of hosting new links decreases with the “age” of the node
❖ E.g. nodes get tired or out-of-date
❖ These two models explain the “exponential cut-off” in power law networks

Variations on the Barabasi-Albert Model: Fitness


❖ One can consider that, in real networks
❖ Not all nodes are equal, but some nodes “fit” better specific network characteristics
❖ E.g. Google has a more effective algorithm for page indexing and ranking
❖ A new scientific paper may indeed be a breakthrough
❖ In terms of preferential attachment, this implies that
❖ The probability for a node of attracting links is proportional to some fitness parameter µ
❖ See the formula below
❖ It can be shown that the fitness model for preferential attachment enables even very young nodes to attract a lot of links

$\Pi(k_i) = \frac{\mu_i k_i}{\sum_j \mu_j k_j}$
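A small numeric sketch shows how fitness can let a young node compete with an old hub. The function name `attachment_probabilities` and the sample degrees and fitness values are hypothetical, chosen only to illustrate the formula.

```python
def attachment_probabilities(degrees, fitness=None):
    """Return Pi_i = mu_i * k_i / sum_j (mu_j * k_j) for every node.

    With fitness=None every mu_i is 1, which gives plain preferential
    attachment, Pi_i = k_i / sum_j k_j.
    """
    if fitness is None:
        fitness = [1.0] * len(degrees)
    if len(fitness) != len(degrees):
        raise ValueError("degrees and fitness must have the same length")
    weights = [mu * k for mu, k in zip(fitness, degrees)]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one node needs a positive weight")
    return [w / total for w in weights]


if __name__ == "__main__":
    degrees = [10, 1]  # an old hub and a newcomer (hypothetical)
    print("no fitness:  ", attachment_probabilities(degrees))
    print("with fitness:", attachment_probabilities(degrees, fitness=[1.0, 20.0]))
```

Without fitness the hub attracts most new links; with a much higher fitness the newcomer becomes the more likely target despite having a single link.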

Properties of Scale Free Networks


Error Tolerance
❖ Scale free networks are very robust to errors
❖ If nodes randomly “break” or disconnect from the network
❖ The structure of the network, with high probability, will not be significantly affected by such errors
❖ At most, only a few small clusters of nodes will disconnect from the network
❖ The average path length remains the same

[Figure axis: Characteristic Path Length]

Attack Tolerance
❖ Scale free networks are very sensitive to targeted attacks
❖ If the most connected nodes get deliberately chosen as targets of attacks
❖ The average path length of the network grows very soon
❖ It is very likely that the network will soon break into disconnected clusters
❖ Although these independent clusters still preserve some internal connections

[Figure axis: Characteristic Path Length]
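Error and attack tolerance can be compared with a small experiment. The sketch below is an illustration under stated assumptions: it builds a preferential-attachment network with a simplified rule (the first m nodes start with unit weight), removes a fraction of nodes either at random or by highest degree, and measures the largest connected cluster instead of the path length. All function names and parameter values are made up for the example.

```python
import random


def preferential_graph(n, m, seed=None):
    """Small scale-free-style network on nodes 0..n-1 as adjacency sets."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    stubs = list(range(m))  # assumption: the m seed nodes start with weight 1
    for new in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for v in sorted(targets):
            adj[new].add(v)
            adj[v].add(new)
            stubs.extend((new, v))
    return adj


def largest_component_fraction(adj, removed):
    """Size of the largest connected cluster left, as a fraction of the original nodes."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adj)


def random_failure(adj, fraction, seed=None):
    """Failure: pick a fraction of the nodes uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(adj), int(fraction * len(adj)))


def targeted_attack(adj, fraction):
    """Attack: pick the most highly connected nodes first."""
    by_degree = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    return by_degree[: int(fraction * len(adj))]


if __name__ == "__main__":
    g = preferential_graph(2000, 2, seed=1)
    for f in (0.01, 0.05, 0.20):
        fail = largest_component_fraction(g, random_failure(g, f, seed=2))
        attack = largest_component_fraction(g, targeted_attack(g, f))
        print("removed", f, "-> failure:", round(fail, 3), "attack:", round(attack, 3))
```

The largest-cluster fraction is used here because it is cheap to compute; tracking the characteristic path length, as in the figures, would need a shortest-path computation after every removal.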

Error and Attack Tolerance: Random vs. Scale Free Networks


❖ Let us compare how these types of networks evolve in the presence of errors and attacks

For very limited errors/attacks, both networks preserve the connected structure

For increasing, but still very limited errors/attacks:
The random network breaks
The scale free network breaks if the errors are targeted attacks!
The scale free network preserves its structure if the errors are random

For relevant errors/attacks:
The random network breaks into very small clusters
The scale free network does the same if the errors are targeted attacks!
The scale free network preserves a notably connected structure if the errors are random

[Figure axis: increasing percentage of node errors/attacks]

Epidemics and Percolation in Scale Free Networks (1)


❖ The percolation threshold pc determines
❖ The percentage of nodes that must be connected in a network for the network to form a single connected cluster
❖ Or, the (1-pc) percentage of nodes that must be disconnected to have the network break into disconnected clusters
❖ Clearly, this is the same as saying
❖ The percentage (1-pc) of nodes that must be immune to an infection for the infection not to become a “giant” one
❖ In fact
❖ If the percentage (1-pc) of immune nodes is able to block the spreading of an infection
❖ This implies that if these nodes were disconnected from the network, they would significantly break the network into a set of independent clusters
❖ This understood, what can be said about epidemics in scale free networks?

Epidemics and Percolation in Scale Free Networks (2)


❖ Given that a scale-free network
❖ In the presence of even a large amount of random errors
❖ Does not significantly break into clusters (see the figure two slides before)
❖ This implies that the percolation threshold pc in scale free networks is practically zero
❖ There is no way to stop infections in random nodes even when a large percentage of the population is immune to them!!!
❖ On the other hand
❖ If we are able to make immune the most connected nodes
❖ Breaking the network into independent clusters
❖ That is, if the immune nodes are not selected at random but in the most effective way
❖ Then, in this case, we can stop infections in a very effective way!

Implications for Distributed Systems: Internet Viruses and Routers’ Faults

❖ There is practically no way to break the spread of Internet viruses
❖ Except by immunizing the most relevant “hub” routers
❖ The structure of the Internet is very robust in the presence of router faults
❖ Several routers can fail, and they do every day, without causing significant partitionings of the network
❖ At the same time
❖ If very important “hub” routers fail, the whole network can suddenly become disconnected
❖ E.g., the destruction of World-Trade-Center routers – acting as main hubs for Europe-America connections – on September 11

Implications for Distributed Systems: Web Visibility

❖ How can we make our Web site a success?
❖ We must make sure that it is connected (incoming links especially) from a relevant number of important sites
❖ Search engines, clearly, but also all our clients
❖ This will increase the probability of it becoming more and more visible
❖ We must make sure that it has “fitness”
❖ What added value does it carry?
❖ Can such added value increase its probability of preferential attachment?
❖ However, we must always consider that random processes still play an important role

Implications for Everyday Systems: Scale Free Networks and Trends


❖ Who decides what is “in” and what is “out” in music, fashion, etc.?
❖ How can an industry have its products become “in”?
❖ Industries spend a lot of money in trying to influence the market
❖ A lot of commercial advertising, a lot of “free trials”, etc.
❖ Still, many new products fail and never have market success
❖ Recently, a few innovative industries have tried to study the structure of social networks
❖ And have understood that to launch a new product it is important to identify the “hubs” of the social network
❖ And have these hubs act as the engine for the launch of the product
❖ To this end, their commercial strategy considers
❖ Recruiting and paying people of the social layer they want to influence
❖ Sending these people to discos, pubs, etc.
❖ And identifying the “hubs” (i.e., the smart guy in the pub who knows everybody, is friendly and has a lot of women)
❖ After which, paying such identified hubs to support the product (e.g., wearing a new pair of shoes)
❖ Nike did this by giving free shoes in suburban basketball camps in the US
❖ Thus conquering the Afro-American market

Implications for Everyday Systems: Scale Free Networks and Terrorism

❖ The network of terrorism is growing
❖ And it is a social network with a scale free structure
❖ How can we destroy such a network?
❖ Getting unimportant nodes will not significantly affect the network
❖ Getting the right nodes, i.e., the hubs (as Bin Laden) is extremely important
❖ But it may be very difficult to identify and get the hubs
❖ In any case, even if we get the right nodes, other connected clusters will remain that will likely act in any case
❖ As far as breaking the information flow among terrorists
❖ This is very difficult because of the very low percolation threshold

Conclusions and Open Issues


❖ In the modern “complex networks” theory
❖ Neither small world nor scale free networks capture all essential properties of real networks (and of real systems)
❖ However, both models capture some interesting properties
❖ In the future, we expect
❖ More theories to emerge
❖ And more analysis on the dynamic properties of these types of networks (i.e., of what happens when there are processes running over them) to be performed
❖ This will be of great help to
❖ Better predict and engineer the networks themselves and the distributed applications that have to run over them
❖ Apply phenomena of self-organization in nature (mostly occurring in space) to complex networks in a reliable and predictable way
