Communities and Dynamics
in Social Networks
Francisco Restivo
fjr@fe.up.pt
slideshare.net/frestivo
Topics
• Graphs and social networks
• Some metrics
• Communities
• Dynamics
• Fraud
• Software
• etc
SSIIM, 2015/09/22
Zachary karate club
Networks
• Networks are everywhere
• Social, biological, financial, etc
• Complex networks
• Communities reveal properties of networks
• Contagion
• Controversies
Euler 1707 - 1783
Social networks
• Internet changed everything
• Social interactions
• Sharing
• e-Commerce
• Payments
• Digital marketing
• Political marketing
• etc
Network growth
How teens communicate
Basics of graphs and networks
• G = (V, E)
• O(G) = |V| order
• S(G) = |E| size
• A adjacency matrix
• k_i degree of vertex i
• Directed/undirected
Representation of networks
• Matrices, graphs, edge lists, etc.
Adjacency matrix:
  A B C D E
A 0 1 1 1 0
B 1 0 1 0 1
C 0 0 0 1 0
D 0 1 1 0 0
E 1 1 0 0 0
Edge list:
A B
A C
A D
B A
B C
B E
C D
D B
D C
E A
E B
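The two representations above carry the same information. A minimal sketch of the conversion, assuming each edge-list line reads as a (source, target) pair and each matrix row is a source vertex; the variable names are illustrative only:

```python
# Directed edge list from the slide, read as (source, target) pairs.
edges = [("A", "B"), ("A", "C"), ("A", "D"),
         ("B", "A"), ("B", "C"), ("B", "E"),
         ("C", "D"),
         ("D", "B"), ("D", "C"),
         ("E", "A"), ("E", "B")]

vertices = sorted({v for edge in edges for v in edge})
index = {v: i for i, v in enumerate(vertices)}

# Adjacency matrix: matrix[i][j] == 1 when there is an edge i -> j.
matrix = [[0] * len(vertices) for _ in vertices]
for source, target in edges:
    matrix[index[source]][index[target]] = 1

order = len(vertices)                     # O(G) = |V|
size = sum(sum(row) for row in matrix)    # S(G) = |E|

# Row sums give out-degrees, column sums give in-degrees.
out_degree = {v: sum(matrix[index[v]]) for v in vertices}
in_degree = {v: sum(row[index[v]] for row in matrix) for v in vertices}

print("  " + " ".join(vertices))
for v in vertices:
    print(v, " ".join(str(x) for x in matrix[index[v]]))
print("order:", order, "size:", size)
```

For a directed graph the degree k_i splits into an in-degree and an out-degree, which is why both are computed here.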
Global metrics
• Number of vertices: 5
• Number of edges: 11
• Number of components: 1
• Diameter: 2
• Density: 0.55
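These values can be checked by hand on the five-vertex example. The sketch below assumes the directed edge list from the representation slide and uses plain breadth-first search; the helper names are hypothetical. Density is taken as |E| / (|V| (|V| - 1)), the usual definition for a directed simple graph. A diameter of 2 is obtained when edge direction is ignored; following edge direction, the longest shortest path (from C to A or E) has length 3.

```python
from collections import deque

edges = [("A", "B"), ("A", "C"), ("A", "D"),
         ("B", "A"), ("B", "C"), ("B", "E"),
         ("C", "D"),
         ("D", "B"), ("D", "C"),
         ("E", "A"), ("E", "B")]
vertices = sorted({v for edge in edges for v in edge})


def build_adjacency(edge_list, nodes, directed=True):
    """Map each vertex to the set of vertices reachable in one step."""
    adj = {v: set() for v in nodes}
    for source, target in edge_list:
        adj[source].add(target)
        if not directed:
            adj[target].add(source)
    return adj


def bfs_distances(adj, start):
    """Shortest-path lengths (in hops) from start to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Longest shortest path; assumes every vertex can reach every other."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


def count_components(undirected_adj):
    """Number of connected components of an undirected adjacency map."""
    seen, count = set(), 0
    for v in undirected_adj:
        if v not in seen:
            seen |= set(bfs_distances(undirected_adj, v))
            count += 1
    return count


n, m = len(vertices), len(edges)
density = m / (n * (n - 1))
directed_adj = build_adjacency(edges, vertices, directed=True)
undirected_adj = build_adjacency(edges, vertices, directed=False)

print("vertices:", n, "edges:", m)
print("components:", count_components(undirected_adj))
print("diameter (ignoring direction):", diameter(undirected_adj))
print("diameter (following direction):", diameter(directed_adj))
print("density:", density)
```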
Common Tasks
• Measuring “importance”
– Centrality, prestige (incoming links)
• Link prediction
• Diffusion modeling
– Epidemiological
• Clustering
– Blockmodeling, Girvan-Newman, Chinese whispers
• Structure analysis
– Motifs, Isomorphisms, etc.
• Visualization/Privacy/etc.
from Eytan Adar slides
Centrality Measures
• Degree centrality
– Edges per node (the more, the more important the node)
• Closeness centrality
– How close the node is to every other node
• Betweenness centrality
– How many shortest paths go through the node (communication metaphor)
• Information centrality
– All paths to other nodes weighted by path length
• Bibliometric + Internet style
– PageRank
from Eytan Adar slides
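As an illustration of two of these measures, the sketch below computes in-degree (prestige) and PageRank on the five-vertex directed example from the representation slide. The power-iteration form, the damping factor of 0.85, and the fixed iteration count are conventional choices assumed here, not taken from the slides.

```python
edges = [("A", "B"), ("A", "C"), ("A", "D"),
         ("B", "A"), ("B", "C"), ("B", "E"),
         ("C", "D"),
         ("D", "B"), ("D", "C"),
         ("E", "A"), ("E", "B")]
vertices = sorted({v for edge in edges for v in edge})
out_links = {v: [t for s, t in edges if s == v] for v in vertices}

# Prestige as the number of incoming links.
in_degree = {v: sum(1 for _, t in edges if t == v) for v in vertices}


def pagerank(links, damping=0.85, iterations=100):
    """Plain power iteration; links maps each vertex to its out-neighbours."""
    n = len(links)
    rank = {v: 1.0 / n for v in links}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in links}
        for v, targets in links.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A vertex without out-links spreads its rank evenly.
                for t in links:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


rank = pagerank(out_links)
for v in sorted(rank, key=rank.get, reverse=True):
    print(v, "in-degree:", in_degree[v], "PageRank:", round(rank[v], 3))
```

Vertex E has a single incoming link and ends up with the lowest PageRank. Vertices with equal in-degree can still receive different scores, because PageRank also weighs how important the linking vertices are.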
Champions League PageRank
Community detection
• Communities and clusters are different
• Network data is related to graph properties
• Real world data is big
Community detection algorithms
• Clauset-Newman-Moore
• Wakita-Tsurumi
• Girvan-Newman
• Chinese whispers
• Link communities
• etc
Modularity
• Compares the number of edges inside communities with the number
expected in a random network
• Maximizing Q is NP-hard
$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - P_{ij} \right) \delta(g_i, g_j), \qquad P_{ij} = \frac{k_i k_j}{2m}$$
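A direct way to see what modularity measures is to evaluate Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(g_i, g_j) on a small graph. The sketch below assumes a simple undirected graph; the function name and the toy graph (two triangles joined by one edge) are illustrative only.

```python
def modularity(edges, community_of):
    """Q for a simple undirected graph and a vertex -> community mapping."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Store both orientations so that A_ij is symmetric.
    adjacent = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    q = 0.0
    for i in degree:
        for j in degree:
            if community_of[i] == community_of[j]:   # delta(g_i, g_j)
                a_ij = 1 if (i, j) in adjacent else 0
                q += a_ij - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]

two_groups = {0: "left", 1: "left", 2: "left",
              3: "right", 4: "right", 5: "right"}
one_group = {v: "all" for v in range(6)}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print("two triangles:", q_two)    # 5/14, about 0.357
print("single community:", q_one)
```

Placing every vertex in one community gives Q = 0 (up to rounding), since the observed and expected edge counts then cancel exactly; splitting along the bridge gives a positive value.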
Clauset-Newman-Moore
A hierarchical agglomeration algorithm for detecting community
structure which is faster than many competing algorithms.
Its running time on a network with n vertices and m edges is
O(md log n) where d is the depth of the dendrogram describing the
community structure.
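The agglomeration idea can be sketched without the efficient data structures that give CNM its running time. The version below is a naive illustration, not the CNM implementation: it starts with every vertex in its own community, repeatedly merges the connected pair with the largest modularity gain ΔQ = 2 (e_ij − a_i a_j), and stops at the first point where no merge increases Q. The function name and the toy graph are assumptions for the example.

```python
def greedy_modularity(edges):
    """Naive greedy agglomeration on a simple undirected graph."""
    m = len(edges)
    nodes = sorted({v for edge in edges for v in edge})
    communities = {v: {v} for v in nodes}   # community id -> members
    degree_sum = {v: 0 for v in nodes}      # community id -> total degree
    for u, v in edges:
        degree_sum[u] += 1
        degree_sum[v] += 1
    # links[(a, b)] = number of edges between communities a and b, a < b.
    links = {}
    for u, v in edges:
        key = (min(u, v), max(u, v))
        links[key] = links.get(key, 0) + 1

    while True:
        best_gain, best_pair = 0.0, None
        for (a, b), count in links.items():
            e_ab = count / (2 * m)
            gain = 2 * (e_ab - degree_sum[a] * degree_sum[b] / (2 * m) ** 2)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break                            # no merge improves Q
        a, b = best_pair
        communities[a] |= communities.pop(b)
        degree_sum[a] += degree_sum.pop(b)
        # Relabel b as a and drop edges that are now internal.
        merged = {}
        for (x, y), count in links.items():
            x = a if x == b else x
            y = a if y == b else y
            if x != y:
                key = (min(x, y), max(x, y))
                merged[key] = merged.get(key, 0) + count
        links = merged
    return sorted(sorted(members) for members in communities.values())


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
result = greedy_modularity(toy_edges)
print(result)    # [[0, 1, 2], [3, 4, 5]]
```

Only pairs of communities that share at least one edge are considered, because merging unconnected communities can never increase Q.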
Wakita-Tsurumi
The CNM algorithm does not scale well, and its use is practically limited to
networks of up to 500,000 nodes.
A simple heuristic that attempts to merge community structures in a
balanced manner can dramatically improve community structure
analysis.
Girvan-Newman
Many networks share the property of community structure, in which
network nodes are joined together in tightly knit groups, between
which there are only looser connections.
We propose a method for detecting such communities, built around
the idea of using centrality indices to find community boundaries.
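One step of this idea can be sketched as follows: compute the betweenness of every edge, remove the edge with the highest value, and repeat until the graph falls apart into more components. The code below is a simplified illustration for unweighted, undirected graphs that stops after the first split, whereas the full method keeps going to build a hierarchy. The function names and the toy graph are assumptions for the example.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness (each vertex pair counted twice)."""
    score = {}
    for s in adj:
        dist, paths = {s: 0}, {s: 1}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:                       # BFS counting shortest paths
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    paths[w] = 0
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    paths[w] += paths[u]
                    preds[w].append(u)
        credit = {v: 0.0 for v in order}
        for w in reversed(order):          # accumulate from the farthest vertex
            for u in preds[w]:
                c = paths[u] / paths[w] * (1.0 + credit[w])
                edge = (min(u, w), max(u, w))
                score[edge] = score.get(edge, 0.0) + c
                credit[u] += c
    return score


def components(adj):
    """Connected components as sorted lists of vertices."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in group:
                    group.add(w)
                    stack.append(w)
        seen |= group
        result.append(sorted(group))
    return result


def first_split(edges):
    """Remove highest-betweenness edges until the component count grows."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start_count = len(components(adj))
    removed = []
    while len(components(adj)) == start_count:
        score = edge_betweenness(adj)      # recomputed after every removal
        if not score:
            break                          # no edges left to remove
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append((u, v))
    return components(adj), removed


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
groups, removed = first_split(toy_edges)
print("removed:", removed)    # [(2, 3)]
print("groups:", groups)      # [[0, 1, 2], [3, 4, 5]]
```

The bridge 2-3 lies on every shortest path between the two triangles, so it has the highest betweenness and is removed first.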
Chinese whispers [Biemann]
A randomized graph-clustering algorithm that runs in time linear in the
number of edges.
It can be viewed as a simulation of an agent-based social network.
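A minimal sketch of the procedure, assuming an unweighted, undirected graph (the original algorithm also uses edge weights): every node starts with its own label, and in each pass the nodes, visited in random order, adopt the most frequent label among their neighbours, with ties broken at random. The function name, iteration count, and seed are illustrative choices.

```python
import random
from collections import Counter


def chinese_whispers(edges, iterations=20, seed=0):
    """Return a node -> label mapping after a fixed number of passes."""
    rng = random.Random(seed)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    label = {v: v for v in adj}        # every node starts in its own class
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)             # random visiting order in each pass
        for v in nodes:
            counts = Counter(label[w] for w in adj[v])
            top = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == top)
            label[v] = rng.choice(candidates)   # random tie-break
    return label


# Two separate triangles: labels can only travel along edges.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
labels = chinese_whispers(toy_edges)
print(labels)
```

Each pass touches every edge a constant number of times, which is where the linear running time per iteration comes from. Because the process is randomized, different seeds can give different clusterings on less clear-cut graphs.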
Link communities [Ahn et al]
Communities in networks often overlap such that nodes
simultaneously belong to several groups.
Meanwhile, many networks are known to possess hierarchical
organization, where communities are recursively grouped into a
hierarchical structure.
Dynamics
• Networks have a temporal dimension
• Interactions – follow, like, share, mention,
retweet, hashtag, etc – occur in sequence
• Network properties evolve in time
Impact of bots
• The use of bots is increasing
• On Twitter, one in 20 active accounts is fake
• On Facebook, one in 100 active accounts is
estimated to be fake
• Better auditing algorithms are needed
Controversies
Software Tools
• NodeXL
• Gephi
• NetworkX
• Meerkat
• netvizz
• d3.js
• API
Datasets
I keep my collection here:
https://sites.google.com/site/frestivo/networked-life/databases
There is another on Quora:
Where can I find large datasets open to the public?
My collection of papers: http://tinyurl.com/qzjp6rg
Thank you!
