Social Networks
Weekly Breakdown:
● Week 1: Introduction
● Week 2: Handling Real-world Network Datasets
● Week 3: Strength of Weak Ties
● Week 4: Strong and Weak Relationships (Continued) & Homophily
Further exploration of tie strength and the concept of homophily (people connecting
due to similarity).
engineerstudyhub.in
Week 1: Introduction
Overview of social network
A social network is a structure made up of individuals or organizations (called nodes)
connected by one or more specific types of interdependency, such as friendship,
communication, or influence (called edges or links).
Social network analysis (SNA) is a method used to study these relationships and
understand how information, behaviors, or trends spread within a network.
Key concepts
SNA is a broad topic, but these are some of the essential terms, concepts, and theories
you need to know to understand how it works. In Social Network Analysis (SNA), the
network is represented using a graph, which consists of two main components: nodes
and edges.
1. Nodes (Vertices):
● For example:
● For example:
○ A follower link on Twitter (directed edge).
Example:
● Nodes: A, B, C
● Edges:
○ A is friends with B
○ B is friends with C
Nodes and edges are the fundamental building blocks of social networks. They help in
visualizing and analyzing relationships and interactions within any social system.
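As a small illustration, the three-node example above can be stored as an adjacency list. This is only a sketch in Python; the helper name `build_adjacency` is an assumption made for illustration and does not come from the course material.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Return an undirected adjacency list: node -> set of neighbours."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # friendship is mutual, so store both directions
    return dict(adjacency)

# Edges from the example: A is friends with B, B is friends with C.
friendships = [("A", "B"), ("B", "C")]
network = build_adjacency(friendships)
print(network)
```

For a directed relationship such as a follower link, only the first `add` call would be kept.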
Network types
In Social Network Analysis, a network is a collection of nodes (also called vertices) and
edges (also called links) that represent entities and the relationships or interactions
between them.
Types of Networks
Networks can be classified into different types based on direction, weight, and structure:
○ Example: A → B means A follows B.
Networks and their types help researchers understand various forms of relationships
and behaviors in social, technological, and biological systems. Choosing the right type of
network is crucial for effective analysis.
Week 2: Handling Real-world Network Datasets
Introduction to Dataset
❖ Real-world network datasets are crucial for understanding the structure and
dynamics of various interconnected systems in the world.
❖ These datasets represent entities as nodes (vertices) and relationships between
them as edges (links).
❖ The analysis of such datasets is essential for applications in fields like social
media analysis, biological networks, transportation, and more.
❖ Real-world networks can range from simple connections like social media
followers to complex biological interactions.
Ingredient Network
This type of network helps understand food pairings, common ingredient combinations,
and trends in culinary preferences.
Key features:
● Nodes: Ingredients
Synonym Network
A synonym network consists of words or phrases that are related through synonyms. In
this network, nodes represent words, and edges represent the synonym relationship
between them.
A synonym network helps understand language structure and is useful in NLP tasks like
text summarization, machine translation, and sentiment analysis.
Key features:
Web Graph
● A web graph shows how web pages (nodes) are linked through hyperlinks (edges).
● Search engines use it to crawl the web and to rank pages with algorithms like Google's
PageRank.
● It is also used in SEO to understand how websites are connected and to improve search
ranking.
Key features:
In this network, nodes represent people or entities, and edges represent social
connections such as friendships, follows, or collaborations.
Social networks are used in applications such as social media analysis, recommendation
systems, and influence modeling.
Key features:
Datasets: Different Formats
● JSON (JavaScript Object Notation): A lightweight data format used for storing
data in key-value pairs.
● Excel (XLS/XLSX): A spreadsheet format for tabular data with more advanced
features.
Each format has its uses depending on the data and how it's processed.
Connectedness:
● A connected network means there is a path (direct or indirect) between any two
nodes.
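A connectedness check can be sketched with a breadth-first search. The example assumes an undirected network stored as a dictionary in which every node appears as a key; the function name `is_connected` is hypothetical.

```python
from collections import deque

def is_connected(adjacency):
    """Return True if every node can be reached from every other node."""
    if not adjacency:
        return True  # an empty network is treated as connected here
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    # Connected means the search from one node reached all nodes.
    return len(seen) == len(adjacency)

connected_net = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
split_net = {"A": {"B"}, "B": {"A"}, "C": set()}
print(is_connected(connected_net), is_connected(split_net))
```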
Emergence:
● "Emergence" refers to how a complex structure or pattern arises from simpler
individual interactions.
Importance:
Connectedness is how things link together to form a system. Here's a short summary:
1. Social Networks: People or groups are connected, forming networks with a few
important nodes (influencers) and many less connected ones.
2. Biological Networks: In nature, things like cells or animals are connected to help
them survive, often without any leader.
3. Technology Networks: The internet connects billions of devices, with key hubs
like popular websites.
4. Mathematics: Graph theory helps study how things are connected and how
networks grow.
❖ Granovetter explained how weak ties (casual acquaintances) can be more helpful than
strong ties (close friends and family) in some situations.
❖ Weak ties act as bridges between different social groups and help spread new
information.
❖ People with only strong ties are often part of the same circle, so they usually know
the same things.
❖ But weak ties connect us to new people and new opportunities, like jobs, ideas,
or resources.
❖ For example, many people hear about new jobs through weak ties, not their closest friends.
❖ Granovetter showed that information travels faster and wider through weak
ties.
❖ This idea is now used in networking, social media, marketing, and job hunting.
Triads, clustering coefficient and neighborhood overlap
Triads
Clustering Coefficient
● The clustering coefficient measures how tightly knit a node’s friends are.
● It shows the likelihood that two friends of a person are also friends with each
other.
● Mathematically:
● A high clustering coefficient means a tightly connected group (like close friend
circles).
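One common way to write the local clustering coefficient of a node with k neighbours is C = 2e / (k(k - 1)), where e is the number of links among those neighbours. The sketch below assumes an undirected network stored as a dictionary of neighbour sets; the function name and the sample data are illustrative assumptions.

```python
from itertools import combinations

def clustering_coefficient(adjacency, node):
    """Fraction of pairs of the node's neighbours that are themselves linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two friends: no pair to check
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
print(clustering_coefficient(friends, "A"))
```

Here A has three friends, and only one of the three possible pairs (B and C) is linked.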
Neighborhood Overlap
● High overlap suggests the connection is in a tight community, while low overlap
may indicate a bridge between different communities (like a weak tie).
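A minimal sketch of neighborhood overlap, assuming the usual definition: the number of neighbors shared by both endpoints of a tie, divided by the number of nodes that are neighbors of at least one endpoint (not counting the endpoints themselves). The function name and data are hypothetical.

```python
def neighborhood_overlap(adjacency, a, b):
    """Shared neighbours of a and b over all neighbours of either one."""
    common = adjacency[a] & adjacency[b]
    either = (adjacency[a] | adjacency[b]) - {a, b}
    if not either:
        return 0.0
    return len(common) / len(either)

ties = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
print(neighborhood_overlap(ties, "A", "B"))  # tie inside a tight group
print(neighborhood_overlap(ties, "A", "D"))  # tie with no mutual friends
```

An overlap of zero means the two endpoints have no mutual friends, which is the situation described for a local bridge.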
2. Bridges
○ A bridge is a connection (tie) between two nodes (people) that connects
two otherwise separate groups.
○ Bridges are essential for the flow of new information across a network.
○ A local bridge is a tie where the two connected individuals have no mutual
friends.
○ Local bridges are usually weak ties, and they play a crucial role in finding
new jobs, opportunities, or knowledge.
Validation of Granovetter's Theory Using Cell Phone Data
● Researchers used mobile phone call records from millions of users to analyze
social networks at a large scale.
● Each call and text created a link (tie) between users — with frequency and
duration indicating tie strength.
● The researchers studied the relationship between tie strength and the diversity
of contacts.
● They found that people who communicated more frequently (strong ties) often
had similar social circles.
● Weak ties, in contrast, helped spread information more broadly across the network,
just as Granovetter had predicted.
● One famous study by Onnela et al. (2007) used mobile phone data and showed
that removing weak ties caused a large drop in overall network connectivity.
Embeddedness
● Embedded ties often offer strong trust and support but may lack new or diverse
information.
Structural Holes (by Ronald Burt)
● A structural hole is a gap between two social groups that are not directly
connected.
● A person who connects two disconnected groups acts as a bridge across the
structural hole.
● This position gives them information and control advantages, because they can
access non-redundant information from both sides.
● People who span structural holes are often innovators, leaders, or influencers
because they see things others can’t.
Social Capital
● Social capital refers to the resources and benefits people get from their social
relationships.
● Social capital is built through networks, norms, and trust that facilitate
cooperation.
● Strong ties offer emotional support; weak ties and bridges offer access to new
information and opportunities — both types contribute to social capital.
Tie Strength
● Tie strength refers to how close and active a relationship is between two people.
● Weak ties: Acquaintances, old colleagues – less frequent interaction, but broader
reach.
● Tie strength affects how information, support, and influence spread in networks.
Social Media and Tie Strength
● Social media platforms like Facebook, Instagram, Twitter, and LinkedIn allow us
to maintain both strong and weak ties.
● You interact more deeply with strong ties (likes, comments, DMs).
● But you stay connected with weak ties through occasional updates – and these
ties are often useful for new opportunities (jobs, trends, ideas).
● Social media blurs the line between strong and weak ties – people can quickly
reconnect or build new ties.
Passive Engagement
● Passive engagement is a common way to maintain weak ties, even without direct messaging or
commenting.
● Passive engagement helps people stay updated about others’ lives, which keeps
weak ties alive and relevant.
● Studies show even passive interaction can influence emotions, social comparison,
and information flow.
Betweenness Measures
● Betweenness centrality measures how often a node (person) lies on the shortest
path between other nodes in a network.
● A node with high betweenness acts like a bridge or connector between different
parts of the network.
● Such nodes often have influence or control over information flow because they
connect otherwise separate groups.
● Use case: Helps find key influencers, network bottlenecks, or important
connectors in social networks, transport, or communication systems.
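Betweenness can be computed by counting shortest paths with a breadth-first search from every node. The sketch below follows the well-known Brandes approach for an unweighted, undirected network; it is an illustration under those assumptions, not the course's own code, and the function name is hypothetical.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in adjacency}
        paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    paths[neighbour] += paths[node]
                    predecessors[neighbour].append(node)
        # Walk back from the farthest nodes, accumulating path shares.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                dependency[pred] += paths[pred] / paths[node] * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]
    # Every undirected pair was counted from both of its endpoints.
    return {node: value / 2 for node, value in score.items()}

path_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
centrality = betweenness_centrality(path_graph)
print(centrality)
```

In the chain A-B-C-D, the two middle nodes lie on every path that crosses the chain, so they score highest.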
Graph Partitioning
● Popular methods:
● In graph theory, a community is a set of nodes within a graph that are more
densely connected to each other than to nodes outside the community.
● Identifying communities helps in analyzing complex networks like social media,
biological networks, and organizational structures.
● The Brute Force Method is one of the simplest techniques to detect communities
in a graph, though it may not always be the most efficient for large graphs.
○ For an undirected graph with n nodes, there are 2^n subsets (including the
empty set and the entire set of nodes).
○ The more edges that exist between nodes inside the subset, the stronger
the community.
○ For each subset, evaluate the external connectivity, which is the number of
edges that connect nodes in the subset to nodes outside it.
○ A good community will have many internal edges and few external edges.
○ Among all subsets, identify those that have high modularity (i.e., high
internal connectivity and low external connectivity).
○ By removing these edges, the graph will naturally split into separate
components, which correspond to different communities.
○ After calculating the betweenness centrality for all edges, remove the edge
with the highest betweenness.
○ This edge is assumed to be the most significant link between two different
communities.
○ After removing the edge, recalculate the betweenness centrality for all
remaining edges, as the removal of one edge might change the shortest
paths.
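The edge-removal steps above (compute edge betweenness, remove the top edge, recompute) are commonly known as the Girvan-Newman method. The sketch below performs one split on an unweighted, undirected graph without self-loops; all function names are assumptions made for illustration.

```python
from collections import deque

def edge_betweenness(adjacency):
    """Shortest-path load carried by each edge (pairs counted from both ends)."""
    score = {}
    for source in adjacency:
        order, queue = [], deque([source])
        distance, paths, predecessors = {source: 0}, {source: 1}, {source: []}
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    paths[neighbour] = 0
                    predecessors[neighbour] = []
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    paths[neighbour] += paths[node]
                    predecessors[neighbour].append(node)
        dependency = {node: 0.0 for node in order}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = paths[pred] / paths[node] * (1 + dependency[node])
                edge = frozenset((pred, node))
                score[edge] = score.get(edge, 0.0) + share
                dependency[pred] += share
    return score

def components(adjacency):
    """List the connected components as sets of nodes."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in group:
                    group.add(neighbour)
                    queue.append(neighbour)
        seen |= group
        groups.append(group)
    return groups

def split_once(adjacency):
    """Remove highest-betweenness edges until one more component appears."""
    graph = {node: set(neighbours) for node, neighbours in adjacency.items()}
    target = len(components(graph)) + 1
    while len(components(graph)) < target:
        scores = edge_betweenness(graph)  # recomputed after every removal
        if not scores:
            break  # no edges left to remove
        u, v = max(scores, key=scores.get)
        graph[u].discard(v)
        graph[v].discard(u)
    return components(graph)

two_groups = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
communities = split_once(two_groups)
print(communities)
```

In this example the single tie between C and D carries every path between the two triangles, so it is removed first.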
Homophily is the tendency of individuals to associate and bond with others who are
similar to themselves.
The phrase "birds of a feather flock together" captures this idea.
Types of Homophily:
1. Status Homophily: Based on social status (age, gender, education, religion, etc.)
2. Value Homophily: Based on shared beliefs, attitudes, or values.
Effects of Homophily:
◆ Being surrounded only by similar people can restrict your growth and exposure
to new ideas.
◆ Diverse networks (including weak ties and people from different backgrounds) offer
new perspectives and opportunities.
◆ It’s important to balance comfort with diversity in your social and professional
circles.
Selection and Social Influence
Selection
● Example: A student who enjoys studying may befriend others who also study
seriously.
Social Influence
● Over time, individuals in a group may adopt similar habits, attitudes, or beliefs.
● Example: A person might start exercising regularly if their friends are into fitness.
Interplay between Selection and Social Influence
● Homophily is the tendency of individuals to form ties with others who are similar
to themselves.
Types of Homophily
Measurement of Homophily:
● Assortativity Coefficient:
○ Measures the tendency of nodes to connect with others of the same type
or attribute.
● Foci closure refers to the idea that people form connections because they share a
common focus (or activity/place).
● A focus is any shared context like a school, workplace, club, gym, or online group.
● If person A and person B both go to the same gym (focus), they are more likely to
become friends.
● Two parents who meet at their child’s school may become friends because the
school acts as the common focus.
Membership Closure
Fatman Model in Social Networks
The Fatman Model was introduced by Alain Barrat, Marc Barthélemy, and Alessandro
Vespignani in 2004.
It explains the growth and evolution of social networks by considering two main
factors:
2. Fitness
○ Nodes with higher fitness values are more likely to receive more
connections over time.
Fat-Tailed Distribution
● The model is named the Fatman Model because the fitness values follow a
fat-tailed distribution.
● In this distribution:
Applications
○ Transportation networks
● It has shown that networks with a fat-tailed fitness distribution are more robust
and realistic compared to networks with uniform distributions.
Flow of Fatman Evolutionary Model
○ Run the model over time to simulate the network’s natural growth and
changes.
7. Repeat:
The Fatman Model effectively captures the dual impact of popularity (preferential
attachment) and personal attributes (fitness) on social network growth. It provides
insights into how structure, robustness, and inequalities emerge in real-world social
systems.
● To identify triads, check if two nodes (A and B) are connected to a common third
node (C).
● This can be done by finding all pairs of neighbors that share a common neighbor.
● A simple triadic closure index can be defined as the ratio of actual closed triads to
the total number of possible triads in a network.
Triadic Closure Index (TCI) = (Number of closed triads) / (Total number of
possible triads)
● A triad is closed when all three nodes (A, B, and C) are directly connected to each
other.
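The index can be sketched in code under one stated assumption: a "possible triad" is taken to be any pair of neighbors of a common node, and it counts as closed when those two neighbors are also linked. Other definitions of "possible" would give different values; the function name is hypothetical.

```python
from itertools import combinations

def triadic_closure_index(adjacency):
    """Closed neighbour pairs divided by all neighbour pairs, over every node."""
    closed = 0
    possible = 0
    for centre, neighbours in adjacency.items():
        for a, b in combinations(sorted(neighbours), 2):
            possible += 1  # a and b share the common neighbour `centre`
            if b in adjacency[a]:
                closed += 1  # a and b are also linked, so the triad is closed
    return closed / possible if possible else 0.0

friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
tci = triadic_closure_index(friends)
print(tci)
```

Here five neighbor pairs exist in total and three of them are closed by the triangle A, B, C.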
Week 5: Positive and Negative Relationships
Spatial Segregation: Simulation of the Schelling Model
❖ The Schelling model is a simple simulation that shows how individual preferences can
lead to segregation in society.
❖ Each person (or agent) wants to live near people similar to them (based on religion,
community, etc.).
❖ Even with a small preference for similarity, the result can be large-scale segregation
over time.
❖ The simulation takes place on a grid where agents of different types are randomly
placed.
❖ An agent checks its neighborhood — if it’s unhappy (not enough similar neighbors),
it moves to a new empty spot.
❖ This process repeats until most or all agents are satisfied with their neighborhood.
❖ Over time, this leads to clustering — similar agents group together, creating
segregated zones.
❖ The model shows how local individual choices can lead to unintentional large-scale
separation.
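The procedure described above can be sketched as a small grid simulation. The grid size, share of empty cells, similarity threshold, and all function names are assumptions chosen for illustration. As a simplification, agents found unhappy at the start of a round all move in that round.

```python
import random

def neighbours(grid, row, col):
    """Contents of the up to eight cells surrounding (row, col)."""
    size = len(grid)
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if (dr or dc) and 0 <= r < size and 0 <= c < size:
                cells.append(grid[r][c])
    return cells

def is_unhappy(grid, row, col, threshold):
    """An agent is unhappy if too few occupied neighbours share its type."""
    agent = grid[row][col]
    occupied = [cell for cell in neighbours(grid, row, col) if cell is not None]
    if not occupied:
        return False  # no neighbours at all: nothing to be unhappy about
    similar = sum(1 for cell in occupied if cell == agent)
    return similar / len(occupied) < threshold

def step(grid, threshold, rng):
    """Move every currently unhappy agent to a random empty cell."""
    size = len(grid)
    unhappy = [(r, c) for r in range(size) for c in range(size)
               if grid[r][c] is not None and is_unhappy(grid, r, c, threshold)]
    empty = [(r, c) for r in range(size) for c in range(size) if grid[r][c] is None]
    rng.shuffle(unhappy)
    for r, c in unhappy:
        if not empty:
            break
        target = empty.pop(rng.randrange(len(empty)))
        grid[target[0]][target[1]] = grid[r][c]
        grid[r][c] = None
        empty.append((r, c))  # the vacated cell becomes available
    return len(unhappy)

def simulate(size=10, empty_share=0.2, threshold=0.3, max_steps=50, seed=0):
    rng = random.Random(seed)
    cells = size * size
    n_empty = int(cells * empty_share)
    n_x = (cells - n_empty) // 2
    agents = ["X"] * n_x + ["O"] * (cells - n_empty - n_x) + [None] * n_empty
    rng.shuffle(agents)
    grid = [agents[i * size:(i + 1) * size] for i in range(size)]
    for _ in range(max_steps):
        if step(grid, threshold, rng) == 0:
            break  # everyone is satisfied
    return grid

final_grid = simulate()
for row in final_grid:
    print("".join(cell or "." for cell in row))
```

Printing the grid before and after a run is a simple way to see clusters of X and O forming.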
Negative Relationships:
Structural Balance
● Definition:
Structural balance is a concept from social network theory that examines the
stability of relationships in a network, especially in triads (groups of three nodes).
● Basic Idea:
It focuses on whether the pattern of positive (friendship) and negative (hostility)
relationships in a triad creates harmony or tension.
● Balanced Triads:
A triad is considered balanced if:
Statement:
A complete signed graph is said to be balanced if and only if:
all edges in the graph are positive, or the set of nodes can be divided into two mutually
hostile groups such that every edge within a group is positive and every edge between the
two groups is negative.
1. Consider a complete signed graph where all triangles (triads) are balanced.
2. According to balance theory, only the triads with signs (+ + +) or (+ – –) are
balanced.
3. Pick any node A and classify the rest of the nodes based on their relationship with
A:
5. Hence, the graph satisfies the condition of being divided into two friendly groups
with negative relationships between them.
(⇐) If the graph satisfies the above condition, then it is balanced:
2. Any triangle (triad) in the graph will fall into one of two categories:
○ All nodes from one group → edges are all positive → triad is balanced.
○ Two nodes from one group and one from the other → edges form a (+ – –)
pattern → also balanced.
Therefore, a complete signed graph is balanced if and only if it satisfies the conditions
stated in the balance theorem.
In signed graphs, each edge between nodes carries a sign—either positive (+) or
negative (–)—to represent the type of relationship between the connected nodes.
● They are essential in analyzing structural balance, conflict resolution, and group
dynamics
Example:
● If two edges are negative (–) and one is positive (+): Still considered structurally
balanced under balance theory.
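Because a triad is balanced exactly when it has an even number of negative edges, the check reduces to the product of the three signs being positive. The sketch below assumes a complete signed graph whose signs are stored as +1 or -1 per pair of nodes; the names are hypothetical.

```python
from itertools import combinations

def is_balanced_triad(sign_ab, sign_bc, sign_ac):
    """Balanced means (+ + +) or (+ - -): the product of signs is positive."""
    return sign_ab * sign_bc * sign_ac > 0

def is_balanced(nodes, sign):
    """Check every triad of a complete signed graph."""
    for a, b, c in combinations(nodes, 3):
        if not is_balanced_triad(sign[frozenset((a, b))],
                                 sign[frozenset((b, c))],
                                 sign[frozenset((a, c))]):
            return False
    return True

people = ["A", "B", "C", "D"]
# Two friendly groups, {A, B} and {C, D}, that dislike each other.
two_camps = {
    frozenset(("A", "B")): +1, frozenset(("C", "D")): +1,
    frozenset(("A", "C")): -1, frozenset(("A", "D")): -1,
    frozenset(("B", "C")): -1, frozenset(("B", "D")): -1,
}
# Same graph, but C and D are now hostile too, which creates (- - -) triads.
all_hostile_pair = dict(two_camps)
all_hostile_pair[frozenset(("C", "D"))] = -1
print(is_balanced(people, two_camps), is_balanced(people, all_hostile_pair))
```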
Week 6: Link Analysis
The Web Graph
The Web Graph is a directed graph that represents the structure of the World Wide Web.
In this graph:
● Edges (or links) represent hyperlinks from one web page to another.
This graph is very large, complex, and dynamic, and is crucial in understanding how web
pages are connected, helping in search engine indexing, ranking algorithms, and web
crawling.
Problem Definition:
Given an array where each element represents the number of coins a person has, the goal
is to redistribute the coins such that every person has the same number of coins.
Example:
Equal distribution is only possible if the total number of coins is divisible by the
number of people.
● Each person or node passes a coin to a randomly chosen neighbor at each step.
3. Model Explanation:
● Initial Setup: A set of nodes, each having some coins (possibly uneven).
● Random Process: At each time unit, a node gives a coin to one of its neighbors
chosen uniformly at random.
The core idea is: a page is important if it is linked to by other important pages.
The Web Graph helps model the entire web as a directed graph. In this graph:
● The PageRank algorithm computes a ranking score for each page using the
structure of this graph.
● Tᵢ = pages linking to A
4. Steps Involved:
DegreeRank
Definition:
Types:
Formula:
● Simple to calculate.
Limitations:
PageRank
Definition:
Formula:
Where:
● N = total pages
● Tᵢ = pages linking to A
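A commonly used form of the PageRank formula that is consistent with the symbols above is shown here as a sketch. The damping factor $d$ (often set near 0.85) and the out-link count $C(T_i)$ of page $T_i$ are symbols assumed for this sketch, since they are not defined in the list above.

```latex
PR(A) = \frac{1 - d}{N} + d \sum_{i} \frac{PR(T_i)}{C(T_i)}
```

Each linking page $T_i$ passes on its own score divided by the number of links it contains, which is why a link from an important page with few outgoing links counts for more.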
Advantages:
● Considers both quantity and quality of links.
Limitations:
DegreeRank PageRank
Low High
No Yes
Week 7: Cascading Behaviour in Networks
Diffusion in Networks
Diffusion in networks refers to the process by which something (information,
influence, disease, innovation, or behavior) spreads through the nodes and edges of a
network. It models how entities interact and propagate effects through a connected
structure.
Types of Diffusion:
Modeling Diffusion
Common Models for Modeling Diffusion:
● Each active node gets one chance to activate each of its inactive neighbors with a
certain probability (p).
● A node becomes active when the total influence from its active neighbors exceeds
its threshold.
3. Epidemic Models:
● Nodes that connect different communities (called bridge nodes) play a key role in
spreading diffusion across communities.
● Highly modular networks with strong community divisions may limit the overall
reach of diffusion.
● For example, a viral tweet may spread quickly among students but take time to
reach professionals unless someone shares it across communities.
1. A cascade occurs when a small initial action (like one node becoming active)
triggers a chain reaction in which many other nodes become active.
2. Cascades spread across the network as each node influences others to also
activate.
3. The effectiveness of a cascade depends on the network structure and the
strength of connections between nodes.
4. Cascades can be self-propagating, where one event leads to another, eventually
impacting a large portion of the network.
5. External triggers or initial seed nodes are crucial in starting a cascade.
6. Cascades can sometimes fail if the network structure doesn’t allow enough
influence to spread.
7. Example: A viral marketing campaign where a single influential user’s post
encourages many others to share it.
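A cascade of this kind can be sketched with the independent cascade idea: each newly active node gets one chance to activate each inactive neighbor with probability p. The function name, the seeded random generator, and the sample chain are assumptions made for illustration.

```python
import random

def independent_cascade(adjacency, seeds, p, rng=None):
    """Return the set of nodes that end up active."""
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for node in frontier:
            # Each newly active node tries each inactive neighbour once.
            for neighbour in sorted(adjacency[node]):
                if neighbour not in active and rng.random() < p:
                    active.add(neighbour)
                    newly_active.append(neighbour)
        frontier = newly_active
    return active

chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(independent_cascade(chain, ["A"], 1.0))  # every attempt succeeds
print(independent_cascade(chain, ["A"], 0.0))  # the cascade fails at once
```

Values of p between 0 and 1 give cascades of varying size, so results are usually averaged over many runs.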
Clusters
1. Clusters are groups of tightly connected nodes in a network, where most nodes
are directly or indirectly connected to each other.
2. Clusters are often formed based on similar characteristics or shared interests,
creating communities within the larger network.
3. In a clustered network, nodes are more likely to interact with others in the same
cluster than with nodes in different clusters.
4. Clusters can act as barriers to diffusion, especially if they have fewer connections
to other clusters (i.e., weak inter-cluster connections).
5. A strong cluster can resist external influence, making it difficult for information
or behaviors to spread beyond the cluster.
6. The size and density of a cluster affect how easily diffusion can move across it.
7. Example: In social media, users who share similar interests form clusters, and
trends within one cluster may not immediately spread to other clusters without
key influencers.
2. The spread of knowledge can happen through interactions between nodes or
through media, like social networks or educational platforms.
4. In some cases, specialized knowledge might only be accessible to certain nodes,
creating a knowledge gap in the network.
5. Example: A new research paper can spread through academic networks, where
experts share and discuss the findings.
6. Knowledge diffusion is often non-linear, meaning it may not spread evenly across
the network.
7. The spread of knowledge can be accelerated by influential nodes (e.g., experts or
thought leaders) in the network.
Thresholds
1. Thresholds refer to the minimum level of influence a node requires from its
neighbors to take action or adopt a behavior.
2. High thresholds mean a node requires a lot of influence from others to adopt
something, while low thresholds mean it needs less influence.
3. Nodes with low thresholds are more likely to adopt behaviors or spread
information quickly, while those with high thresholds may resist adoption until a
critical mass of neighbors adopts.
4. The average threshold in a network impacts the speed and extent of diffusion
across the network.
5. Thresholds are often used in social influence models like the Linear Threshold
Model (LTM).
6. Example: A person might need to see at least five of their friends using a new app
(threshold) before deciding to download it.
7. Network effects interact with thresholds: each additional neighbor who adopts increases
the likelihood that a node reaches its own threshold and adopts.
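A threshold rule can be sketched as follows. This simplified version assumes every neighbor has equal weight, so a node adopts once the fraction of its neighbors that are active reaches its threshold; the names and sample values are hypothetical.

```python
def linear_threshold(adjacency, thresholds, seeds):
    """Repeat until no further node reaches its threshold."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbours in adjacency.items():
            if node in active or not neighbours:
                continue
            # Influence is the share of this node's neighbours that are active.
            influence = len(neighbours & active) / len(neighbours)
            if influence >= thresholds[node]:
                active.add(node)
                changed = True
    return active

group = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
low = {node: 0.3 for node in group}
high = {node: 0.5 for node in group}
print(linear_threshold(group, low, {"A"}))
print(linear_threshold(group, high, {"A"}))
```

With low thresholds a single adopter is enough to convert the whole group, while with higher thresholds the behavior never spreads beyond the seed.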
Collective Action
1. Collective action refers to the efforts of multiple individuals in a network coming
together to achieve a common goal.
3. The success of collective action is influenced by incentives, social influence, and
group dynamics.
4. Free rider problems can emerge, where some individuals benefit from the actions
of others without contributing themselves.
5. In social networks, collective action is often triggered by shared interests, like a
social cause, political movement, or protest.
6. Example: A group of users in a social media campaign work together to spread
awareness about an environmental issue.
7. The critical mass of participants is essential for collective action to succeed, as it
generates the momentum needed to drive change.
Week 8: Link Analysis (Continued)
Hubs and Authorities
Link Analysis involves analyzing the structure of links between nodes to identify
important hubs and authorities in a network, which is particularly useful in search
engine ranking and recommendation systems.
Hubs are nodes (web pages, individuals, etc.) that have a large number of outgoing links
to other nodes (web pages or resources).
● Example: A webpage that links to many other pages within a topic or domain.
Authorities are nodes that receive many incoming links from hubs, indicating that they
are considered important or authoritative on a particular subject.
● Conservation refers to the idea that the total PageRank score across all pages in a
network remains constant or conserved.
● At each iteration, a page's PageRank is divided equally among its outgoing links and
passed along to the pages they point to.
● In the steady state, the total sum of all PageRank values across all pages is equal
to the initial sum, which is typically 1 (if normalized).
Convergence in PageRank:
● Convergence refers to the process where, after several iterations, the PageRank
scores stabilize.
● Initially, PageRank scores are assigned randomly or equally, but after applying the
algorithm iteratively, the scores converge to a final set of values.
● Convergence is reached when the PageRank values no longer change significantly
between iterations.
When we multiply a vector by a matrix repeatedly, we might notice that the result starts
to stay the same after some time. This means the vector settles into a steady state. This
steady state is called convergence.
1. The matrix A is stochastic (its entries are probabilities and each column sums to 1).
2. The matrix is irreducible, meaning you can get from any state (or page) to any
other state (or page).
3. The matrix is aperiodic, meaning there’s no fixed cycle or repeating pattern.
Example: PageRank
In PageRank (the algorithm Google uses to rank web pages), the web is represented as a
big matrix where each page is connected to other pages via links. We repeatedly multiply
a "rank vector" by this matrix, and the vector will eventually converge to a steady state.
● This steady state tells us how important each page is in the web. Once the vector
stops changing, we have the final PageRank for each page.
1. Start with an initial guess, like all pages having the same rank.
2. Multiply by the matrix repeatedly (this is like the random surfer moving around
the web).
3. After some iterations, the ranks stabilize, and that’s when we have convergence.
In simple terms, after enough multiplications, the system stops changing and gives us
the final result.
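The three steps can be sketched as a short power iteration. This version adds a damping factor (0.85 is a conventional choice) and spreads the rank of pages without outgoing links evenly; both are assumptions that go beyond the plain description above, and the function name is hypothetical.

```python
def pagerank(links, damping=0.85, tolerance=1e-10, max_iterations=200):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # step 1: equal initial guess
    for _ in range(max_iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        change = sum(abs(new_rank[page] - rank[page]) for page in pages)
        rank = new_rank  # step 2: one more multiplication
        if change < tolerance:
            break  # step 3: the ranks have stabilised
    return rank

web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
print(ranks)
```

The scores always sum to 1, and page C ranks above B because it receives links from both A and B.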
Why It Works
When you keep multiplying, the vector eventually "learns" how the web is connected
and stabilizes to a final set of values, which tells you the importance (or PageRank) of
each page. This is the converged vector.
● PageRank is a famous algorithm used by Google to rank web pages based on their
importance.
● It uses a graph model where web pages are represented as nodes and hyperlinks
as directed edges between these nodes.
● The PageRank algorithm can be described as a matrix operation, where the
matrix represents the structure of the web.
In the PageRank algorithm, the web is represented as a directed graph where each web
page is a node, and each hyperlink is a directed edge.
The link structure of the web can be represented by an adjacency matrix A of size n×n,
where n is the number of web pages. In this matrix:
However, a simple adjacency matrix may not be enough to apply PageRank directly. We
need to transform this matrix into a form that can be used in the PageRank calculation.
Week 9: Power Laws and Rich-Get-Richer Phenomena
Introduction to Power Law
In network terms:
A few nodes (people, websites, etc.) have lots of connections, while most nodes have only a
few.
🔸 Mathematical Form:
$P(k) \propto k^{-\gamma}$
Where:
🔍 Real-life Examples:
● A few websites (like Google, Facebook) get millions of visits, while most get very few.
● In social networks, some users have thousands of followers, but most have only a few.
📊 In Network Graphs:
● If you plot number of nodes vs. their degree (number of connections) on a log-log scale, a power
law appears as a straight line.
📌 Summary:
● A power law shows uneven distribution — few with a lot, many with little.
🔸 Example:
Imagine you:
● Roll a die 100 times and take the average. Repeat this many times, and the averages will be approximately normally distributed.
Same with:
● Heights of people
● Test scores
● Measurement errors
● IQ scores
These are all sums or averages of many tiny factors (genes, environment, skill, chance...), so they
naturally form a bell curve.
Biological & social traits: traits like height, weight, and IQ come from many small causes
📊 Shape:
The graph of a normal distribution is symmetric and bell-shaped, centered around the mean (average),
with most values close to the mean and fewer as you move away.
📌 Summary:
● Normal distributions appear because many random factors add up.
● Thanks to the Central Limit Theorem, the result often looks like a bell curve.
Over time, the web grows in a way that follows a power law distribution:
A few websites get millions of links, while most get only a few.
For example:
● A new blog is more likely to link to Google, Wikipedia, or YouTube than a random unknown site
● The more links a website already has, the more likely it is to get new ones.
This creates a rich-get-richer effect, and over time it forms a power law distribution.
● The graph of this data on a log-log scale becomes a straight line, showing a power law.
✅ Real-World Example:
Website      Approx. Inbound Links
Google       Millions
Wikipedia    Millions
📌 Summary:
Aspect Explanation
Link Distribution Few pages with lots of links, many with few
● For a network (like WWW, social network), count how many connections (degree) each node has.
2. Plot Degree Distribution
● Re-plot the same data using a log-log scale (log x-axis and log y-axis).
📈 If the points form a straight line on this plot, your data likely follows a power law.
● Use tests like the Kolmogorov–Smirnov (K-S) test to check how well the data fits a power law.
● Compare with other models (exponential, log-normal) to confirm it's not something else.
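The log-log check can be sketched with a simple least-squares fit: on a log-log scale, a power law with exponent γ appears as a line of slope -γ. This gives only a rough estimate and is not a substitute for the statistical tests mentioned above; the function names and synthetic data are assumptions.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Fraction of nodes having each degree (degree 0 is skipped)."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: count / total for k, count in counts.items() if k > 0}

def loglog_slope(distribution):
    """Least-squares slope of log(fraction) against log(degree)."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(p) for p in distribution.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

# Synthetic degrees in which the number of nodes falls off as 1 / k^2.
sample_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope = loglog_slope(degree_distribution(sample_degrees))
print(slope)
```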
🧠 Tools You Can Use:
Tool/Library Purpose
Things that already have a lot, tend to get even more over time.
In network terms:
Nodes (like websites or people) that already have many links or connections are more likely
to get even more connections.
📈 Also Called:
● Preferential Attachment
● Cumulative Advantage
4. This feedback loop leads to some nodes becoming very rich in connections
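The feedback loop can be sketched with a tiny growth simulation in which each new node links to one existing node chosen with probability proportional to its current number of links. The function names, the single link per new node, and the network size are assumptions made for illustration.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; return the list of edges."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in `endpoints` once per link it has, so a uniform
    # draw from this list picks a node in proportion to its degree.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges

def degrees(edges):
    """Count the links of every node."""
    result = {}
    for u, v in edges:
        result[u] = result.get(u, 0) + 1
        result[v] = result.get(v, 0) + 1
    return result

growth = preferential_attachment(1000)
degree_of = degrees(growth)
print(sorted(degree_of.values(), reverse=True)[:5])
```

Printing the largest degrees shows how unevenly the links end up being shared among nodes.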
Week 10: Power Law (Continued) and Epidemics
Rich Get Richer - A Possible Reason
This is called Preferential Attachment, a key idea behind the Barabási–Albert model of network growth.
🧠 Simple Example:
● Imagine a new user joins a social network.
📊 Real-Life Examples:
Domain Rich Get Richer Example
📌 Conclusion:
The “Rich Get Richer” effect is a natural result of how humans behave in networks — we tend to follow, link to, or
trust things that are already popular.
Epidemics - An Introduction
🧠 In Real Life:
● A virus spreads when infected people meet healthy ones.
So epidemics don’t just happen with diseases, they also apply to information, trends, and technology.
🔗 In Network Science:
● People are nodes.
● The structure of the network affects how fast and how far an epidemic spreads.
🔄 Epidemic Spread Models (Basics):
1. SIR Model – People move through three stages:
● Network density
✅ Summary:
Term Meaning
3. Each of them infects more people (generation 2), and so on...
Example:
Let’s say each person infects 2 people on average (R = 2):
Gen 0: 1 person
Gen 1: 2 people
Gen 2: 4 people
Gen 3: 8 people
Total = 1 + 2 + 4 + 8 = 15 people infected
If R = 0.5 (each person infects only 0.5 people on average), it might look like:
Gen 0: 1
Gen 1: 1
Gen 2: 0
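The arithmetic above can be reproduced with a few lines of code. The sketch computes the expected size of each generation, R to the power g, so for R = 0.5 it returns average values rather than one possible outcome such as 1, 1, 0. The function name is hypothetical.

```python
def expected_generations(r, generations):
    """Expected number of new infections in generations 0..generations."""
    return [r ** g for g in range(generations + 1)]

sizes = expected_generations(2, 3)
total = sum(sizes)
print(sizes, total)

# With R below 1 the generations shrink and the total stays bounded.
shrinking_total = sum(expected_generations(0.5, 50))
print(shrinking_total)
```

When R is below 1 the expected total never exceeds 1 / (1 - R), which is why such an outbreak dies out.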
👉 On average, how many people one infected person will infect in a fully healthy population.
✅ Why is R₀ important?
It helps us predict whether a disease will spread, slow down, or stop.
If R₀ = 0.5:
✅ Summary:
Term Meaning
👥 Population Groups:
● S (Susceptible): Healthy people who can get infected
● I (Infected): People who have the disease and can spread it
● R (Recovered): People who recovered or died — they don’t spread or catch the disease again
🔁 Flow:
S→I→R
✅ Used for:
● Diseases like measles, where people usually don’t get re-infected once recovered. COVID-19 is often modeled this way too, although reinfection does occur.
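The S → I → R flow can be simulated day by day. This is a minimal sketch under assumptions that are not in the notes: a well-mixed population, an infection rate beta and a recovery rate gamma (both chosen only for illustration), and one update per day.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Step the SIR model forward one day at a time.
    beta  = assumed infection rate per day
    gamma = assumed recovery rate per day"""
    s = population - initial_infected
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population   # S -> I
        new_recoveries = gamma * i                   # I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

history = simulate_sir(population=1000, initial_infected=1,
                       beta=0.5, gamma=0.25, days=120)
peak_day, (_, peak_infected, _) = max(enumerate(history),
                                      key=lambda item: item[1][1])
print(f"infections peak on day {peak_day} with about {peak_infected:.0f} people")
print(f"recovered at the end: about {history[-1][2]:.0f} people")
```

In this simple setup beta / gamma plays the role of R₀ (here 2), and the Recovered group only ever grows because nobody returns to Susceptible.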
👥 Population Groups:
● S (Susceptible)
● I (Infected)
🔁 Flow:
S → I → S (again)
After recovering, people go back to being susceptible, and can get infected again.
✅ Used for:
● Diseases like the common cold, flu, or some STDs, where people can get the disease multiple times.
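The S → I → S loop can be sketched the same way. The rates beta and gamma below are illustrative assumptions, not values from the notes, and the population is treated as well mixed.

```python
def simulate_sis(population, initial_infected, beta, gamma, days):
    """Step the SIS model forward one day at a time.
    Recovered people go straight back to Susceptible."""
    s = population - initial_infected
    i = float(initial_infected)
    history = [(s, i)]
    for _ in range(days):
        new_infections = beta * s * i / population   # S -> I
        new_recoveries = gamma * i                   # I -> S
        s += new_recoveries - new_infections
        i += new_infections - new_recoveries
        history.append((s, i))
    return history

history = simulate_sis(population=1000, initial_infected=1,
                       beta=0.5, gamma=0.25, days=200)
print(f"infected after 200 days: about {history[-1][1]:.0f} people")
```

Unlike SIR, the infection does not burn out here: with these assumed rates the number of infected people settles at a steady level (about half the population), because recovered people keep becoming susceptible again.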
📊 Quick Comparison:
Feature   SIR Model   SIS Model
Groups    S, I, R     S, I
📌 Summary:
● SIR: Once recovered, you're safe
📊 Quick Comparison:
Feature       SIR Model                                 SIS Model
Immunity      Yes (Recovered individuals are immune)    No (Individuals can get reinfected)
Reinfection   No                                        Yes
💬 “How likely is it that the spread can reach a large part of the network?”
● If the sponge has many open holes, water flows through easily.
● If too many holes are blocked, water can’t pass through — it stops.
In a similar way:
● If not enough people are connected → the spread stops or slows down.
🔁 In Networks:
● Nodes = people (or devices, pages, etc.)
● Percolation threshold = the minimum fraction of connections that must be present for the network to be “connected” enough for large-scale spreading.
✅ Key Terms:
Term Meaning
📌 Example:
● If only 30% of people are connected → maybe not enough to spread COVID-19.
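The threshold idea can be checked on a toy network. This sketch assumes a purely random network where every pair of people is connected with the same probability. That is a simplification, not a claim about real contact networks, and the function name largest_component_fraction is hypothetical.

```python
import random

def largest_component_fraction(n, avg_degree, seed=0):
    """Build a random network with roughly 'avg_degree' connections per
    person and return the share of people in the largest connected group."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)          # chance that any given pair is linked
    neighbors = {node: [] for node in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].append(v)
                neighbors[v].append(u)
    seen = set()
    largest = 0
    for start in range(n):
        if start in seen:
            continue
        # Explore everything reachable from 'start'.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        largest = max(largest, size)
    return largest / n

for avg_degree in (0.5, 1.0, 1.5, 3.0):
    share = largest_component_fraction(500, avg_degree)
    print(f"average connections = {avg_degree}: largest group = {share:.0%}")
```

In this kind of random network the jump is known to happen around an average of one connection per person: below it only small isolated groups exist, above it one large group appears that a spread can travel through.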
🧪 What He Did:
● He gave letters to random people in the U.S.
● They had to forward the letter to a specific target person (a stockbroker in Boston)…
📊 What He Found:
● Among the letters that arrived, it took about 6 steps (6 people) on average to reach the target.
● This led to the idea of “6 degrees of separation” — we are all connected through a short chain of
people.
“How small is the world, really? Can people find short paths to each other in a large network?”
His goal was to understand how social connections work, and whether people could navigate these networks
without knowing the full map.
● Short paths: You can reach any person in just a few steps.
This model shows how a network can be small, even when it’s large in size.
🌐 Decentralized Search
This means finding a target in a network using only local information — like Milgram’s participants.
People didn’t have a map of the full network — they only knew who they were connected to, and made
decisions based on that.
● Helps design efficient algorithms for search, routing, or message delivery in networks like social media or
the Internet.
📌 Summary Table:
Topic Meaning
Milgram’s Experiment Showed that people are connected by short chains (~6 steps)
Decentralized Search Finding paths using local knowledge (not full network map)
Week 12: Pseudocore (How to Go Viral on the Web)
🌐 Small World Networks: Introduction
A Small World Network is a type of network where:
● You can reach any node from any other in a few steps.
📌 Key Features:
● High clustering: Friends of your friends are likely your friends.
● Short average path: You can reach distant parts of the network in a few hops.
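The "few hops" effect can be seen in a small experiment. This sketch assumes a ring of people who each know their two nearest neighbors on either side, plus a handful of random long-range shortcuts. The construction and all function names are illustrative assumptions in the spirit of small-world models, not a specific model from the notes.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors on each side."""
    neighbors = {node: set() for node in range(n)}
    for node in range(n):
        for step in range(1, k + 1):
            other = (node + step) % n
            neighbors[node].add(other)
            neighbors[other].add(node)
    return neighbors

def add_shortcuts(neighbors, count, seed=0):
    """Add 'count' random links between nodes that are not yet linked."""
    rng = random.Random(seed)
    nodes = list(neighbors)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)
        if v not in neighbors[u]:
            neighbors[u].add(v)
            neighbors[v].add(u)
            added += 1
    return neighbors

def average_path_length(neighbors):
    """Average number of hops between all pairs, via breadth-first search."""
    total = 0
    pairs = 0
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in neighbors[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

plain = average_path_length(ring_lattice(200, 2))
small_world = average_path_length(add_shortcuts(ring_lattice(200, 2), 20))
print(f"ring only:           {plain:.1f} hops on average")
print(f"ring + 20 shortcuts: {small_world:.1f} hops on average")
```

The local ring links keep clustering high (neighbors share neighbors), while a few shortcuts are enough to cut the average number of hops.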
🔍 Myopic Search
Myopic Search is a greedy, local search method in a network.
🔧 How it works:
● A person/node doesn’t know the full network.
● It passes the message to the neighbor who seems closest to the target.
📦 Like delivering a letter by only asking your friends to forward it to someone they think is closer to
the recipient.
Efficiency: Myopic search may not always find the shortest path; optimal search always finds the shortest path.
Milgram’s experiment (6 degrees of separation) showed this: even with local decisions, people can
reach targets quickly.
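Myopic search can be compared with the true shortest path on a toy network. This sketch assumes people sit on a ring, know only their own contacts, and judge "closeness" by position on the ring. The network, the distance rule, and the function names are all illustrative assumptions.

```python
import random
from collections import deque

def ring_distance(a, b, n):
    """How far apart two positions are around a ring of n people."""
    d = abs(a - b)
    return min(d, n - d)

def build_ring_with_shortcuts(n, shortcuts, seed=0):
    """Ring where everyone knows both neighbors, plus random shortcuts."""
    rng = random.Random(seed)
    neighbors = {node: {(node - 1) % n, (node + 1) % n} for node in range(n)}
    for _ in range(shortcuts):
        u, v = rng.sample(range(n), 2)
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors

def myopic_search(neighbors, source, target):
    """Greedy forwarding: always pass to the contact closest to the target.
    Only local information (own contacts) is used at each step."""
    n = len(neighbors)
    path = [source]
    while path[-1] != target:
        current = path[-1]
        path.append(min(neighbors[current],
                        key=lambda nb: ring_distance(nb, target, n)))
    return path

def shortest_path_length(neighbors, source, target):
    """Optimal search: breadth-first search using the full network map."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nb in neighbors[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return None

network = build_ring_with_shortcuts(300, 40)
path = myopic_search(network, source=0, target=150)
print("myopic search hops: ", len(path) - 1)
print("shortest path hops: ", shortest_path_length(network, 0, 150))
```

The greedy route always arrives here because a ring neighbor is always one step closer, but it can take more hops than the shortest path, since it cannot see shortcuts that lie off its own route.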
🔹 PseudoCores: Introduction
In large networks, core nodes are very well connected and help in spreading messages fast.
➕ Enter PseudoCores:
● These are clusters of nodes that behave like cores but aren’t centrally located.
● Highly connected
🔸 Pseudo Core
● Not the actual central core, but acts like it in message spreading.
🧩 Summary Table:
Concept Description
Myopic vs Optimal Myopic = realistic; Optimal = shortest but needs full info
Key Nodes Nodes critical for flow; found using centrality measures