Coursera Algorithms On Graphs

This document collects resources for learning data structures and algorithms: websites with tutorials and visualizations (UPC, USFCA, OpenDSA); books on data structures, algorithms, and problem solving; YouTube channels covering data structures, dynamic programming, and coding interviews; online courses from platforms like Udemy, Udacity, Coursera, Stanford, and Princeton; and practice platforms like LeetCode, HackerRank, and TopCoder. It recommends learning the basics, then practicing with problems on these platforms while referring to books, tutorials, and videos for specific topics.

https://www.cs.upc.edu/~jordicf/Teaching/AP2/

https://www.cs.usfca.edu/~galles/visualization/DFS.html

Very important book:

https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/

https://opendsa-server.cs.vt.edu/ODSA/Books/

https://opendsa-server.cs.vt.edu/

YouTube channels:

Mohamed Shoshan (محمد شوشان): Data Structures (very important)

Dynamic Programming
Vivekanand Khyade - Algorithm Every Day

Udemy:

Master the Coding Interview: Data Structures + Algorithms

Mastering Data Structures & Algorithms using C and C++

Data Structure and Algorithms Analysis - Job Interview

Advanced Algorithms in Java

Data Structures and Algorithms: Deep Dive Using Java

Data Structure: Hashing


Salem Alelyani

New Baghdad Arabic

KMR script Arabic new


Logarithmic Time

Udacity >> Stanford >> Princeton

https://backtobackswe.com/pricing

https://leetcode.com/problems/sudoku-solver/

Algorithms and Data Structures - Part 1 (pluralsight.com)

https://www.topcoder.com/community/competitive-programming/tutorials

Algorithms with Attitude (YouTube channel)

zooce (YouTube channel)

Salem Alelyani (YouTube channel) Very important

faad coder (YouTube channel)

ahmed kamal (Data Structures course (Arabic))

Michael Muinos (YouTube coding interview problems)

Jenny's Lectures CS/IT NET&JRF (YouTube channel)

RobEdwards (YouTube channel)

WilliamFiset (YouTube channel) very important

Graph theory playlist

Adel Nasim (YouTube channel)

new Baghdad

KMR Script

code masry
https://www.cs.usfca.edu/~galles/visualization/QueueArray.html

http://dm.compsciclub.ru/app/quiz-touch-all-segments

Working with Graph Algorithms in Python

Arabic Topics >> Udacity >> Coursera

>> Book (Cracking >> Grokking >> problem solving >> Roberto reinfor + creativity + project >> Packt
basant

Narasimha Karumanchi >>

Kulkov >> Khan-academy

))

Course San + Ud + Stanford + Princeton >> Book problem solving

DS + alg Arabic + Udacity + Book

https://www.coursera.org/learn/algorithmic-toolbox/lecture/EagcP/welcome

Big-O

Above + Youtube channels + Books

+ HackerRank >> LeetCode >> ....

Greedy Algorithms:
Algorithms on Graphs
by University of California San Diego & National Research University Higher School of Economics

Timeline
Previous weeks
START
Week 1: Programming Assignment 1: Decomposition of Graphs, due Jul 12
Week 2: Programming Assignment 2: Decomposition of Graphs, due Jul 19
Week 3: Programming Assignment 3: Paths in Graphs, due Jul 26
Week 4: Programming Assignment 4: Paths in Graphs, due Aug 2
Week 5: Programming Assignment 5: Minimum Spanning Trees, due Aug 9
Week 6
END 08/25
Following weeks

Next Step
WEEK 1

Welcome
It'll take about 10 min. After you're done, continue on and try finishing ahead of schedule.

Start

Instructor's Note

Thanks for signing up for the Algorithms on Graphs class! We are excited that the class is beginning and look forward to interacting with you!
More

Week 1 information
Week 1

Estimated Time: 4h 13m

Decomposition of Graphs 1

Videos: 43 min left
Readings: 30 min left

Week 1: Decomposition of Graphs 1
REQUIRED, GRADE DUE Jul 12, 11:59 PM PDT
Programming Assignment 1: Decomposition of Graphs (3h)

Week 2 information
Week 2

Estimated Time: 3h 46m

Decomposition of Graphs 2

Videos: 36 min left
Readings: 10 min left

Week 2: Decomposition of Graphs 2
REQUIRED, GRADE DUE Jul 19, 11:59 PM PDT
Programming Assignment 2: Decomposition of Graphs (3h)

Week 3 information
Week 3

Estimated Time: 4h 15m

Paths in Graphs 1

Videos: 55 min left
Readings: 10 min left
Other: 10 min left

Week 3: Paths in Graphs 1
REQUIRED, GRADE DUE Jul 26, 11:59 PM PDT
Programming Assignment 3: Paths in Graphs (3h)

Week 4 information
Week 4

Estimated Time: 4h 45m

Paths in Graphs 2

Videos: 1h 25m left
Readings: 20 min left

Week 4: Paths in Graphs 2
REQUIRED, GRADE DUE Aug 2, 11:59 PM PDT
Programming Assignment 4: Paths in Graphs (3h)

Week 5 information
Week 5

Estimated Time: 4h 2m

Minimum Spanning Trees

Videos: 52 min left
Readings: 10 min left

Week 5: Minimum Spanning Trees
REQUIRED, GRADE DUE Aug 9, 11:59 PM PDT
Programming Assignment 5: Minimum Spanning Trees (3h)

Week 6 information
Week 6

Estimated Time: 33h 39m

Algorithms on Graphs
Week 1
Graph Basics
Hello everybody, welcome to the course on graph algorithms. In the first unit of this course, we're going to be talking about various algorithms to compute various sorts of decompositions of graphs. But to begin with, we're going to start with the very basics: what is a graph, what are graphs good for, and how do we interpret them and draw them? So to start with, a graph is an abstract idea, an abstraction that represents connections between objects. Graphs are useful both in computer science and elsewhere in the world, because they can be used to describe many important phenomena. For example, the Internet can often be thought of as a graph, where you've got various web pages and they're connected to each other by links. And this is a very, very high-level view of the Internet that pays no attention to the content of a given page and other things like that. On the other hand, something like Google's PageRank algorithm makes heavy use of this very high-level view of the Internet and the connectivity within it.
Another example is maps: you can think of a map as a graph where you have intersections that are connected by roads. And this is a view that would be very useful if you, say, wanted to plot a course between two locations. You need to find some connections, some path along this map that gets you from one place to the other. Social networks can also be interpreted as graphs. You have people connected by friendships or following relationships or whatever, and this provides another example of graph structure. You can also use graphs for more complicated things. If you have some sort of robot arm that can be in many possible configurations, what you could do is think of the configurations as vertices, where some of them are connected to each other by various motions. And then, if you want to solve problems like "how do I best reconfigure this robot arm from this position to this other position", you again want to consider this graph relating these configurations to each other in order to figure out how best to do that.
Okay, so these are the sorts of problems that we would like to be able to deal with using graph theory. So what is a graph? Well, the formal definition of an undirected graph (we'll talk about directed ones in a few lectures) is that it's a collection V of vertices, and then a collection E of edges, each of which connects a pair of vertices to each other. So when you draw them on paper, the vertices are usually drawn as points or circles or something, and the edges are lines or curves or something that connect pairs of these points. So for example, the graph drawn on the screen has four vertices labelled A, B, C, and D, and it also has four edges: one between A and B, one between A and C, one between A and D, and one between C and D. And to make sure that we're all on the same page: if we draw the following graph below, how many edges does it have? Well, the graph here has 12 edges. You can count them all, and each of the segments between two of these points gives us an edge. And so, there are 12 of them.
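To make the definition concrete, here is a small sketch (in Python, my own illustration rather than anything from the lecture) of the four-vertex example graph as a pair (V, E):

# The four-vertex example graph: V = {A, B, C, D},
# E = {A-B, A-C, A-D, C-D}. Edges are unordered pairs, stored
# here as frozensets so that {A, B} and {B, A} are the same edge.
V = {"A", "B", "C", "D"}
E = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]}

print(len(V), len(E))  # 4 vertices, 4 edges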
Now, there's one thing that fits into this definition of graphs that I gave that is a little bit weird perhaps, or unusual. One thing that you might have is loops: this would be an edge that connects a vertex to itself. So here we just have the one vertex A, and then a loop that connects A to itself. And there are some instances in which you might want to use this. Sometimes you actually do have a traffic circle that just goes in a loop or something like that. But this is only sometimes the sort of thing we're talking about. Also, sometimes you can have multiple edges between the same pair of vertices. So in this picture we have two vertices A and B and two edges between them. And if you're just asking, can I get from A to B? It doesn't matter if there's one edge or two between them. But sometimes having both of them is relevant. On the other hand, a lot of the time these loops and multiple edges are just a sort of complication that we don't really need to deal with. So oftentimes you work with what are known as simple graphs, which don't have loops and don't have multiple edges. And so, this is the basics of what a graph is that we've just covered. Next time we're going to talk about a couple of things. First, we're going to talk about how to represent graphs on a computer, and then we're going to talk a little bit about how we talk about runtimes for graph algorithms. So I hope this has been a useful introduction, and I'll see you at the next lecture when we talk about these other things.

Representing Graphs
Hello everybody, welcome back to our Graph Algorithms course. Today we're going to be talking about how to represent graphs on a computer. And then after that we are going to talk a little about graph algorithm runtimes, and how they depend on the graph you are using.

So the first thing is graph representations. So recall from last time: a graph is a thing that consists of a bunch of vertices, sometimes called nodes, and also a bunch of edges that connect pairs of these vertices.

What we're going to want to do is a lot of computations on graphs to determine various properties of them, but before we can get to any of that, we first need to talk about how to actually represent a graph inside a computer. And there are in fact several ways to do this, and exactly how we do it will affect the runtimes of some of these algorithms. So we're going to spend a little bit of time talking about that.
So, the first thing to do is, well, you want your vertices and you want your edges. A natural way to do this is to just store a giant list of edges. Each edge is a pair of vertices, so this is a big list of pairs of vertices. So now what we have is we've got these four vertices, A, B, C, and D, and then we just store a list: there is an edge between A and B, an edge between A and C, an edge between A and D, and an edge between C and D. And this is a perfectly good way to store this graph on a computer, but there are other ways of thinking about it. One of them is the following. Suppose that you have a simple graph. The only thing that really matters is which pairs of vertices have edges between them. And so, if I gave you a pair of vertices, the only thing that you need to know is: is there an edge between them or is there not? So what we can do is build a look-up table for this. We build a matrix where you have an entry that's 1 if there is an edge between that pair of vertices and a 0 if there's not.

So for example, in this graph there is an edge between A and D, and therefore the AD entry of this matrix is 1. There isn't an edge, however, between B and C, so the BC entry is 0. And so you just fill this in, and you get this nice 4 by 4 matrix of 0s and 1s. It is another way of representing this graph.
On the other hand, there is a third, maybe hybrid, way of looking at things. And the idea here is that each vertex in the graph is going to have a bunch of neighbors: other vertices that have edges from the one that we're considering to them.

And so another way of representing the graph is that for each vertex we just keep a list of its neighbors. So vertex A in this graph is adjacent to B, C, and D. B is adjacent to A. C is adjacent to A and D. D is adjacent to A and C. So for each of these vertices we store a list of all of its neighbors, and that's another way to represent the graph. Now, just to be sure we're on the same page: if we have the following graph, what are the neighbors of vertex C?
(Quiz: select all neighbors of vertex C. Correct answers: A, B, D, F, H, and I.)
Well, we just draw all the edges that leave C, and it connects to A, and B, and D, and F, and H, and I. And those are its neighbors. Okay, so we have these three ways of representing a graph. And it turns out the different basic operations you want to perform could be faster or slower depending on which representation you have.
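Here is a minimal sketch (Python, my own illustration) of all three representations for the same four-vertex graph, built from one another:

# Edge list: a flat list of vertex pairs.
edge_list = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]

# Adjacency matrix: entry [i][j] is 1 if there is an edge, 0 if not.
index = {"A": 0, "B": 1, "C": 2, "D": 3}
matrix = [[0] * 4 for _ in range(4)]
for u, v in edge_list:
    i, j = index[u], index[v]
    matrix[i][j] = matrix[j][i] = 1  # undirected, so the matrix is symmetric

# Adjacency list: for each vertex, the list of its neighbors.
adj = {name: [] for name in index}
for u, v in edge_list:
    adj[u].append(v)
    adj[v].append(u)

print(adj["A"])  # ['B', 'C', 'D']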
For example, if you want to determine if there's an edge between a specific pair of vertices, this is very fast in the adjacency matrix representation. All you have to do is check the appropriate entry of the matrix, see if it's 0 or 1, and that's your answer: constant time. However, when you have an edge list, the only thing that you can really do is scan the entire list of edges to see if that particular edge shows up. That takes time proportional to the number of edges. For the adjacency list, it's a little bit faster than that. What you do is you pick one end of the edge that you're looking for and then look through all of its neighbors. If you want to know whether A is adjacent to B, you look at the list of A's neighbors and see if B is on the list. And so the time there is proportional to the degree of the vertex: the number of neighbors that this one vertex has.

Now, another thing you might want to do is list all of the edges in the graph. And the adjacency matrix is terrible here, because the only thing you can do is scan through every entry of the matrix. Each 1 gives you an edge, but that takes time proportional to the number of vertices squared.

However, the edge list does this very easily. It just lists all the edges in order, which is exactly what it stores.

The adjacency list is about as good, because for each vertex you find all of its neighbors, those are a bunch of edges, and you just do this for every vertex. This actually counts each edge twice, because if there's an edge between A and B, you count both that A is a neighbor of B and that B is a neighbor of A. But it only counts them twice, so the time is still proportional to the number of edges.

Finally, if you want to list all the neighbors of a given vertex, the adjacency matrix is pretty slow here again, because the only thing you can really do is scan through that row of the matrix and find all the ones. In the edge list representation, you have to scan through all the edges in your graph to see which ones include the vertex you want. Whereas with the adjacency list, you just scan through the list of neighbors of the vertex, and it's very fast. Now, it turns out that for many of the problems that we'll be considering throughout most of the rest of this unit, what we really want is the adjacency list representation of our graph, just because for a lot of the operations that we need, we really want to be able to find the neighbors of a given vertex.
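To summarize the costs just described (with |V| vertices, |E| edges, and deg(v) the number of neighbors of vertex v):

Operation               Adjacency matrix   Edge list   Adjacency list
Is (u, v) an edge?      O(1)               O(|E|)      O(deg(u))
List all edges          O(|V|^2)           O(|E|)      O(|E|)
List neighbors of v     O(|V|)             O(|E|)      O(deg(v))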
Okay, so that's how we represent graphs. Let's spend a little bit of time talking about runtimes of graph algorithms.

So, when we talk about algorithm runtimes, it's generally a function of the size of the input. And for most of the algorithms we've seen so far, the input has one size parameter, n, and so you get runtimes like O(n squared) or O(n cubed), or something to that effect.

However, a graph has two measures of its size: the number of vertices and the number of edges. And so graph algorithm runtimes depend in some way on both of these quantities. You could have runtimes like O(|V| + |E|), which is generally considered to be linear time. But you can also have O(|V| times |E|), or |V| to the three-halves, or |V| log |V| + |E|. And there are many different possibilities for the types of runtimes that we can have. And an interesting question now is, again: if you only have one parameter, it's easy to compare an n log n runtime versus n squared versus n cubed. But if you want to say which one's faster when, say, one algorithm runs in time |V| to the three-halves and the other one runs in time |E|, it's not actually clear which one's faster.
And in fact, which algorithm is faster actually depends on which graph you're using. In particular, it depends on the density: how many edges you have in terms of the number of vertices. If you have many, many edges, O(|E|) is going to be worse. However, if you have few edges, it's going to be better. And in terms of density, there are two extremes to consider.

On the one hand, you have dense graphs. Here, the number of edges is roughly proportional to the square of the number of vertices, meaning that almost every pair of vertices, or some large fraction of the pairs of vertices, actually has an edge between them. For example, if you're in a band, you're going on tour, and you want to plot a route between a bunch of cities, well, there is actually some transportation option that would get you between basically any pair of cities on the map. Now, it might be indirect or take a while, but you should still probably plot all of these out in order to plan your tour; what will matter is not whether or not it's possible to get between two cities, but how hard it is to get between these cities. For this, you actually want a dense graph that records the all-pairs relations between these vertices.
On the other end of the spectrum, we have sparse graphs. Here the number of edges is small relative to the number of vertices, often as small as roughly equal to the number of vertices, or some small constant times the number of vertices. In a sparse graph, what you have instead is that each vertex has only a few edges coming out of it.

And this is actually very reasonable for a lot of models you want to think about. If you think of the Internet as a graph, well, there are billions of web pages on the Internet, but any given web page is only going to have links to a few dozen others. And so the degree of any given vertex, the number of neighbors that any given vertex has, is much smaller than the total number of vertices, so you end up with a very sparse graph. A lot of things like this, like the Internet, like social networks, like actual maps that you've drawn on a piece of paper, tend to be sparse graphs.
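As a rough illustration (my own sketch, not from the lecture; the numbers are assumed orders of magnitude), density can be measured as the fraction of possible edges that are actually present:

def density(num_vertices: int, num_edges: int) -> float:
    """Fraction of the V*(V-1)/2 possible edges that are present
    in a simple undirected graph."""
    possible = num_vertices * (num_vertices - 1) // 2
    return num_edges / possible

# A web-like sparse graph: a billion pages, a few dozen links each.
print(density(10**9, 15 * 10**9))   # about 3e-08: extremely sparse
print(density(100, 4000))           # about 0.8: dense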
Whether your graph is sparse or dense will affect the runtime of your algorithms, and may even help determine which algorithm you want to run.

Okay, so that's it for today. Now that the basics are out of the way, we're going to actually talk about some of the algorithms. In particular, we're going to talk about how to explore graphs and figure out which vertices can be reached from which others. So come back next time and we'll talk about that.

Slides and External References

Slides
09_graph_decomposition_1_basics.pdf (PDF file)
09_graph_decomposition_2_representations.pdf (PDF file)

Reading
Section 3.1 in [DPV]

If you find this lesson difficult to follow


Section on graph representation in the Algorithms class by Tom Cormen and Devin Balkcom at Khan Academy

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.
Exploring Undirected Graphs

Exploring Graphs
Hello everybody, welcome back to the Graph Algorithms course. Today we're going to talk about algorithms for exploring graphs, in particular, ways to tell us which vertices can be reached from which others. So for an example of the type of problem we're trying to solve here, suppose that you're playing a video game. And you've found the exit to some level, but you don't want to go there just yet. You want to first explore this whole level and make sure you've found all the secrets inside it, make sure you get all the loot and XP that there is to be found. And you can certainly wander around between the various rooms and find a bunch of passageways and so on. And you've been wandering around for a while and maybe haven't discovered anything new yet. But you'd like to make sure that you've really found everything before you leave. How do you accomplish this? How do you make sure that you've found everything? And this is the notion of exploring a graph. You've got these rooms, they're connected by passageways, and you want to make sure that everything, at least everything that's reachable from where you started, you can actually get to and find; you want to explore the entire region.
And this sort of related question is actually very useful in a number of applications. For example, if you have a map and you want to find a route between some location on the map and some other, it often depends on this sort of exploration of the graph, making sure that you can find some sequence of roads to follow to get you where you want to go. This could also be used if you're building a road system and want to ensure that the entire thing is connected. You need to have some sort of algorithm to tell what can be reached from what. And finally, it has more recreational uses, like the video game example, but also if you want to solve a maze. That is very much, well, can I get from the start to the finish in this graph that connects things? But also various puzzles: if you think of vertices as possible configurations, and then edges that describe moves that you can make, and you want to say, can I rearrange the puzzle to end up in this configuration? Again, this sort of exploration algorithm becomes very important.
So before we get into this too much, we really want to formalize what we mean. In particular, we want to know: what does it mean to be reachable from a given vertex? And basically, the idea is you start at the vertex and you're allowed to follow edges of the graph, and you want to see where you can end up. So to formalize this, we define a path in a graph G to be a sequence of vertices v0, v1, v2, etc., such that each vertex is connected to the next one by an edge of the graph. So we get some sequence of edges to follow.

So the formal description of the problem that we would like to solve is: given a graph G and a vertex s in G, find the collection of all vertices v in the graph such that there is a path from s to v, everything that we can reach from s. So, just to get a little bit of practice: if we have the following graph, which vertices here are reachable from A? Well, think about it a little bit and you'll find that A, C, D, F, H, and I are all reachable. And it's easy to see that these vertices do all connect up through edges, but you can't get to B, E, G, or J. And this is because there are no edges that connect any of the vertices we can reach to any of these other vertices. And so there's no way to escape, except to these six.

And this is the actual idea behind the algorithm. What you want to do is make sure that you can actually find everything that you can reach. And so what you do is you expand: you find a bunch of vertices, these are a bunch that I can reach. But then, if there are any edges that connect ones that you can reach to ones that you can't reach so far, well, you have to explore those edges, find the new vertices on the other end, and add them to the list that you know about. And you keep expanding this list of vertices that you know you can get to until you can't connect to anything new, and then you're done.

So to formalize this algorithm a little bit, we are going to keep a list of DiscoveredNodes. This starts out with just the vertex s that you're supposed to start at. But then, while there is an edge e that leaves this set of DiscoveredNodes, connecting something you have discovered to something you haven't discovered, you take the vertex at the other end of that edge and add it to your list of DiscoveredNodes. And you just keep doing this until there's nothing new to be found. And then you return this list of DiscoveredNodes.
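As a direct, unoptimized sketch of this procedure in Python (assuming `adj` is an adjacency-list dict like the one built earlier; the depth first version later in this lecture organizes the same search more efficiently):

def reach(adj, s):
    """Return the set of all vertices reachable from s, by repeatedly
    following any edge that leaves the discovered set, exactly as
    described above."""
    discovered = {s}
    grew = True
    while grew:
        grew = False
        for v in list(discovered):
            for w in adj[v]:
                if w not in discovered:
                    discovered.add(w)  # new vertex at the other end of an edge
                    grew = True
    return discovered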
Okay, that's a reasonable algorithm and it does work. But in order to really code this up, you need to do some work to handle the bookkeeping that's required. You need to keep track of which vertices you've discovered and which edges you've dealt with, which edges you've actually checked and which ones you haven't. You also need to decide in which order to explore new edges. If there are several possible edges to follow, which one do you follow next?

And so, what we're going to do now is talk about a specific way to implement this and deal with these bookkeeping issues.

The first thing that we need to do is keep track of which vertices we've already found. And for this, we're going to associate a boolean variable visited(v) to each vertex, which basically tells us whether we have visited it yet.

The next thing we need is this: most of the vertices that we've visited will already have had all of the edges relating to them checked, but some won't, and we need to keep track somewhere of the list of vertices that still have edges hanging off of them that might connect us to something new. Now, this list isn't going to appear explicitly in our program. It'll actually be hidden in the program stack, so this point is a little bit subtle. We'll see it once we introduce the algorithm.

The final thing is we need to decide in which order to follow new edges. And for this we are going to use what is known as the depth first order. What this means is we're going to start at our initial vertex and just start following a chain of edges, following some really long path forward until one of two things happens. One thing is we could stumble across a vertex that we have already seen before, in which case there was no reason to have followed that edge, and we'll just back up to where we were before.

The second thing that could happen is that we hit a dead end, and if we actually hit a dead end and can't go any further forward, then we back up. But once we back up, we don't just back all the way to the beginning. We back up one step and then try going forward again from that new vertex that we found.
Okay, so that's the basic idea. How do we implement this? Well, part of the beauty of this is that we have a very simple recursive algorithm. In Explore(v), the first thing you do is you set the visited marker of v to be true: we say we have visited it. Next, for each neighbor w of v, for each w that's adjacent to v, if w has not already been visited, we recursively explore w.
Okay, so this is a very compact program. Let's actually see what it does on an example. I should also mention that in order for this program to execute reasonably efficiently, we really want an adjacency list representation of our graph. That's because we have this for loop where we want to iterate over all of the neighbors of v in our graph. And for that, if you have an adjacency list, which gives you a list of all the neighbors of v, that's incredibly easy to do. If you don't have an adjacency list, on the other hand, this algorithm really isn't that efficient.
Okay, fine. So let's look at the example. Here's a graph; we're going to start by exploring that vertex. So we mark it as visited. We then check for unvisited neighbors, and hey, look, there is one. So we recursively explore that other vertex. We mark it as visited. We search for unvisited neighbors, and we have this one. So remember, now we're three layers into the program stack here. This is a subroutine of a subroutine. But now, when we're exploring this vertex, it has no unvisited neighbors. So after we've done a little bit of checking, we decide that we're done exploring this vertex and we pop the stack. This other vertex, we've now visited both of its neighbors, so we pop the stack back to the original Explore call. Now, this vertex does have some unvisited neighbors left, so let's visit one of them and explore that. Mark it as visited, find an unexplored neighbor, explore that. Mark it as visited, unexplored neighbor, mark it as visited, unexplored neighbor. Now when we explore this vertex, though, once again we're stuck. So we wrap up exploring it, pop a level up the stack, and go back to exploring this other vertex, which now actually does have another unvisited neighbor. So we're going to go visit that one. Now we've actually visited everything in the graph. So all we're going to do is, at each vertex, note that all of its neighbors have been visited. We're going to pop up the stack, get back to where we started, and conclude. So here we actually have found all these vertices. And in fact, if you look at it, we've actually figured out how to reach them all. If you look at the darker edges, the ones our algorithm actually followed when we ran it, these connect up to all the other vertices. And they actually tell us, if you follow these edges, they give you a unique path to any other vertex that we can reach.
Okay, so that's our algorithm. Let's talk about correctness. The theorem is that if all the vertices in our graph start as unvisited, then running Explore(v) marks as visited exactly the vertices that are reachable from v. And the proof here isn't so bad. The first thing to note is that we only ever explore things that are reachable from v. And that's because of the way our recursive calls work: we either start at v, or we explore a neighbor of a vertex that we've already explored. So any vertex that we end up exploring has to be a neighbor of a neighbor of a neighbor (and so on) of the original vertex. But that basically gives us a path and says that wherever we got to was reachable.

The next thing to note is that a vertex w is not marked as visited unless it has been explored, since the only way we mark things as visited is when we explore them.

But finally, we should note that if w gets explored, we then look at all the neighbors of w. And either those neighbors have already been visited, in which case it means they've been explored at some point, or we end up exploring them. So in other words, if w gets explored, all of its neighbors also get explored.

So to finish things, suppose that we have some vertex u that is reachable from v. That means that we've got a path from v going up to u. And if we actually explored everything along this path, we'd be done: we would have explored u at some point. So let's assume that we don't. Assume that w is actually the furthest thing along this path that we've explored. However, by what we had on the previous slide, if you explore a vertex, you also explore all of its neighbors. So the next vertex z along this path must also be explored. And so this is a contradiction. This says the only way this can work is if we actually explored every vertex along the path. But that means we've explored u, which is good.
Okay, so this Explore algorithm is actually really great if we just want to find the vertices that are reachable from a given one. But sometimes you want to do a little bit more. You actually want to find all the vertices of the graph G, not just those reachable from a given one. For this, we're going to use a slightly more complicated algorithm called depth first search. And what we do is the following: first we mark everything as unvisited. Then, for each vertex in the graph, if it has not yet been visited, we explore that vertex.
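In the same sketch as before, depth first search is just this outer loop around the explore function given earlier:

def dfs(adj):
    """Explore every vertex of the graph, not just those reachable
    from a single starting vertex. Uses explore() from the sketch above."""
    visited = {v: False for v in adj}
    for v in adj:
        if not visited[v]:
            explore(adj, v, visited)
    return visited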
To look at an example on this graph: we find an unvisited vertex, say that one, and we explore it. So we find a neighbor and another neighbor, and then we pop back up the stack. And then we find something adjacent to our original vertex and come back. And now we're done exploring that first vertex.

So now we look for a new vertex we've never visited before, say that one. We explore its neighbor, come back, and we're done exploring it. We find a new vertex that we haven't visited, that one. We explore its neighbor and its neighbor and then come back. And now that we've actually visited all the vertices, only now do we wrap up and conclude our algorithm.
So to analyze the runtime of this algorithm, we have to note a few things. Firstly, whenever we explore a vertex, we immediately mark it as being visited; that's the first thing that we do. The next thing to note is that no vertex ever gets explored if it's already been visited. In fact, if you look at every place we make an Explore call, even a recursive call, we always first check that it has not been visited, and only then do we explore it. And this means that no vertex gets explored more than once. In fact, each vertex gets explored exactly once, because the outer loop in the DFS explores every vertex if it hasn't already been visited.

So each vertex in our graph gets explored exactly once. But for each vertex, when we explore it, there's this inner loop where we have to check all the neighbors of that vertex. And so we have to do work for each neighbor of each vertex, which means work proportional to the total number of neighbors over all vertices. And that's proportional to the number of edges, because each edge connecting A and B says that A is a neighbor of B and that B is a neighbor of A, so it contributes two neighbors. But that only counts each edge twice, so the total amount of work is still O of the number of edges.

So in total we do O(1) work per vertex and O(1) work per edge, and the total runtime is O(|V| + |E|), a nice linear time algorithm.

So that is depth first search. Next time we're going to talk a little bit more about reachability in graphs and give some more applications of this algorithm. So, I'll see you then.

Connectivity
Hello everybody, welcome back to the Graph Algorithms course. Today we're going to be talking about notions of connectivity in a graph, in particular, connected components, and how to compute the connected components of an undirected graph.

So, to begin with, last time we talked about this notion of reachability: what does it mean to be able to reach another vertex from a given one? And what we'd like to understand now is what really classifies reachability in a graph. Not just from a single vertex; you want to understand which vertices are reachable from which others. And it turns out that for an undirected graph, there's actually a very clean characterization of this. In particular, for any graph G, we can always partition it into connected components such that two vertices v and w are reachable from each other if and only if they are in the same connected component. So we break this graph up into islands, where any two islands are completely unconnected from each other, but within any island, you can get from anywhere to anywhere else.

So just to make sure we're on the same page: if we have the following graph, how many connected components does it have? Well, this graph is going to have 4. If you look at these four pieces, within each of them everything is connected up, but there are no edges between any pair of them.

Okay, so let's see how the proof of this theorem works. Why can we always divide a graph into connected components? Basically, the idea is you need to show that reachability is an equivalence relation. That means three things. First, v is reachable from v, which is easy. Also, if v can be reached from w, then w can be reached from v, which just follows from taking the path in the opposite direction. And then there's the difficult one: if u is reachable from v and w is also reachable from v, then w is reachable from u. What this says is that if you take v and you take everything that it connects to, then in fact everything in that whole region connects to everything else in that whole region.

And that last step is not so hard once you write it down. The point is that you can reach u from v, which means there's a path from v to u. You can also find a path from v to w. And if you glue these paths together at v, then you have a path from u to w. So they can be reached from each other. Okay, so that completes the proof, but what we'd now like to do is do this algorithmically. Given a graph G, we'd like to find the connected components of this graph.
Okay, fair enough. Let's find an algorithm. Well, the basic idea is pretty simple. If you run explore(v), this finds the connected component of v. It finds everything you can reach from v, and that's the whole connected component. And you just need to repeat this to find all the other components.

It turns out you can do this with only a slight modification of the original depth first search algorithm that we saw. Now, in order to present the algorithm a little bit more cleanly, we're going to modify our goal a little bit. Instead of actually returning sets of vertices that are our connected components, what we're going to do is label the connected components: we're going to give everything in the first one a label of 1, everything in the second component a label of 2, and so on and so forth.
Okay, so how does this work? First we modify our Explore procedure: we visit the vertex, and we also make sure to visit all of its neighbors. But we also assign the vertex a number corresponding to its connected component, the cc number.

Now, the thing is that this variable cc is not going to change as we make these recursive calls to Explore. And so everything else that gets found through the same Explore call will all get assigned the same number.
Play video starting at 3 minutes 49 seconds and follow transcript3:49
However, when we actually run our depth first search, between different Explore calls, when we finish one vertex that we explored and go off to find a new vertex, we increment this number. And so in the second Explore that we do, everything gets a number one bigger.
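Putting this together as a sketch (same adjacency-list convention as the earlier snippets), using the component label itself as the visited marker:

def connected_components(adj):
    """Label each vertex with the number of its connected component."""
    cc = {v: None for v in adj}  # None means not yet visited

    def explore(v, label):
        cc[v] = label
        for w in adj[v]:
            if cc[w] is None:
                explore(w, label)  # same label for the whole component

    label = 1
    for v in adj:
        if cc[v] is None:
            explore(v, label)
            label += 1  # next Explore starts a new component
    return cc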
To see how this actually works: on this graph, our counter starts at 1. We find an unvisited vertex, say that one; it gets labeled with a 1. We keep exploring from that vertex and we find these other three; all of them get labeled with 1.

But now we are done exploring that first vertex, so we increment the counter and find a new unexplored vertex, say that one, which we now label 2. We keep exploring, and the other vertex that we find also gets labeled with a 2. But then we finish that Explore call. We now increment our counter to 3 and find a new vertex, this one. We explore it, and everything we find gets labeled with a 3. And now we're done. We've got three connected components: everything in the first one is labeled 1, everything in the second one is labeled 2, and everything in the third one is labeled 3. Which is exactly what we wanted.
Okay, to show that this is correct, we note that each new Explore call finds a full, new connected component, because it finds everything connected to that vertex v. And of course, because v had not been found yet, it was not in a connected component that we had already seen. Secondly, because of the way the outer loop in depth first search works, every vertex eventually gets found, because we eventually check every vertex to see if it's visited, and if not, we explore it.

Now, the runtime of this is basically just the runtime of depth first search. We only made cosmetic modifications to it. So the runtime is still O(|V| + |E|). Okay, so that's our discussion of how to compute connected components. Next time we're going to talk about some other applications of depth first search. So I'll be happy to see you in the next lecture.

Previsit and Postvisit Orderings

Hello everybody, welcome back to the Graph Algorithms course. Today, we're going to talk about previsit and postvisit orderings. These are some numbers that you can compute when running a depth first search that record some information about the order in which you visited vertices. And we're going to discuss a little about why these numbers might be important, but mostly they will turn out to be very useful in some algorithms we're going to discuss later.

Okay, so let's talk about depth first search, the algorithm that we came up with last time. It's a little bit weird as an algorithm, because, well, what happens when you run depth first search? It doesn't return anything. It does modify the state of some things: it marks vertices as visited or unvisited. But in the end it just ends up marking every vertex as being visited, and if all that you wanted to do is mark every vertex as visited, there are easier ways to do it.

On the other hand, the order in which we find these vertices, in a way that involves their connectivity, is actually very useful. For example, with a slight modification, keeping track of a little bit of data, we saw how to use depth first search to compute connected components. So in general, if we want to make depth first search useful, we need to keep track of some extra information about its execution.
So what we're going to do is augment some of its functions in order to store additional information.

For example, let's look at Explore. We're going to mark the vertex as visited, but then, before we do anything else, we're going to run some previsit function. This is just some extra work that we do, maybe to record some information, just as we've found this new vertex. Then we go into the loop over neighbors, exploring all the unexplored neighbors. And then, right before we're about to finish exploring v, we run some other postvisit block: something that we do right before we finish. So what are these functions going to be? They could be a number of things. We'll come up with an example shortly, but the idea is to augment our functions a little bit to keep track of some extra information.

So what sort of extra information do we want? Well, one thing that we might want is to keep track of what order we visit vertices in. And one way to do this is to have a clock. This clock keeps track of time; it ticks forward once every time we hit a previsit or postvisit for a vertex, every time we discover a new vertex for the first time or leave an old vertex for the last time. And every time we do one of these things, we'll also record the time at which it happens. So for each vertex, we will have a record of the previsit time and the postvisit time. To see what I mean by this, let's look again at the example.

The clock starts at 1. We visit our first vertex; it gets assigned previsit number 1. From there, we explore the second vertex, which gets previsit 2, and then a third vertex, which gets previsit 3. From there, well, we're done exploring that vertex; all of its neighbors have already been visited. So we assign it postvisit 4 and move on; this other vertex gets postvisit 5. We now have a new vertex to explore; it gets previsit 6 and postvisit 7. And our original vertex gets postvisit 8, and that wraps up our first Explore.

We now find a new vertex, which gets previsit 9, and its neighbor previsit 10. And then they get postvisit numbers 11 and 12. Finally, we've got a new vertex with previsit 13, its neighbors get 14 and 15, and they get postvisits 16, 17, and 18 as we wrap them up. And we are now done, having assigned the numbers 1 through 18 as the previsit and postvisit numbers of these 9 different vertices. So the way we compute this is actually pretty easy.

We just have to initialize our clock to 1 at the beginning of our depth first search. Then, in the previsit block of our Explore, we set the previsit number of v to the clock value and increment the clock. And in the postvisit block, we set the postvisit number of the vertex to the clock value and increment the clock. Very easy; it doesn't really change the runtime of anything, but it allows us to compute these numbers.
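In the running Python sketch, the clock and the previsit/postvisit blocks look like this:

def dfs_with_orders(adj):
    """Depth first search that records previsit/postvisit clock values."""
    visited = {v: False for v in adj}
    pre, post = {}, {}
    clock = 1

    def explore(v):
        nonlocal clock
        visited[v] = True
        pre[v] = clock      # previsit block
        clock += 1
        for w in adj[v]:
            if not visited[w]:
                explore(w)
        post[v] = clock     # postvisit block
        clock += 1

    for v in adj:
        if not visited[v]:
            explore(v)
    return pre, post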
Now, what are these numbers useful for? Well, really, the previsit and postvisit numbers tell us something about the execution of the depth first search. And we'll see a little bit of this as we prove the following lemma. Suppose that you have any two vertices u and v. From these we get two intervals, pre(u) to post(u) and pre(v) to post(v). And the claim is that these intervals are either nested or disjoint. In particular, whether they are nested or disjoint will actually tell us something useful about the way in which the depth first search ran on these two vertices.
So let's take a look at what we mean by this. If you have two intervals, there are a few ways in which they can intersect with each other. Firstly, it could be the case that one interval is strictly contained in the other one; this means that they're nested. It could also be that they're disjoint from each other, that they don't overlap at all. Finally, we could have two intervals that are interleaved: they overlap over part of their ends, but neither contains the other. And what we're saying in this lemma is that the interleaved case is actually not possible. So let's see the proof. We first assume, without loss of generality, that the first visit to u happens before the first visit to v.
Play video starting at 5 minutes 46 seconds and follow transcript5:46
We now have two cases to consider. First thing that we find v for the first time in the midst of
exploring u. And this really is an indepth research tree that we produced by this exploration. We
found v as a descendant of u, we found v while exploring u.
Play video starting at 6 minutes 6 seconds and follow transcript6:06
The second thing is we could find v after we're done exploring u. So it's sort of v and u are different
branches of the tree, they're cousins of each other.
Play video starting at 6 minutes 15 seconds and follow transcript6:15
So let's look at the first case. If we explore v while we're in the middle of exploring u, then explore v is
actually being run as a subroutine of explore u. And because of the way subroutines work we can't
finish exploring uuntil we are finished exploring v. And therefore the post of u is bigger than the post
of v and these two intervals are nested.
Play video starting at 6 minutes 38 seconds and follow transcript6:38
In the second case, we explore v after we're done exploring u. So we start exploring u and then we
finish exploring u, and sometime later we start exploring v and then finish exploring v. And so
these intervals are going to be disjoint.
And so these are the only cases: the intervals are nested or they're disjoint. And which case we're in
will actually tell us something useful about the order in which we visited the vertices of this graph.
Now, just to review, we've got these two tables of pre and post numbers, but only one of them
is a possibly valid table of pre and post numbers. Which one is not valid? Well, the one on the
right can't be correct, because the two intervals from pre to post for vertices A and B are
interleaved, and that can't happen. So these pre and post orders will see a lot more usefulness
later, but in the next lecture, we're going to introduce something new. We've been
talking a lot about undirected graphs. Here, we're going to be talking about what happens when the
edges actually have an orientation to them. So please come back next time, and we can talk about
that.

Slides and External References

Slides
09_graph_decomposition_3_explore.pdf
09_graph_decomposition_5_pre-and-post-orders.pdf
09_graph_decomposition_4_connectivity.pdf

Reading
Section 3.2 in [DPV]

Visualizations
Depth-first search by David Galles

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Programming Assignment 1: Decomposition of Graphs
Week 2
Decomposition of Graphs 2
https://www.cs.upc.edu/~jordicf/Teaching/AP2/pdf4/09_Graphs_Connectivity-2x2.pdf

https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/GraphTraversal.html
Directed Graph
Directed Acyclic Graphs
Hello, everybody. Welcome back to the graph algorithms course. Today, we're going to start talking
about directed graphs versus undirected, in particular, talk about directed acyclic graphs and some of
their properties. So what's the motivation for this? The point is that sometimes we want to talk about
the edges of a graph that have a direction.
This is just because I mean, sometimes, like pairs of things are related in a way, but the relation isn't
symmetric. One of them comes before the other, or it's a one way viewership or something like that.
And so we define a directed graph to be a graph, where each edge has sort of a designated start end
and an end end. So what are examples where you might want to use this? For example, if you've got a
city map where lots of your streets are one way roads, the direction that the road's pointing is
actually very relevant if you want to navigate the city. You can't just follow any series of roads you like
because you'll be driving down some one-way streets in the wrong direction, so you really do need to
keep track of this orientation if you want to navigate the city. But then also some of the examples that
we gave, links between webpages, the web graph is probably a directed graph, because usually, if
you've got two webpages and A links to B, B probably doesn't link back to A.
Similarly, if you have followers on a social network, it depends on the social network. I mean, in
Facebook, friendships are symmetric. So they're sort of two-directional. That might be an undirected
graph. But on lots of them, you can follow somebody without them following you, and so then you
end up with wanting to have a directed relation for when someone's following someone else.
A final example that we'll look at in some more detail today are sort of dependencies between tasks.
So, we have this directed graph, and we've already built up a lot of this theory that works for
undirected graphs with this sort of exploring DFS algorithms. But most of this sort of actually still
holds for directed graphs. We can still run DFS on a directed graph. The slight modification now is that
when we run our explores, we only want to check for directed edges out of v. So when we say for all
neighbors w of v, if w has not been visited, etc, we really want to say neighbors where v points to w,
not the other way around. And what this will do is it means that when we explore v, we find things
that are actually reachable from v, using the edges in the direction that they're intended. So we're
only sort of allowed to follow these one-way roads in their appropriate direction.
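A minimal sketch of this directed version of explore, again assuming an adjacency-dict representation where adj[v] lists only the edges out of v (the function name is hypothetical):

def explore_directed(adj, v, visited=None):
    """Visit everything reachable from v following edges only in
    their intended direction (adj[v] lists the edges out of v)."""
    if visited is None:
        visited = set()
    visited.add(v)
    for w in adj[v]:
        if w not in visited:
            explore_directed(adj, w, visited)
    return visited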
Now using this new depth first search, we can still compute pre- and post-orderings. They still have
many of the same properties. The algorithm for DFS still runs in linear time. Basically, everything's the
same. The context is now a little bit more general.
Okay, so let's look at a sort of specific example, where directed graphs are important. In particular,
suppose that we have the following morning routine. We've gotta do a bunch of things every
morning. We need to wake up, we need to get dressed, we need to eat breakfast, we need to go to
work, we need to shower, all of these things.
We need to do these things in some order, but we can't do them in any old order. Because well, we
need to wake up before we get showered. And we need to dress before we go to work. And we need
to eat breakfast before we go to work, and all of this stuff. And one way of representing these sorts of
dependencies is by a directed graph.
If we need to do A before we can do B, then we draw a directed edge from A pointing to B. And so,
this gives us some sort of dependency relation. And it's not just these sort of trivial examples of how
do I get dressed in the morning. But if you've got some sort of complicated system of libraries, where
some of them require other ones in order to work, you can end up with a similar graph of
dependencies, which you actually do need similar techniques to handle.
Okay, so what do we do when we have these dependencies? Well, one of the things that we'd like to
do is find an ordering of the tasks that respects these dependencies. For example, suppose that we woke up at 7 o'clock and then
showered at 7:05, got dressed at 7:15, had breakfast at 7:20, and went to work at 7:30. This puts all of
our events in some order. And you'll note that this order respects all of our dependencies. We wake
up before we shower, before we get dressed, before we go to work. And we eat breakfast before we
go to work, and everything works out nice.
And so, if we have one of these dependency graphs, we'd like to linearly order the vertices to respect
these dependencies.
Now one thing to ask is, is it always possible to do this? And it turns out the answer is no. The sort of
best counter example is this following chicken and egg problem. I mean, the point is that you need a
chicken in order to lay eggs, and you need eggs to hatch chickens. And so you can't like put them in
some order where one of them comes first.
If you put chickens first, then the edge from eggs back to chickens points in the wrong
direction. If you put eggs first, the edge from chickens back to eggs points in the wrong direction.
Without someplace to have gotten started, there's no way you can get this ordering going.
In fact, in general, there's a slightly more complicated way in which this can fail to happen. Suppose
your graph has any cycle, where a cycle is a sequence of vertices v1, v2, up to vn, such that each one
connects to the next using a directed edge. So you've got a bunch of vertices arranged in a circle,
each connecting to the next one all the way around.
And the theorem is that if G contains a cycle, it cannot be linearly ordered.
Okay, so let's take a look at the proof here. Suppose the graph has a cycle, v1 through vn, everything
connected up in order. And suppose that, additionally, we can linearly order this graph.
Well, if you linearly order these things, there are finitely many, so one of them needs to come first.
Suppose that vk comes first in this order. But now we're putting vk before vk-1, and vk-1 points to it,
so we have an edge pointing in the wrong direction, which gives us a contradiction. So, if the graph
has a cycle, it cannot be linearly ordered.
So, what this means is that in order to be linearly orderable, you need to be what's known as a
directed acyclic graph or DAG. And this is just a name for a directed graph that has no cycles, fair
enough. Now, by the above theorem, it is necessary to be a DAG in order to linearly order. But one
question we should ask ourselves perhaps is, is it sufficient? Can we necessarily linearly order it if it's a
DAG? Well, okay, this is a question to ask a little bit later, but for now, let's just review. We have the
following four graphs. Note that the edges here are sort of the same, except for their orientations.
And which one of these graphs is a DAG?
Well, it turns out that only A is. B has the cycle noted in red, and C has this other cycle noted in red.
But if you work out A, you can actually see that it does not have any cycles to it. Okay, but the
question that we were posing was, is it the case that any DAG can be linearly ordered? And the
theorem that we'll actually show is that yes, if you have any directed acyclic graph, you can linearly
order it. And next time, what we're going to do is we're going to prove this theorem. But not just that.
In addition to proving this theorem, we're actually going to make it algorithmic. We're going to come
up with an algorithm that, given a DAG, actually produces this linear order. So that's what you have to
look forward to next time, and I'll see you then.

Topological Sort
Hello everybody, welcome back. 
Today, we're going to be talking about the algorithm of a topological sort. 
And we're going to talk about this, we're going to show in fact that 
any DAG can be linearly ordered, and we're going to show you how to do it.
So remember from last time, we were talking about directed graphs, and in particular we wanted to
be able to linearly order the vertices of such a graph, so that every edge points from a vertex with a
smaller index under this linear ordering to one with a larger index.
Now we know that in order for 
this to be possible, our graph needed to have no cycles, it needed to be a DAG. 
And today, we're going to show that this is actually sufficient.

So one way to look at this is if you have this linear ordering, let's think about which vertex comes last
in this order.
Well if it comes last it can have edges pointing to it, but it can have no edges pointing out of it.
And so this gives motivation for a useful definition. We say that a source is a vertex in a graph that
has no incoming edges. So a source can have as many edges as it likes going outwards, but it can't
have any going inwards.
Similarly, a sink is a vertex with no outgoing edges. It can have a bajillion edges coming into it, but
nothing can escape.
So to be clear that we're on the same page, the following graph has nine vertices, how many of them
are sinks?
Well, the following three highlighted vertices are sinks. They each have no edges coming out, but
every other vertex in the graph has at least one edge leaving it.
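As a quick illustration, here is a tiny hypothetical helper that finds the sinks of a directed graph given as an adjacency dict (the representation and names are assumptions of this example):

def sinks(adj):
    """Return the vertices with no outgoing edges."""
    return [v for v, out in adj.items() if not out]

# A hypothetical 3-vertex graph a -> b -> c: only c is a sink.
print(sinks({'a': ['b'], 'b': ['c'], 'c': []}))  # prints ['c']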

So here's the basic idea for how we're going to produce our linear order. We're going to first find a
sink in the graph. There needs to be a sink, because there needs to be something at the end of the
order. And when we have a sink though, we're perfectly fine putting it at the end of our ordering. And
the reason for this is well, it's got edges that point into it, but as long as the vertex comes at the end,
everything pointing to it is coming from before it.
So once we put it at the end of the order, it's sort of dealt with and we just need to order the rest of
the graph.
So what we do is remove that vertex from the graph and repeat this process.
So to see what this means, we have this graph on five vertices. D is a sink so we put it at the end of
our ordering and remove from the graph. Now we find another sink, say C. Put it at the end and
remove from the graph. E is next, goes at the end, then B, then A. And finally we have this ordering, A,
B, E, C, D. And this it turns out, is consistent with the ordering that we had in our original graph.
Now this is all well and good, but it depends on us being able to find a sink. And before we even ask
that, we should ask how do we even know that there is a sink in our graph.
And it turns out that, for a DAG at least, there's an easy way to show that there is, and the idea is the
following. What we're going to do is just start at some vertex v1 and we're just going to follow a path.
We're going to follow directed edges forward and just keep going, finding vertices v2, v3, v4, etc, and
just keep going. Now eventually one of two things will happen.
Either we could hit a dead end, we could end up at some vertex with just no outgoing edges, we just
can't extend this path anymore. And if that's the case, we found a sink. We found a vertex with no
outgoing edges.
The second possibility, though, is that maybe this path just keeps going forever. But if it does, since
there are only finitely many vertices in our graph, eventually we'll have to repeat something; we'll
have to visit some vertex for the second time. But when we do, it means there was a path that started
at that vertex, went forward a bunch, and eventually came back to itself. That means we have a cycle,
and in a DAG that can't ever happen. So let's take a look at the algorithm. What we're going to do is,
while G is non-empty, we're going to follow a path until we can't extend it any further.
That vertex at the end is going to be a sink which we'll call v.
We put it at the very, very end of our ordering and then remove it from the graph, and then
we just repeat this. So on this graph here, we start at A and follow a path A, B, C, D. Now we're
stuck. D is a sink, so we remove it from the graph. We now follow a new path, A, B, C. C is a sink,
we remove it from the graph. Then A, E: E is a sink, we remove it. Then the path A, B: B is a sink, so
we remove it. A is now itself a sink, so we remove it, and now we're done. We have our order. Now,
what's the running time of this algorithm? We need to compute one path for each vertex in the
graph, so O(V) many paths. And each path could be pretty long; it could include up to all the vertices
in the graph, so it could take O(V) time.
So the final running time of this algorithm is O(V²), which is not great. On the other hand, we're
doing something a little bit inefficient with this algorithm. We started with a vertex, we followed
this huge long path until we found the sink, and then we removed it from the graph.
Then what we probably did was we started that same vertex again, followed basically the same path
until we got almost to the very end, until we get to right before that vertex that we removed.
And we're sort of repeating all of this work. Why do we need to follow this whole path again? We
could just back up one step along the path and then keep going from there, and so that's what we're
going to do. Instead of retracing our entire path, we're just going to back up one step along our path.
So to see what this does in our example, we go A, B, C, D. D is a sink, which we can remove. But now,
instead of starting back at A, we back up to C, which is now itself a sink, and then to B, which is now a
sink as well. From A we have to follow the path to E to find the next sink, and then we're done. This
algorithm retraces our steps a lot less, and so it's a lot more efficient.
In fact, if we think about what this algorithm does, it's basically just a depth-first search. We start at a
vertex and follow a path until we can't anymore. Then, once we're stuck, we turn around and stop
using that vertex again.
Then, from one step previous, we just keep going forward again. This is exactly the depth-first
search ordering. And in particular, whenever we finish the postvisit block at a vertex, we put it at the
end of our ordering. So the order in which we're sorting our vertices is just based on the postorder. In
particular, vertices with small postorder go at the end of our ordering, and ones with large postorder
go at the beginning.

So the algorithm is now super easy. We run depth-first search on the graph, and then we sort the
vertices based on the reverse postorder. And note, we really don't actually have to do any sorting
here; we just have to remember the order in which we finished our vertices.
So that's our algorithm; now let's take a look at the correctness. Basically, what we're saying is that if
G is a DAG, then we can order its vertices by reverse postorder. Now, for this to be consistent with our
graph ordering, it needs to be the case that whenever we have an edge from u to v, u comes before v,
which means that the post of u is bigger than the post of v. Let's prove this. There are three cases to
consider. First, we could explore v before we explore u.
Next, we could explore v while we're exploring u.
And finally, we could explore v after we're done exploring u. Of course, this last case can't happen:
there is an edge from u to v, which means that if we haven't already visited v, we will visit it as part
of our exploration of u.
But the first two cases are still possible, so let's take a look.
Firstly, if we explore v before we explore u, well, it turns out that we can't reach u from v, and this is
because we have a DAG. If there were a path from v to u, then together with the edge from u back to
v, that would give us a cycle, which we can't have. So it must be the case, therefore, that we can't
discover u as part of exploring v. We have to finish exploring v before we can even start exploring u,
and that tells us the post of u is bigger than the post of v, which is what we wanted to show in this case.
Now, in the second case, the analysis is a little bit different. If we explore v while exploring u, remember
that this means that our explore of v is a subroutine of exploring u, and therefore it needs to finish up
first. And that, again, tells us that the post of u is bigger than the post of v.
So that completes the proof of our theorem and shows that this algorithm for topological sort actually
works. So we've now got a nice algorithm for topological sorting. Next time, we are going to talk about
something a little bit different: connectivity in digraphs. So I'll see you then.

Strongly Connected Components


Hello everybody, welcome back to our graph algorithms course. Today we're going to be talking about
notions of connectivity in directed graphs which it turns out will be a little bit more complicated than
those in the undirected case. And we'll talk about various notions then get to this idea of a strongly
connected component.
So in undirected graphs, connectivity is a pretty simple notion. We have these connected components
where any two things in the same component you can get from one to the other. And if you've got
two different components, they've got no edges between them. Nothing connects at all.
In directed graphs the story is a bit more complicated.
So let's consider the following example graph. Now in one sense this graph is connected, you can't
separate it out into two islands such that you can't reach one island from the other.
On the other hand if you want to look at which vertices can be reached from which others, the story is
a bit more complicated.
From vertex D, it turns out, you can reach every other vertex in this graph. However, D is a source:
once you leave D, there's no way to get back to it; there are just no edges that come into it.
Now, F is sort of the opposite here. From every vertex in the graph you can reach F, but once again,
once you do, you can't get back. Then again, G and A: neither of them can be reached from the other.
H and I: you can get between them, but once you leave, again, you can't get back. It's a lot more
complicated if we want to figure out which things are reachable from which others.



Now, there are actually a bunch of possible notions of connectivity that show up in directed graphs.
The first one is sort of that these two points can't be separated from each other, that you can't put
them on two different islands where nothing in one island connects to anything in the other.
And this first notion says that you can get from one vertex to the other following edges in any
direction.
And I mean, okay, this says you can't put them on different islands, but it doesn't say you can reach
either from the other one without breaking traffic laws by following one-way streets in the wrong
direction.
So, a second notion is that maybe we want one of the vertices to be reachable from the other using
the actual edges and their intended direction. And this is a more practical notion. However, it has its
weird irregularities that make it hard to deal with. So a third notion, that's maybe a little bit stronger, is
that you should have two vertices, v and w, where not only can you reach w from v, but you can also
go back and reach v from w.
And this third notion turns out to be pretty nice.
So when we say that two vertices v and w in a directed graph are connected, we mean this. We mean
that you can reach v from w, and also can reach w from v. But the theorem now is that using this
notion we actually recover much of the power that we had in the undirected case. A directed graph
can always be partitioned into strongly connected components where two vertices are in the same
strongly connected component, if and only if they are connected to each other.
So, for example, the graph that we looked at has five strongly connected components. F you can reach
from everywhere, but there's no way to get from it to anything else. H and I, you can get from one to
the other, but not necessarily to other things. A, B, C and E: from any of those four vertices you can
reach any of the others, but once you leave, again, you can't get back.
D and G are each their own components and that's it. Within each of these components you can get
from everywhere to anywhere else. But again once you leave it, you can't come back.
So to make sure we are on the same page, we have the following graph. What is the strongly connected
component of A in this graph?
Well, it turns out to be the following set: A, B, E, F, G, and H are in this component; the other vertices
aren't. To see this, notice that you can actually get around among any of these six. A goes to E goes to
H goes to F goes to B and back. B goes to G goes to H goes to F and back to B. You can use these to
navigate around everywhere you like. But D isn't in the component, because from the component you
can't get back to D. And C and I, once you get there, you can't get back to the other vertices here.
So the result is that if you have a directed graph, it can always be partitioned into strongly connected
components, such that two vertices are connected if and only if they're in the same component.
And the proof of this is very similar to the proof of the sort of similar thing in the undirected case.
Again, we need to show that connectivity in the strong sense is an equivalence relation. That if u is
connected to v, and v is connected to w, then u is connected to w.
Well, if these connections hold, there's a path from u to v and a path from v to w. And so, putting
them together, there's a path from u to w.
Similarly, there's a path from w back to v and from v back to u. And composing them again gives you
a path from w back to u. And so that completes our proof.
Now there's something more to say though. Once we've split our graph into connected components,
they still have edges connecting these components to each other.
So one useful thing to do is to draw what's known as the metagraph, which tells us how these
connected components connect to each other. The metagraph has vertices that correspond to the
strongly connected components of the original graph, and you have an edge between two of them if
there's an edge in that direction connecting vertices of those components. So D connects to A, so
there's an edge between D and the component of A. It also connects to G. A, B, C and E have edges
leading to H and to F, so their component has edges into those components, and so on and so forth.
And this is our metagraph.
Now, one thing to note about this metagraph is that if you stare at it for a little while, you can actually
see that it's a DAG, and this is in fact no coincidence. It turns out that the metagraph of any graph G is
always a DAG.
Now, the proof of this is not so hard. Suppose that it's not a DAG. What that means is that there's a cycle.
And what happens if you have a cycle is that every node in the cycle can be reached from every other
node, because you just travel along the cycle until you get there.
Now, this would be true if they were individual vertices, but it's also true if you have a cycle between
strongly connected components. Since everything in one component connects to everything else in it,
you can get to the vertex leading to the next component, then to the vertex leading to the component
after that, and so on all the way around the cycle. And so everything in all of those components is
connected to everything else. But since everything is connected, they should all be in the same
strongly connected component. However, the individual vertices of the metagraph are strongly
connected components, so you can't have two of them that are connected to each other, and that
gives us a contradiction.
Okay, so in summary, we can always partition our vertices into strongly connected components.
We have a metagraph that describes how these strongly connected components connect to each
other, and this metagraph is always a DAG.
So, this is what we have. Next time what we're going to do is we're going to actually talk about how to
compute these things. How do we compute the strongly connected components of the graph, how do
we find the metagraph? So, come back for the next Lecture and we'll talk about that.

Computing Strongly Connected Components


Hello, everybody. 
Welcome back to our Graph Algorithms course. 
Today, we're going to talk about how to get an algorithm 
to efficiently compute the strongly connected components of a directed graph.
So if you recall from last time, we defined this notion of connectivity in directed graphs, where two
vertices are connected if you can get from one to the other and back.
Now, we said this graph necessarily would be divided into strongly connected 
components, where within a component, you could get from everything 
to everything else, but sort of once you leave the component you can't get back.
Now, these components are connected to each other by what we call the metagraph. And the
metagraph is always a DAG, which is a useful thing, as we'll see today.
So the problem we're going to look at today is, given a graph, G, a directed graph, G, how do we find
the strongly connected components of G?
Now there's a pretty easy algorithm for this, it turns out. For each vertex v, run explore on v and
determine all the vertices reachable from v. Once you've done that for every vertex, then for a vertex
v, what you want to do is find all the vertices u that are both reachable from v and can also reach v.
That, it turns out, gives you the strongly connected component of v. You run this for all v, and that
gives you all the strongly connected components. The runtime of this algorithm is a little bit slow,
because you need to explore from every single starting vertex, so the running time is something like
O(V² + V·E).
This is okay but we'd like to find something faster.
And what's the idea? The key idea of this algorithm is the following. If you take a vertex v and run
explore, you find everything you can reach from v. Now, this includes the component of v.
But if there are other components downstream of that, you might find vertices from other connecting
components as well.
However, there's a case when this doesn't happen: if v is located in a sink strongly connected
component, that means there are no edges out of that strongly connected component. So when you
explore from v, you will find exactly its strongly connected component. Which is good, because we
want to find the strongly connected components, so if you actually find one, that's a good start.
So how do we do this? Well, we need to find a vertex in a sink strongly connected component, which
takes some thought.
Well, there's a theorem, it turns out, that will help. If C and C prime are two strongly connected
components where there's an edge from some vertex of C to some vertex of C prime, it turns out that
the largest post number in C is larger than the largest post number in C prime.
Now to prove this, we split into two cases. When we run our depth first search, either we visit a
vertex in C before we visit a vertex in C prime, or it could be the opposite way, visit a vertex in C prime
before we visit a vertex of C.
In the first case, where we visit C first, well, from a vertex in C you can reach everything else in C,
because it is the same component. There's also an edge from C to C prime, so you can also, it turns
out, reach everything in C prime from that vertex.
And that means that while you're still exploring that first vertex in C, you actually explore everything
else in C and everything in C prime as a subroutine.
This means that, because of the way subroutines work, you have to finish exploring all those vertices
before you finish that first vertex of C. So that one vertex in C has the largest post number of any
vertex in either component.
The second case is that you visit C prime first. Here the claim is that you can't actually reach C from C
prime, because if you could reach C from C prime, then you could follow the edge from C back to C
prime and you'd have a cycle. And this is not just any cycle; this is a cycle in the metagraph, which you
can't have, because the metagraph must be a DAG. And this means that you can't reach C from C
prime: when you explore C prime, you'll never actually find C in the middle of that exploration. So, in
fact, you have to finish exploring C prime before you can even begin exploring C. And so, once again,
the vertex with the largest post has to be in C.
Okay, so what does this mean? If you look at the vertex with the single largest postorder number in
the entire graph, what can we say about it?
Well, it has to come from a component with no other components pointing into it. That vertex needs
to be in a source component, which is great. It's almost what we wanted. What we wanted was a vertex
in a sink component. Well, there's a trick for doing this. So if you have a graph, we're going to define
what's called the Reverse Graph, which is just what you get by taking a graph reversing the direction
of all the edges. So if a graph on the left is G, the graph on the right is the reverse graph. The edges
are all the same, they're just pointing in the opposite directions. Now the cute thing here is that the
reverse graph and G both have the same strongly connected components. I mean, v and w are
connected if you can go from v to w and w back to v. But if you reverse all the edges, you can just
follow those paths in the opposite directions, from v to w and w back to v.
And so, the strongly connected components are the same, but the edges in the metagraph are
different. Because if you have a source component in the reverse graph, that means that you have
edges coming out of it but not into it. Well, when we reverse the edges to get the original graph G,
the edges come into it and not out, and you find the sink component.
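Computing the reverse graph is a single pass over the edges; a small sketch under the same assumed adjacency-dict representation (the function name is hypothetical):

def reverse_graph(adj):
    """Return the reverse graph: every edge v -> w becomes w -> v."""
    rev = {v: [] for v in adj}
    for v in adj:
        for w in adj[v]:
            rev[w].append(v)
    return rev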
So in order to find a vertex in a sink component of G, what you do is run depth-first search on
the reverse graph, which, by the way, is easy to compute: you just take every edge and reverse its
direction. We run depth-first search on the reverse graph and take the vertex with the largest
postorder.
Okay, so just to review, which of the following is true? The vertex with the largest postorder number
in the reverse graph is in a sink component? The vertex with the largest preorder number is in a sink
component? Or the vertex with the smallest postorder number is in a sink component?
Well, all of these sound roughly equivalent. But if you work them out, only one of them is true. The
first one. The vertex with the largest postorder number is in a sink component. The other ones sound
plausible, but just don't work.
Okay, but this gives us an algorithm, and the point is the following. We run depth-first search on the
reverse graph. We let v be the vertex with the largest post number; this has to be in a sink
component of G. We now explore v, and because it was in a sink component, the vertices we find
form exactly a strongly connected component of G.
So, we take them, we remove them from G and then we repeat.
So, here's our graph. We depth-first search the reverse graph; the largest post number is 18. So we
explore from that vertex and find one vertex. That's our first component. We now remove it from the
graph and try again.
Depth-first search on the reverse graph: 16 is the largest post. We explore from there and find this
new component. Great. Depth-first search the reverse graph, find the largest post, explore that: we
have a third component. Depth-first search the reverse graph: 10 is the largest, and we explore from
that. We find these four vertices as a component, and then there's the one vertex left over. So that
gives us our strongly connected components.
Unfortunately, this algorithm's a little bit inefficient again. Because we need to run depth first search
repeatedly on the reverse graph.
But it turns out that that's actually unnecessary. Because remember, the theorem that we had was
that if you had an edge between two components, the one that was further upstream always had the
larger post order.
In fact, when we reversed the edges, now it's the one downstream that has the largest postorder, but
no matter. The point is that after you remove the sink component, if you then look for the vertex with
the single largest remaining postorder, that's going to be in a sink component of the new graph. It
doesn't point to anything else, except for maybe some components that you've already removed.
And so, in the new algorithm, we just run depth-first search once on the reverse graph, and then we
consider the vertices of the graph in reverse postorder. For any v that has not yet been visited, we
explore from v and mark the vertices that we find as a new strongly connected component.
So we have this graph. We depth first search on the reverse graph. We record the post numbers. Now
the largest post is 18. We explore that vertex, find this component.
Next is 17, which we explore and find this component.
Then 15 finds this one, 10 finds these four, and 6 finds the last one, and that's it. A much faster
algorithm. In fact, for the runtime, we essentially just ran one depth-first search on the reverse graph
and then another depth-first search on G, with the slight modification that we visit the vertices in the
outer loop in a specific order. And we also need to record the connected components that we found.
But basically, this is just two depth-first searches, so the runtime is linear, O(V + E), and this gives us a
nice, efficient algorithm for finding our strongly connected components.
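Putting the pieces together, here is a minimal Python sketch of the whole procedure. It reuses the reverse_graph helper sketched earlier; the adjacency-dict representation and the function names are assumptions of these examples, not the course's code:

def strongly_connected_components(adj):
    """Two passes: DFS postorder on the reverse graph, then explore
    G in reverse postorder, peeling off one sink component at a time."""
    rev = reverse_graph(adj)  # helper sketched earlier

    visited, postorder = set(), []
    def dfs(v):                     # first pass, on the reverse graph
        visited.add(v)
        for w in rev[v]:
            if w not in visited:
                dfs(w)
        postorder.append(v)
    for v in rev:
        if v not in visited:
            dfs(v)

    visited.clear()
    components = []
    for v in reversed(postorder):   # largest post number first
        if v not in visited:
            comp, stack = [], [v]   # explore v in the original graph
            visited.add(v)
            while stack:
                x = stack.pop()
                comp.append(x)
                for w in adj[x]:
                    if w not in visited:
                        visited.add(w)
                        stack.append(w)
            components.append(comp)
    return components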
Well, that completes our unit on graph exploration and decomposition algorithms. These tell us
how to find ways to get from one vertex in a graph to another. However, when you're actually
trying to solve these problems in practice, you don't just want any route on your map that gets you
from where you are in San Diego to, say, Los Angeles. The route that you find might pass through New
York and Florida on the way, and you don't want that. What you really want is to find an
efficient path.
How do I get there which spends as little time as possible, or maybe as little money as possible. And
this is what we're going to start talking about in the next unit where Michael is going to be talking to
you about how to find shortest paths in graphs.
So, I hope you enjoyed this unit and will come back for the next one.

Slides and External References

Slides
09_graph_decomposition_8_strongly-connected-components.pdf
09_graph_decomposition_9_computing-sccs.pdf
09_graph_decomposition_7_topological-sort.pdf
09_graph_decomposition_6_dags.pdf

Reading
Sections 3.3 and 3.4 in [DPV]

Visualizations
Topological sort using depth-first search by David Galles
Topological sort using indegree array by David Galles
Strongly connected components by David Galles

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Programming Assignment 2: Decomposition of Graphs
Welcome to your second programming assignment of the Algorithms on Graphs class! In this
assignment, we focus on directed graphs and their parts.

(The instructions and starter files can be found in the first week programming assignment archive
file.)

Week 3

Paths in Graphs 1

In this module you will study algorithms for finding Shortest Paths in Graphs. These algorithms
have lots of applications. When you launch a navigation app on your smartphone like Google
Maps or Yandex.Navi, it uses these algorithms to find you the fastest route from work to home,
from home to school, etc. When you search for airplane tickets, these algorithms are used to find
a route with the minimum number of plane changes. Unexpectedly, these algorithms can also be
used to determine the optimal way to do currency exchange, sometimes allowing you to earn a huge
profit! We will cover all these applications, and you will learn Breadth-First Search, Dijkstra's
Algorithm and Bellman-Ford Algorithm. These algorithms are efficient and lay the foundation for
even more efficient algorithms which you will learn and implement in the Shortest Paths Capstone
Project to find best routes on real maps of cities and countries, find distances between people in
Social Networks. In the end you will be able to find Shortest Paths efficiently in any Graph. This
week we will study the Breadth-First Search algorithm.
Key Concepts
Explain what a shortest path is
Describe algorithms for computing shortest paths in undirected graphs
Create a program for finding an optimal flight


This week's lectures:
Most Direct Route
Breadth-First Search
Breadth-First Search (continued)
Implementation and Analysis
Proof of Correctness
Proof of Correctness (continued)
Shortest-Path Tree
Reconstructing the Shortest Path
Reading: Slides and External References
Programming Assignment 3: Paths in Graphs

Most Direct Route


Hi, in this module you will study shortest paths in graphs and algorithms to find them. These
algorithms have lots of applications. For example, when you want to go from one city to another, you
don't want to switch your transport many times. And in the travel planning systems, there are
algorithms that help you to minimize the number of such switches, for example, to get from Hamburg
to Moscow with the minimum possible number of flight segments.
When you start a navigation app on your smartphone to get home from work faster, one of these
algorithms is used to give you the fastest route possible. Also, one of these algorithms is used right
now to direct the network packets with data through the Internet so that you can watch this video
online.
In this module, we will start with the most direct route problem about flight segments. We will then
consider the problem of getting from point A to point B with the fastest possible route. And in the
end, we'll consider the problem about currency exchange, which doesn't seem to be a problem about
graphs or shortest paths, but actually the same algorithms will help you to exchange currency in the
optimal, most profitable way.
Let's start with the most direct route problem. It is formulated very simply. What is the minimum
number of flight segments to get from one city to another? For example, if we look at this map you
could go from Hamburg to Moscow with five flight segments, but this is obviously not optimal
because you could go from Hamburg to Moscow with a direct flight. We can consider it as a graph
problem on a graph where nodes correspond to cities and directed edges correspond to available
flights from one city to another. The edges are directed because flights can be available one way and
not available another one, for example, because there are no tickets left. And, of course, in the typical
real world graph there are many more cities and many more possible flights but this is just an
illustrative example. So here the graph problem is to get from the node corresponding to Hamburg to
the node corresponding to Moscow. And one way to do that is to use these five edges corresponding to
the five flight segments we saw on the map, but this is obviously not optimal in terms of minimization
of number of edges. For example, we could get from Hamburg to Paris first, and then from Paris to
Moscow, and that would be just two edges. Or in this case, you could go just directly from Hamburg
to Moscow, and obviously this is the optimal way, to use just one flight segment, which is not always
possible. For example, to get from Hamburg to Helsinki, you'd need at least three edges on this graph.
Also notice that as soon as we formulate it as a graph problem, we don't need to name the nodes like
Hamburg or Moscow. We can just rename them A, B, C, D, E, and then solve the problem on an
abstract graph. So let's talk about paths in graphs.
So, we define length of the path L(P), where P is the path, as the number of edges in this path. For
example, if we consider the path from D to B, which consists of edges from D to E, then from E to S,
from S to A, and from A to B, then the length of this path is 4. And note that this is an undirected
graph, but we will also look at the directed example soon. Another example of a path is from D to B
again, but through S. From D to S to C to B, and the length of this path is 3. Now the distance between
two vertices or nodes in the graph is the length of the shortest possible path between them. For
example, the distance between D and B is just 3, because there is a path from D to B which contains
just three edges, and there are no paths that contain fewer than three edges.
And the distance from C to A is just 2, because there is a path through S. There is another one through
B, which is also of length 2, but there is no direct edge from C to A. The situation changes a little bit
when the graph is directed because not all edges can be taken. And so distance from D to B in this
graph, which looks like the same but has directed edges, is now 4 instead of 3 in the undirected case.
This is because we cannot take the edge from D to S because it is directed in the wrong direction. So
distance from D to B is 4 because there is this path in green and there is no shorter path.
And the distance from C to A in this case is infinity, because there is actually no path from C to A:
you cannot go from C to F, and even if you go from C to B, you cannot go from B to A, because the
edge is going in the wrong direction.
PPT slides (very important)



Now let's consider the paths from some origin node S. It turns out that to find the shortest path from
A to B is not simpler than finding shortest paths from A to all other nodes in the graph, or at least
almost all the other nodes in the graph. And so we will be studying the problem of finding shortest
path from one starting node S to all other nodes in the graph. When we select the starting node, we
can lay out the same graph in a different way using distance layers. In this case, we have three
distance layers. The first one is layer 0, which contains just node S, which is a distance 0 from itself.
Then we have layer 1, which contains four nodes a distance 1 from node S. And then we have layer 2,
which contains just one node, B, which is a distance 2 from node S.
If we added another node to this graph, for example, node F, which is only connected with B, then it
will have 4 layers, and additional layer 3 contains just node F, which is a distance 3 from node S. Note
that there cannot be an edge from D to F in such layered graph because otherwise there would be a
path from S to F of length 2, S, D, F, and then F would be in the layer 2 and there would be no layer 3.
Another example: there cannot be an edge from S to F, because in this case the distance from S to F
would be 1, and so F would be in layer 1. And again, there would be no layer 3. And
the general property is that the only edges which are allowed in such layered graph are edges inside
the layer, like edge from D to E in this example, and edges between a layer and the next layer, like
edges from S to layer 1, edges from A and C to B, and edge from B to F.
Now let's consider the directed case.
We directed all the edges downwards and so the layers didn't change. We still have layer 0 with just S,
layer 1 with four nodes, and layers 2 and 3 with nodes B and F correspondingly. What about
additional edges here?
For example, can F be connected to D? So yes, it can be connected to D, but in this direction, from F
to D, because this edge from F to D doesn't give us any shorter path from S to F. And so the shortest
path is still of length 3. Also, it is possible to connect F directly to S in this direction, or to connect B
directly to S because the shortest paths don't change. However, it is not possible to add an edge from
C to F, because if we add this edge, the layer structure will change: the distance from S to F
will be just 2, going from S to C and from C to F. And the general property is that there cannot be any
edge from a layer to another layer which is farther from S by two or more. So there can be an edge
from a layer to the next layer. There can be an edge within a layer, like the edge from D to E in this
example. And there can be an edge from a layer to any of the previous layers, such as the green edges
in this example. But there cannot be any red edges, such as an edge from C to F, which is an edge
from the layer 1 to the layer 3, which is at least farther by two from S than layer 1. So there cannot be
any such edges in the layered distance graph. And in the next video, we will discuss an efficient
algorithm to traverse the graph layer by layer so that in the end every node is assigned to some layer,
and we know the distance to this node as the number of the layer to which it was assigned.
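This layered-graph property is easy to state as code. Below is a minimal sketch in Python of a
hypothetical checker (the function name and the adjacency-list representation adj are assumptions,
and dist is assumed to hold the distances from S computed by the breadth-first search introduced
next): it verifies that no directed edge jumps forward by two or more layers.

    def check_layered(adj, dist):
        # adj[u] is the list of nodes v with a directed edge (u, v);
        # dist[u] is the layer (distance from S) of node u.
        for u in range(len(adj)):
            for v in adj[u]:
                if dist[v] > dist[u] + 1:
                    return False   # a forbidden "red" edge like C -> F
        return True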

Breadth-First Search
Hi, in this video we will discuss breadth-first search, an efficient algorithm to find distances from an
origin node to all the nodes in the graph. And we will use the layered distance structure of the graph
from the previous video. We will start from the version of the algorithm which is easier to
understand. We marked a blue node as S; it will be our origin, and we will find distances from this
origin to all the nodes in the graph.
We'll start with a state in which all the nodes are colored white. A node colored white means that this
node hasn't yet been discovered. During the algorithm, we'll discover some nodes, and then we will
process them. When we discover a node, we color it grey. At first, we discover the origin node S. And
when we process a node, we color it black. And of course, after discovering the origin node, we start
processing it.
We will process the graph layer by layer. We'll start with layer zero, which consists only of the node S,
and we'll start processing it. When we start processing a layer, we take all of the edges outgoing from
this layer and we discover all of the nodes at the ends of those edges.
So basically, when we start from layer zero, we discover the whole layer one, because the edges from
the layer zero node, S, go to layer one nodes. Of course, there could be an additional edge from S to
itself, but then we would ignore this edge because it goes to a node we've already discovered. In this
case, there is no such edge, so all the ends of the edges going from S are new nodes, white nodes,
which are not yet discovered, and we color them with gray to mark the fact that we have discovered
them. After we have discovered all of them, the whole layer one, we start processing all those nodes
simultaneously. So we process all six nodes of layer one. And to do that, we take all the edges
outgoing from those nodes and discover the new nodes at the ends of those edges. You may notice
that there are a few red edges. Those are the edges which go from the nodes in layer one, but which
go to nodes that were already discovered previously: to S and to the nodes of the same layer one.
And there are a lot of bold black edges, which also go from the nodes of layer one, which we are now
processing, but they go to the new nodes, which were not discovered previously. And we mark those
nodes with gray.
And when we process the edges from layer one, we get all the nodes of layer two, because the edges
from layer one go to layer two, to the same layer, and to the previous layers. When edges go to the
same layer or to previous layers, we mark them with red and we don't do anything with the nodes at
the ends of those edges. And when edges go to the next layer, we discover the nodes of the next
layer. And of course, all the nodes of the next layer, layer two, have some incoming edge from layer
one. So after processing layer one, we've discovered the whole layer two.
And after we've discovered it, we start processing it. So we process the outer circle, all the nodes
from layer two. And to do that, we consider all the edges outgoing from them. And you see that there
are only red edges, because all the edges from layer two go either to the same layer two, or to S, or
to layer one nodes. And all of them have been discovered before that, so we don't do anything with
those nodes. So we don't discover any new nodes. And it means that we don't have anything new to
process, so our algorithm stops there.
And we now can assign each node to a layer. Obviously node S, the origin node, is in layer 0. The
nodes we discovered on the first step are the nodes of layer 1. So, they're at distance 1 from S. And
the nodes discovered on the next step are nodes of layer 2, and they're at distance 2 from S. And so,
we've found distances from S to all the nodes in the graph. But actually not to all of them, because
there can be some nodes which are not connected to S, which are not reachable from S. And such
nodes must have infinite distance from S. We'll solve this issue by initializing all the distances from S
to all other nodes with infinity, then setting the distance from S to itself to 0, and implementing the
algorithm that I've just described. Then every node which is reachable from S will get some finite
distance from S, and all the nodes which are unreachable from S will stay with their infinite distance.
Now let's look at how this same algorithm will work with an undirected graph. So we have basically
the same graph, we just removed the arrows from the edges. So all the edges became undirected,
and again we have the same origin S, marked with blue. We start with all the nodes being white,
because they are not discovered yet; we discover node S and color it with grey. And then we start
processing it and color it with black. To process it, we take all the edges outgoing from it. And you see
that now there are more edges outgoing from S, because some of the edges which were incoming
into S in the previous example are now also outgoing from S, because they are undirected edges. So,
we discovered seven nodes instead of six, as in the previous example. And after we discovered all
these seven nodes, this is our layer one. These are all the nodes at distance one from S. We
discovered them. Now, we start processing them. And to do that, we consider all the edges outgoing
from those seven nodes, all the black ones, except for the edges back to S.
And you see that some red edges appear. Those are edges inside layer one, and edges from layer one
to S. And there are a lot of bold black edges: those are the edges from the nodes of layer one to the
new nodes, which were not discovered before. And we've discovered almost all the nodes of the
outer circle, except for the one at the bottom, which was discovered before. So we have discovered
11 new nodes. We have discovered them, and now we need to process them. And to do that, we
consider all the edges outgoing from those 11 nodes. And all of those edges are red, because they all
go to the nodes we previously discovered. So nothing new is discovered, and we stop our algorithm.
And now we can assign distances. Again, the distance from S to itself is zero. The distance from S to
the nodes discovered in the first step is one; this is layer one. And the distance from S to the 11 nodes
in the outer circle is two; this is our layer two. And again we've found all the distances from S.
And we have found all the layers. Also, if there are some nodes which are not connected with S, we
initialized their distances to infinity, and so they stay with this infinite distance.

Breadth-First Search (continued)


This is basically how the algorithm works, and it is more or less clear that it defines distances
correctly, because it just goes through the graph layer by layer.
But to actually implement the algorithm, we need to do everything turn by turn. 
We cannot just take a couple of nodes and process them simultaneously. 
We need to have some order on those nodes. 
And now let's solve this problem. 
So we return to our initial example of a directed graph with origin S.
And now we want to process each node one by one. 
To achieve that, instead of processing each layer of nodes simultaneously, we will have a queue of
nodes. So the nodes will get into the queue and wait for their turn.
And as soon as every node which was in the queue before this node has already been 
processed, this node goes out of the queue, and it is being processed. 
So, when we discover a node, we put it into the queue.
And when we need to process it, we take it from the queue and process it. 
And it means that the nodes which were discovered earlier 
will also be processed earlier. 
And so, in general, the order of layers won't change, because first the layer 0 node will get into the
queue, then all the nodes from layer 1 will get into the queue. And then, after they all are processed,
we'll process the nodes of layer 2, which were discovered after the nodes of layer 1.
Now let's see how it all works.
So we discovered node S, and 
we already know by that time that the distance to S is 0. 
It is layer 0 and there are no more nodes in the layer 0.
Now we start processing this node, and to do that, we process the edges outgoing from S in some
order. It doesn't matter what the order is. It can be just the order in which they were saved in our
data structure for the graph. So we choose our first edge, and we discover the node to the right. And
we know that the distance to this node is 1, because this is a layer 1 node we discovered. And instead
of starting to process it, we go to the next edge from S and we discover another node of layer 1 and
assign it distance 1, and again, and again, and again. And so now, we have discovered all the nodes of
layer 1. And we put them all in the queue, starting from the node to the right and going in
counterclockwise order.
So now we've processed our node S and we need to process something else. What do we process?
The node which is first in the queue, and this is the node to the right of S. So we start processing this
node and go through the edges from this node in the order they were saved. I don't know that order,
so let's see. Okay, so the first edge outgoing from this node goes to another node from the first layer.
So it is a red edge, because it goes to a node which has already been discovered. We don't do
anything with this edge. Then the next edge goes to the right, and we discover a node from layer 2,
which is to the right, and we set its distance to 2. Then an edge goes to the node to the right and up,
and we also assign distance 2 to this node. And the last edge goes to the node to the right and down
from our node. And it also gets distance 2, because this is a layer 2 node.
And what do we do next? We've processed the node to the right of S and we need to process
something else. And this something else is the next node from layer 1 in the counterclockwise order.
This is the node to the right and up from node S. We start processing it. In some order we process the
edges from it. So the first edge is a red edge to the left, and we don't do anything with it. And the
second edge is a red edge to the node to the right and up from it, which is from layer 2, but it has also
already been discovered. And then the next edge gives us a new node from layer 2. And the next one
also gives us a new node from layer 2.
So we finished processing this node and we go to the next one. Again, a red edge, a red edge, a new
node discovered, a new node discovered; then the next node from layer one: a red edge, a red edge,
a new node from layer 2, and another new node. And again with another node from layer 1, and the
last node from layer 1, we process it. Okay, so we've processed all the nodes from layer 1. We've
discovered all the nodes from layer 2 in some order. I don't even remember the order of those
nodes; it is mostly counterclockwise, but not completely. So let's see in the slides what the correct
order is. So the first one to be discovered was the node to the right. We start processing it, and there
is a red edge from it, and another one. And there are no more edges from this node, so we start
processing the next one in the counterclockwise order. And there is a red edge from it, and it looks
like there are no more edges from it, so we finished processing this node. And then we go to the node
to the right and down, and there is a red edge from it, and that's all. And so, we go through the nodes
in the second layer, and all the edges from them are obviously red, because we've already discovered
everything that's connected to S. So we go and check that all the edges are red, and now we've
finished, and again we have the graph layered: node S is in layer 0, 6 nodes in layer 1, and 12 nodes in
layer 2. And if we have some node which was not connected to S, then it stays with the distance
estimate of infinity. So this is how breadth-first search actually works, and in the next video we will
discuss the pseudocode that actually implements this algorithm.

Implementation and Analysis


Hi. In this video, we will implement the breadth-first search algorithm from the previous video and
analyze its running time.
Let's look at this pseudocode. The procedure is called BFS, after Breadth-First Search, and it takes
graph G and origin node S as input parameters. It also uses an array dist to store the distances from
the origin node S to all nodes in the graph. It doesn't have to be an array; it can be a map from nodes
to distances, depending on what your nodes are. If your nodes are numbered from 0 to n minus 1,
then it will probably be convenient to use an array of size n to store distances for those nodes. But if
your nodes are labeled with some strings or some other objects, then it may be wise to use a map
from nodes to distances and call this data structure dist. Anyway, we'll use this dist data structure to
store our estimations of distances from the origin to all the nodes in the graph. And we initialize all
these distances with infinity, with the exception of the node S itself, which gets an estimation of 0
from the beginning. And we will also use another data structure Q, which is a queue, the data
structure which works on the principle of first in, first out. So the first element that goes into the
queue is the first element that goes out of the queue. And we initialize this Q with just one element,
the origin node S. And this symbolizes that we have already discovered node S and put it in the
queue. So all the discovered nodes are those which are in the queue. All of the nodes which haven't
been in the queue yet are white nodes, which are not discovered yet. And the black nodes, in the
terms of the example I showed you before, are those nodes which are already out of the queue and
are being processed or have been processed before. And we will take nodes from Q one by one,
process them, discover new nodes, and put the newly discovered nodes back into the queue. So we'll
start with Q initialized with only the starting node S. And while this Q is not empty, we take the first
element from it using the method dequeue, and we store it in variable u. So this is the first node in
the queue, and we start processing it. And to start processing it, we traverse all the edges outgoing
from this node u in the graph. So we have a for loop for all edges (u,v) in the set of edges of the
graph, and this means that we traverse all the edges which have u as their starting node and some
other node as their end.
And we'll look at node v, and now we need to determine whether this node has already been
discovered, and maybe already processed, or not. And to determine that, we'll use our dist values. If
the dist value of the node is infinity, then it means that this node hasn't been discovered yet, because
as soon as it is discovered, we change the estimation of the distance to it and it becomes finite. While
it is still infinite, it means that the node hasn't been discovered. And vice versa, if dist is finite, then
we have discovered this node already. So if this node was discovered previously, then we don't need
to do anything with it. This is a red edge, and we don't do anything with a red edge from our currently
processed node to a node which was discovered earlier. But if the edge goes to a white node which
was not discovered previously, we need to process this edge. By processing it, we first discover the
end of this edge, v, and we do that by calling enqueue, so adding this v to the end of the queue. And
we also change the estimate of the distance to this node v and set it to the distance to the current
node plus one. Because we know that when we process a node, some of the edges are red because
they go to the same layer or one of the previous layers, but the edges that go to the undiscovered
nodes go to the next layer. So the distance to v is equal to the distance to u plus 1. And we repeat and
repeat this process while our queue is not empty.
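A minimal sketch of this procedure in Python, assuming the graph is given as an adjacency list adj (a
list of lists of neighbor indices) with nodes numbered 0 to n minus 1; the function name and
representation are assumptions, not the course's own code:

    from collections import deque

    def bfs(adj, s):
        n = len(adj)
        dist = [float('inf')] * n       # infinity means "not discovered yet"
        dist[s] = 0                     # the origin is at distance 0 from itself
        q = deque([s])                  # discovered but not yet processed nodes
        while q:
            u = q.popleft()             # dequeue: start processing u
            for v in adj[u]:            # examine all edges (u, v)
                if dist[v] == float('inf'):   # v is white (undiscovered)
                    q.append(v)               # discover v: enqueue it
                    dist[v] = dist[u] + 1     # v belongs to the next layer
        return dist

For example, bfs([[1, 2], [2], [0]], 0) returns [0, 1, 1]: nodes 1 and 2 are both in layer 1.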
And we need to make a few notes. First, this infinity thing. Of course, in a real programming language,
there won't be any infinity, so you will have to use something special for that. One variant is to
estimate the maximum possible distance from the origin node to any node in the graph, and just use
a constant value or some computable value which is definitely bigger than that. For example, the
distance from the origin cannot be bigger than the number of nodes in the graph, because every path
without cycles inside it will have at most the number of nodes minus 1 edges. Or you can just say that
it is not bigger than the number of edges in the graph, because you won't use the same edge twice in
the shortest path from the origin node to your node. So anyway, you can assign infinity the value of
the number of nodes plus 1, or the number of edges plus 1, or some other big value. Another thing
you can do: instead of using integer numbers for storing distances, you create a special structure
which has two fields. One of them is the distance, if it is determined. And the other field is a boolean
field, which tells you whether the distance is defined or still not defined, which then means infinity.
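In Python, the two variants described here might look like the following rough sketch (the value of n
is assumed for illustration):

    n = 7                      # number of nodes in the graph, for example

    # Variant 1: a finite sentinel bigger than any real distance
    # (a shortest path has at most n - 1 edges).
    INF = n + 1
    dist = [INF] * n

    # Variant 2: an explicit "not defined yet" marker; None plays the
    # role of the boolean field of the two-field structure described above.
    dist = [None] * n          # None means the distance is still undefined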
Another note is about why this algorithm even stops. The key observation is that we only put a node
into the queue once, because as soon as we put this node in the queue, we enqueue it, we also
change the estimation of the distance to this node and it becomes finite. And so when we try to put it
in the queue the next time, we will compare its distance with infinity and it won't be equal to infinity.
So we won't put the node into the queue again. And it means that we put at most as many elements
into the queue as there are nodes in the graph. And on each step of the while loop, we take some
node out of this queue, so its size decreases. So the size can increase at most the number of nodes in
the graph times, and it decreases on each step. This means that the algorithm definitely stops, and it
stops after at most the number of nodes in the graph iterations of the external while loop.

Now let's estimate the running time of this algorithm more precisely. I state that the running time of
breadth-first search is proportional to the number of edges plus the number of nodes in the graph.
Why is that? First, we've already established that each vertex, each node of the graph, is enqueued at
most once. So the number of iterations of the external while loop is at most the number of nodes of
the graph. Another observation is that each edge is examined either once, for directed graphs, or
twice, for undirected graphs. Why is that? Well, the edge is examined when one of its ends is
processed. If the graph is directed, then the edge is actually examined only when its start is
processed. And if the graph is undirected, then for each of its two ends, the edge is examined when
that end is processed. Of course, this edge will discover a new node at most once out of those two
times it is examined, but still it will be examined twice, from both ends.
And of course, if the edge is not connected to the origin node, it won't be examined at all. But we can
say that each edge in any case will be processed at most twice. And what this means is that the total
number of iterations of the internal for loop is at most the number of edges of the graph. And so,
adding those up, and adding the constant number of operations at the start, and adding the
initialization of the dist values, which takes time proportional to the number of nodes of the graph, in
total we get the number of edges plus the number of nodes in the graph. And in the next video we
will prove that breadth-first search actually returns correct distances to all the nodes from the origin
node.

Proof of Correctness
Hi. In this video, we will prove the correctness of the breadth-first search algorithm for finding
distances from an origin node to all the nodes in the graph. We will also prove some properties of this
algorithm, which can be useful when you want to extend this algorithm to other kinds of problems.
First, recall that node u is called reachable from node S, 
if there is a path from S to u. 
And the lemma states that the reachable nodes are discovered during breadth-first 
search. 
And they get a finite estimate of distance from S. 
And unreachable nodes are not discovered during breadth-first search. 
And they stay with infinite distance estimate, infinite distance value.
First, let's prove it for reachable nodes. Suppose, for the sake of contradiction, that some reachable
nodes were not discovered during the breadth-first search. Then select, out of those nodes, the one
closest to S in terms of the length of the path. So, let's assume that u is the reachable node closest to
S which was not discovered during the breadth-first search.
Then take some shortest path from S to u. It goes from S to some node v1, from there to v2, and so
on up to vk, and from vk it goes to u.
Then u will actually be discovered while processing vk. Why is that? Well, because first, we will
discover and process S. Then we will discover and process v1. Then we will discover and process v2,
and so on. And we will go up to vk. And when we process vk, u will be discovered. So this is a
contradiction with the assumption we made, for the sake of contradiction, that some reachable node
is not discovered during breadth-first search. So we proved that reachable nodes are discovered. And
of course, they get a finite estimation of distance, because this is how the algorithm works: as soon as
a node is discovered, it gets an estimation of distance bigger by one than the estimation for the node
being processed.

Now let's prove this statement about unreachable nodes. Let's suppose, again, for the sake of
contradiction that some unreachable nodes were discovered. And let u be the first such unreachable
node to be discovered.
Now let's see when it was discovered. It was discovered while processing some other node v.
And as u is the first unreachable node that was discovered, and v was discovered before u (because u
is discovered while v is being processed), it means that v is a reachable node. And it means that there
is a path from S to v, but then there is a path from S to u through v. So u is actually reachable, and this
is a contradiction. So we proved that unreachable nodes are not discovered during the breadth-first
search. So the algorithm works correctly, at least in the sense that it finds some finite distances to
reachable nodes and doesn't find any finite distances to unreachable nodes.
Now we will prove the order lemma, which states something about the order in which nodes are
discovered and dequeued from the queue. It says that by the time some node u at distance d from
the origin node is dequeued and starts being processed, all the nodes at distance at most d have
already been discovered. So they have already been enqueued in the queue, and maybe some of
them have already been processed; some of them are still in the queue, but at least they have
already been discovered.
So let's prove this again by contradiction. Suppose that this lemma is not true, and consider the first
time this order was broken, so that some node u at distance d has already started being processed
(and that's why it is filled with black), and some other node v at distance d', which is at most d, has
not yet been discovered. So we know that d' is at most d, and we know that node u was discovered
while processing some other node u'.
And we know that the distance to this node u' is at least d-1, because if the distance to u' was less
than d-1, then the distance to u would be less than d because there's an edge from u' to u.
Also, we know that the node v has an edge from some node v' with distance d'-1 from s, because
there is some shortest path from s to v.
And there is a previous node before v on this path, and this is v'. And the distance to v' is exactly d'-1,
because this is a shortest path. It cannot be less than d'-1, because otherwise the distance to v would
be less than d'. And it cannot be bigger than d'-1, because then the shortest path to v would be longer
than d'.
So, the distance from s to v' is exactly d'-1. And we know that d' is at most d, and from that we know
that d'-1 is at most d-1. And it means that v' was discovered before u' was dequeued, because we
know that the first time the order lemma was broken was with nodes u and v, and u' and v' come
before that; so at that point in time the order lemma still works. So we know that v' was discovered
before u' was dequeued. It means that v' was gray before u' became black. And this means that v'
also became gray before u became gray, because u became gray while processing u'. So v' was
enqueued and filled with gray before u was enqueued, discovered, and filled with gray. What that
means is that, because our queue works as first in, first out, and v' was discovered before u, v' also
started being processed before u. So v' was dequeued and filled with black before u was dequeued,
and immediately after v' was dequeued, v would be discovered, because there is an edge from v' to v.
So either v was discovered even before that, or it was discovered during the processing of v'. And we
see that v is already discovered, and u is not yet dequeued. So this is a contradiction with the
assumption that when u was already black, v was still white and not discovered. So we proved our
order lemma by contradiction.

Proof of Correctness (continued)


Now the main result is that when a node u is discovered during breadth-first search, its dist value, the
estimate of the distance to this node from the origin, is assigned exactly the correct distance from
node S to node u. Let's prove this.
To prove it, we'll use mathematical induction. As a base case, we see that when node S is discovered
(this is the first node to be discovered), its dist value is assigned 0, and this is actually the correct
distance from S to itself. So we'll use induction on the distance to the node. The inductive step is:
suppose we have proved our statement about correct distances for all nodes which are at distance at
most k from the origin. Now we'll prove it for nodes at distance exactly k + 1. If we do that, we'll prove
the lemma itself.

So now, take a node v at distance k + 1 from the origin. We know that for all nodes which are closer,
the correct distances are found during breadth-first search. Now let's prove it for this particular node v.
So we know that v was discovered, because it is reachable, and we proved that all reachable nodes
are discovered during breadth-first search. So it was discovered while processing some other node u.
Now let's estimate the distance to u. From one point of view, we know that the distance from S to v is
at most the distance from S to u plus 1, because of the edge from u to v through which v was
discovered. And we know the distance from S to v is exactly k + 1, and that means the distance from S
to u is at least k.
On the other hand, we know that v is discovered only after u is dequeued. And using the order
lemma, we can state that the distance to u is strictly less than the distance to v, because otherwise v
would have been discovered before u was dequeued. And so the distance from S to u is strictly less
than k + 1. And we already know it is at least k. And we also know this distance is an integer. So the
only option is that the distance from S to u is exactly k. And then see what happens when we assign
the dist value for v: we assign it the dist value of u plus 1, which is k + 1, which is the same as the
distance from the origin to v. So we proved our lemma by induction: when a node is discovered, at
that point it is assigned the correct distance estimate, and it is saved in its dist value.
And the last property we want to prove, just to understand better how breadth-first search works and
to be able to apply it to some nonstandard situations, is that the queue which we use in the breadth-
first search looks like this. It first has some nodes at distance d, for some d, and maybe in the end it
has some nodes at distance d + 1, but it doesn't contain any other distances. If the first node in the
queue has distance d, then there are no nodes in the queue with distance less than d, and there are
no nodes with distance more than d + 1. And maybe there are some nodes at distance exactly d + 1,
but they all go after all the nodes at distance d in the queue. Let's prove that.
So first, we know by the order lemma that all nodes at distance d were enqueued before the first
such node was dequeued. And nodes at distance d + 1 are only found when a node at distance d is
dequeued and processed. So this means that nodes at distance d were enqueued before nodes at
distance d + 1 were enqueued. Also, we know the same thing for nodes at distance d - 1: they were all
enqueued before nodes at distance d. So by the time a node at distance d is dequeued, all the nodes
at distance d - 1 have already been dequeued from the queue. So there are no more nodes in the
queue at distance d - 1 or less.
And regarding the nodes at distance more than d + 1: they will be discovered only when we start
dequeuing nodes at distance d + 1 or more. But those nodes all go after the nodes at distance d. So at
the point when the first node in the queue has distance d, no nodes at distance more than d + 1 can
be in the queue, because no nodes at distance d + 1 have been dequeued yet, and so we couldn't
have put anything at distance more than d + 1 in the queue.
So we proved this property. Altogether, we have now proved that our algorithm finds correct
distances to all the reachable nodes in the graph, that it finds infinite distances for the unreachable
nodes in the graph, and we also know the structure of the queue at any given moment in time. In the
next lecture, we will also learn how to not only find the distance, but also reconstruct the shortest
path from the origin to the node we need. Because we don't want to just know that the distance in
terms of the number of flight segments from Moscow to San Diego is two; we also want to find those
two flight segments to actually get from Moscow to San Diego. So, see you in the next video.

Shortest-Path Tree
Hi, in this video you will learn what a shortest-path tree is, and how to use it to reconstruct the
shortest path from the origin to the node you need after finding distances with breadth-first search.
We'll need to slightly modify the breadth-first search procedure for that.
What is the shortest-path tree? On the left we see some undirected graph with nine nodes, and
suppose we selected node S as the origin. Then on the right we see the layered structure of this
graph, where S is in layer zero, four nodes are in layer one, two nodes in layer two, and two nodes in
layer three. So, we know the distances to those nodes, but we can also remember how we came to
this or that particular node during the breadth-first search. For example, we came to the node A
during breadth-first search directly from S. We draw a directed edge from A to S, saying that S is the
node from which we came to A when we discovered it. We also draw an edge from B to A, saying that
in our breadth-first search algorithm we discovered B from A. So we can draw such an edge for all the
nodes but S itself, and we will get some graph. We'll prove later that it is a tree, but to make it a tree,
we also need to make it undirected, so we just remove all the arrows from the edges. And this will
already be a tree.



So the lemma states that the shortest-path tree, as we defined it now, is indeed a tree, that is, it
doesn't contain any cycles. The fact that it is connected holds by construction: we only add to the tree
those nodes which we reached during breadth-first search, and we always connect a newly
discovered node to some node which was already in the tree, so the graph is obviously connected.
We only need to prove that there are no cycles in this graph.
Let's prove that, as usual, by contradiction. Suppose there is a cycle in the shortest-path tree, and
suppose this is a cycle of length 5, A-B-C-D-E. Now, this is an undirected cycle, but the edges of the
cycle were directed somehow initially. So, for example, the edge between A and B can go either from
A to B or from B to A, but it doesn't really matter; let's assume, without loss of generality, that it goes
from A to B. Now, what we know is that in the shortest-path tree there is at most one outgoing edge
from each node, because for each node other than S we just saved the one node from which we
discovered it. So there is at most one outgoing edge from each node, and if we look at the edge
between A and E, we know that there is an outgoing edge from A already. The edge between A and E
cannot be an outgoing edge from A, because then A would have two outgoing edges. So the only way
this can work is that there is an edge from E to A.
And similarly, the edge between D and E can only be directed from D to E, and the same with edge
CD, and the same with edge BC. So now we see the directed cycle. It looks like it could work, but
unfortunately it cannot. Let's look at the distance from S to A.
We know that when we go along a directed edge of the shortest-path graph, we go from a node to
the node from which it was discovered. And we know that the distance to the newly discovered node
is assigned the distance to the node from which it was discovered, plus one. So if we go along such an
edge to the parent in the shortest-path tree, we decrease our distance from S by exactly one: we end
up in a node which is closer to S by exactly 1. So if we go from A to B, the distance to S decreases by 1.
If we go from B to C, it decreases by 1. So we start with some distance d from S to A, and then, when
we go along edge AB, we end up at node B, and we know that the distance from S to B is at most d
minus 1. Then the distance to C is at most d minus 2, and the distance to D is at most d minus 3, and d
minus 4 for E. And now we reach the conclusion that the distance to A is at most d minus 5. So d is at
most d minus 5, which is a contradiction, which cannot happen. So by contradiction we have proved
that there indeed cannot be any cycles in the shortest-path tree.



Now, how to construct the shortest-path tree? We've defined it in such a way that this is very easy to
do. We only need to add two statements to the code of the BFS procedure. We need another array or
map, which is called prev; in prev we will store, for each node, the previous node from which it was
discovered. We initialize it with a pointer to nowhere, a special pointer nil, which basically means we
don't have any previous node yet. And when we discover a node, we not only update the distance to
this node, but we also save the node from which it was discovered. This is the only moment when
prev changes, because we won't discover this node again later and we won't change its prev. So
that's the whole code for constructing the shortest-path tree. Now for every node we know what its
parent in the shortest-path tree is.
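Building on the BFS sketch from earlier, the two additions described here might look like this in
Python (again assuming an adjacency-list representation; the names are illustrative, not the course's
own code):

    from collections import deque

    def bfs_with_tree(adj, s):
        n = len(adj)
        dist = [float('inf')] * n
        prev = [None] * n               # added: None plays the role of nil
        dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if dist[v] == float('inf'):
                    q.append(v)
                    dist[v] = dist[u] + 1
                    prev[v] = u         # added: remember where v was discovered from
        return dist, prev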

Reconstructing the Shortest Path


How do we reconstruct the shortest path, given the shortest-path tree? The nice property of the
shortest-path tree is that it somehow contains all of the shortest paths from the origin to all the
nodes. Why is that? Because if we go along an edge of the shortest-path tree, we decrease the
distance by exactly one. So if we start with some node and go along the directed edges of the
shortest-path tree one by one, we will each time decrease the distance by one. And so it means that
if the distance to the node was D, then after exactly D steps we will end up in the origin node,
because we will be at distance zero. So we will make D steps, and we will have a path of length D,
where D is the shortest path length from S to this node. So we'll have the shortest path itself, because
its length is the same as the distance to this node. So what do we need to do in the code? We write
down the procedure ReconstructPath, which takes the origin node S, the node u for which we need to
reconstruct the path, and the prev which the BFS procedure built for us. The result variable will store
the path itself. So we start with an empty path. Then we go back from node u until we come to node
S. We start with our node u, and while it is not yet equal to S, we append it to the shortest path, and
then we go along the edge of the shortest-path tree by assigning prev of u to u. And we repeat,
repeat, repeat it until we come to node S. Then u will be equal to S, and our while loop will stop.
By this time, we have the shortest path from S to u in the result variable, but for one thing: it is in the
reverse order. We have first the end node, which is u, then we have the prev of u, then the prev of
the prev of u, and so on up to S. So the path is in the reverse order. So to return the actual shortest
path from the origin to node u, we need to reverse this path, and then we return the result. Of
course, this procedure works fast: it only needs a number of steps which is equal to the distance from
node S to node u, which is definitely less than or equal to the number of nodes in the graph, for
example. But in most cases, it will be even less than that.
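A sketch of this reconstruction in Python, paired with the bfs_with_tree sketch above (the decision to
include the origin S itself in the returned path is an assumption of this sketch):

    def reconstruct_path(s, u, prev):
        result = []
        while u != s:                  # walk the parent pointers back to s
            result.append(u)
            u = prev[u]
        result.append(s)               # include the origin itself
        result.reverse()               # the path was collected in reverse order
        return result

For example, with dist, prev = bfs_with_tree(adj, s), the call reconstruct_path(s, u, prev) returns a
shortest path from s to u as a list of nodes.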
So in conclusion, we can now find the minimum number of flight segments to get from one city to
another, if we have the graph of cities and the available flights between them. We can not only find
this number of flights, but actually reconstruct the optimal path between one city and another; for
example, how to go from Moscow to San Diego in the minimum number of flight segments. We also
can build the tree of all shortest paths from one origin. So we not only can build a shortest path from
one node to another, but we actually can build all the shortest paths from one node to all the nodes
in the graph. And all this works in time proportional to the number of edges plus the number of
nodes in the graph.

Slides and External References

Slides
10_shortest_paths_in_graphs_1_bfs.pdf (PDF file)

Reading
Sections 4.1 and 4.2 in [DPV]

If you find this lesson difficult to follow


Section on breadth-first search in the Algorithms class by Tom Cormen and Devin Balkcom at Khan
Academy

Visualizations
 Breadth-first search by David Galles

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Programming Assignment 3: Paths in Graphs
Week 4
Algorithms on Graphs
Paths in Graphs 2

This week we continue to study shortest paths in graphs. You will learn Dijkstra's algorithm, which can be
applied to find the shortest route home from work. You will also learn the Bellman-Ford algorithm, which
can unexpectedly be applied to choose the optimal way of exchanging currencies. By the end you will be
able to find shortest paths efficiently in any graph.
Key Concepts
 Explain algorithms for finding shortest paths in weighted graphs
 Create a program for finding a cheapest flight
 Create a program for detecting anomalies in currency exchange rates


Fastest Route
 Video: Fastest Route (6 min)
 Video: Naive Algorithm (10 min)
 Video: Dijkstra's Algorithm: Intuition and Example (7 min)
 Video: Dijkstra's Algorithm: Implementation (3 min)
 Video: Dijkstra's Algorithm: Proof of Correctness (4 min)
 Video: Dijkstra's Algorithm: Running Time (7 min)
 Reading: Slides and External References (10 min)

Currency Exchange
 Video: Currency Exchange (6 min)
 Video: Currency Exchange: Reduction to Shortest Paths (8 min)
 Video: Bellman-Ford Algorithm (6 min)
 Video: Bellman-Ford Algorithm: Proof of Correctness (6 min)
 Video: Negative Cycles (7 min)
 Video: Infinite Arbitrage (10 min)
 Reading: Slides and External References (10 min)

Programming Assignment
 Programming Assignment 4: Paths in Graphs (3 h)

Fastest Route
Hi, in this lecture you will learn the algorithm for finding fastest routes, which is used, for example,
when you open your navigation app to get home from work faster, or by logistics companies when
they want to deliver goods to their customers in time, using fewer cars and fewer people.
So let's first state the problem. It's really easy to do. For example, what is the fastest route to get
home from work, or, more generally, what is the fastest route from point A to point B right now?
Below, you see a screenshot from a popular navigation app, Yandex Navigator, which is given the task
of finding the fastest route from the point you are currently in, which is marked by a circle with the
letter Y in it, to the finish, which is your destination. And it suggests to you a few variants. The default
suggestion is the fastest route, but there are two others. There may be, for example, a route which
uses fewer turns, or a route which is easiest for a novice driver. And sometimes there can be a
shorter route, which is not the fastest one, but the shortest one in terms of distance. So in this
problem, we can represent the city or the country as a graph, where the nodes are some positions in
the country, for example the crossroads, and the edges are the roads that connect those crossroads.
And there are weights on those edges; the weight of an edge is, in this case, the time in seconds or in
minutes you need to get from the start of the edge to the end of the edge. And what you need to find
is the shortest path in this graph, but not the shortest path in terms of the number of edges,
but the shortest path in terms of the sum of the weights of those edges. So if you want to get from A
to B in the smallest amount of time possible, then you take some path with some edges, with weights
on those edges corresponding to the time, and you just sum up those weights of the edges, and you
get the total time it will take you to get from A to B. You want the fastest route? Then you need the
shortest path in terms of the sum of the weights of the edges on this path.
Let's see now: does the breadth-first search algorithm from the previous lesson maybe help us? Let's
look at this graph. Here, we see that node A has a direct edge to point B, where you need to get. And
so, what our breadth-first search would say is that you already know the optimal path from A to B,
because there is a direct edge, and so there is no point in going through some other nodes; just go
directly from A to B, right?
Well, this doesn't work. For example, in this case, let's suppose that it takes 5 hours to get from A to
B, and the corresponding weight of the edge from A to B is 5. And let's suppose there is another node,
C, such that there is an edge from A to C, which takes 2 hours to go through, and another edge from C
to B, which also takes 2 hours to get through. This looks a little bit strange, but if there is a traffic jam
on the road from A to B, and the roads from A to C and from C to B are free roads where there is no
traffic jam, then this can happen. And then we see that going from A to B through C will take just 4
hours, while going directly from A to B takes 5 hours, and so our breadth-first search algorithm
doesn't work.
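To make this example concrete, a weighted graph might be stored as adjacency lists of
(neighbor, weight) pairs, and the length of a route is then just the sum of the edge weights along it.
This is a sketch under assumed names, in Python:

    # adj[u] is a list of (v, w) pairs: an edge u -> v taking w hours.
    adj = [
        [(1, 5), (2, 2)],   # node 0 (A): A->B takes 5 hours, A->C takes 2
        [],                 # node 1 (B): no outgoing edges
        [(1, 2)],           # node 2 (C): C->B takes 2 hours
    ]

    def path_time(path, adj):
        # Sum the weights of the consecutive edges along the node sequence.
        total = 0
        for u, v in zip(path, path[1:]):
            total += next(w for x, w in adj[u] if x == v)
        return total

    print(path_time([0, 1], adj))     # direct A->B: 5
    print(path_time([0, 2, 1], adj))  # A->C->B: 4, the faster route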
So maybe it's vice versa; maybe it's always better to go around some node, not to go directly. But
that's also not the case, because if the edge from A to B takes 3 hours, and the edges from A to C and
from C to B don't change (we just removed the traffic jam from A to B, and it became 3 hours instead
of 5), it is now better to go directly from A to B. So there is no universal rule to go directly or not to
go directly. We need something more clever than that to solve our shortest or fastest route problem.
Now let's gain some intuition about this problem. Let's look at the graph below, and assume that we
are at origin node S. And we only know that the edge from S to B takes 3 hours and the edge from S
to C takes 5 hours. So can we be sure that the distance from S to C is equal to 5, that the optimal
route from S to C will take 5 hours?
So no, this is not the case, because for example, the edge from B to C can have weight 1, and so, the
shortest route from S to C will be from S to B and from B to C, which is only 4 hours instead of 5 hours
if we go directly. So we cannot be sure that the fastest route from S to C is 5 hours.
Now, another question is can we be sure that the distance from S to B is equal to 3?
And in this case, the answer is yes, because if we don't go directly from S to B, what other options do
we have? We can only go from S to C and then go along some other edges to come to B. But if we go
from S to C, we have already spent 5 hours, and after that, we also spend some time to get to B. And
the time is non-negative, so we cannot end up spending less than 3 hours; we actually cannot end up
spending less than 5 hours. So going from S to B directly in 3 hours is the best option for us. In terms
of the graph and the weights of the edges in it, this means that there are no negative weights on the
edges. All the weights are non-negative numbers, and so we cannot decrease the length of a path by
adding more edges to it. So if we already have a path of length 3, and all other paths start by
spending more time, we cannot improve on this direct path. And in the next video, we'll use this idea
to create a naive algorithm that solves our fastest route problem.

Naive Algorithm
Hi. In this video, we'll solve the fastest route problem using the idea from the previous video, but we
will do it with a naive algorithm, which we will then need to improve in the next video.
First, let's make an observation: if we have some optimal path, some fastest route or some shortest
route, then any part of it, between some node in the middle and some other node in the middle, is
also optimal in the same sense. Let's prove it.
Consider an optimal path from some origin S to some node t, and consider some two vertices u and v
on this path. They can coincide with s or t, or they can be somewhere in the middle. Suppose there
were some shorter path from u to v, which is marked with green. Then we would be able to go from s
to u along the old path, take the shorter path from u to v as a shortcut, and then go along the old
path from v to t. And in total, this new path would be shorter than the initial path, which was optimal.
So this cannot happen, and so we know that any part of an optimal path is also optimal itself.
A corollary from that is that if there is some shortest path from S to node t, and u is the previous
node on that path, then the distance from the origin to the final destination t is equal to the distance
from the origin to node u, which is the previous node, plus the weight of the edge between u and t.
And we will use this property to improve our estimates of the distances from the origin to the nodes
gradually.



You remember that in the breadth-first search algorithm we had an array or map called dist, and we
used these dist values to store our estimation of the distance from the origin to each particular node.
In the breadth-first search, the dist values started from infinity, and as soon as they got updated, they
became the correct distances from the origin to the corresponding node. This will not be the case in
this algorithm: these distances may change several times before they become correct. But in the end
they will become correct, and during the process these distances will be upper bounds on the actual
distance. So they will be more than or equal to the real distance from the origin to the corresponding
node. We will use a procedure called edge relaxation, and it means the following. We take the edge
(u, v). We check: is it better to go to v through the optimal currently known path from S to u, and
then follow the edge from u to v? Does it improve our current upper bound on the distance to v, or
not? So, we have some upper bound, the dist value dist of v, and we achieved it maybe using u on the
way or not. We might have come to v in a different way, and we might have come using the same
distance as if we go from s to u and then follow the edge (u, v), or we could have used a longer path.
So we just check whether it is possible to improve our current estimation of the distance to v using
the distance to u plus the weight of the edge from u to v.
So here is the code of the relaxation procedure. It takes one edge from u to v as input, and it checks
whether the current distance estimate of v is bigger than the current distance estimate of u plus the
weight of the edge. If so, then we can definitely come to node u from s by a path of length at most
dist of u, and if we then follow the edge (u, v) with weight w(u, v), we will improve the distance
estimate of v. So in this case we update this distance estimate (we always decrease it, as you see),
and we also remember that the node from which we came into v is now u. Remember that in the
breadth-first search algorithm, we stored in prev the node from which each node was discovered. In
this case, we use the same data structure to store the node from which we updated the distance to
our node the last time.
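A minimal sketch of this procedure in Python, assuming dist and prev are lists indexed by node and w
is a dictionary mapping edges (u, v) to their weights (these representations are assumptions):

    def relax(u, v, w, dist, prev):
        # Try to improve the distance estimate of v by going to u first
        # and then following the edge (u, v).
        if dist[v] > dist[u] + w[(u, v)]:
            dist[v] = dist[u] + w[(u, v)]
            prev[v] = u          # remember where the improvement came from
            return True          # the relaxation was effective
        return False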
Now, we're ready to suggest a naive algorithm to solve our fastest route problem. The procedure
Naive takes graph G and origin node S as inputs. It uses dist values and prev values the same way as
the breadth-first search: we initialize the dist[u] values with infinity and the prev[u] values with a
pointer to nowhere, and we also initialize the dist of the origin node with zero. And then, the only
thing we do is relax all the edges. More specifically, on each iteration of the do-while loop, we try to
relax each of the edges in the graph. And if at least one relaxation is effective, that is, some dist value
changes, then we continue to the next iteration. And we only stop when a whole iteration, after going
through all the edges in the graph, couldn't relax anything.
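A sketch of the whole procedure in Python, reusing the relaxation idea above (the edge-list
representation is an assumption):

    def naive(edges, w, n, s):
        # edges: a list of (u, v) pairs; w: dict of edge weights;
        # nodes are numbered 0 .. n-1; s is the origin.
        dist = [float('inf')] * n
        prev = [None] * n
        dist[s] = 0
        changed = True
        while changed:                   # repeat until a full pass relaxes nothing
            changed = False
            for u, v in edges:
                if dist[u] + w[(u, v)] < dist[v]:
                    dist[v] = dist[u] + w[(u, v)]
                    prev[v] = u
                    changed = True
        return dist, prev

As argued below, this loop terminates, because every effective relaxation strictly decreases some dist
value, and each dist value is bounded from below by the true distance.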
And we state that this naive algorithm works: basically, that it stops, and that when it stops, it has
found the correct distances to all the nodes.
To prove that, assume for the sake of contradiction that at some point no edge can be relaxed, and
there is a vertex v such that the dist value of this vertex is bigger than the actual distance from the
origin to this node. We know that this estimation of the distance cannot become less than the actual
distance, because it is always an upper bound: it starts from infinity, and it is only decreased when we
find a better path from the origin to this node. So there is some path from the origin to this node of
length exactly dist[v], and so dist[v] cannot be less than the actual distance from the origin to this
node. And this also means that there can actually be no such situation that we keep doing relaxations
through many iterations and never stop. Because after any successful relaxation, some distance
estimation is decreased by at least one. If we started with infinity, then the value just becomes finite,
but that can happen at most the number of nodes times. And if the value was already finite, it
decreases by at least one. And if we start with some set of finite values which are bigger than the
actual distances, and at each iteration we decrease at least one distance by at least one, this process
cannot be infinite. At some point it will come to the stage when the distance and the distance
estimate are the same, and so this edge cannot be relaxed anymore. And if that happens for all the
edges, our algorithm will stop. So our algorithm definitely stops. The question is whether it arrives at
exactly the dist values which are equal to the distances from the origin to the corresponding nodes.
For contradiction, we assume that it does not, and that for at least some node v the dist value is
bigger than the actual distance from the origin. Then we consider some shortest path from S to this
node v.
Node v is definitely broken, in the sense that its dist value is bigger than the correct distance, but there can be some other nodes on this path from S to v with the same property, which are also broken. Let u be the first node, counting from S, on this path which is broken. u is definitely not the same as S, because for S we know that the dist value is zero and the correct distance is zero. So u is at least the second node on the path, or maybe much later. In particular, there is a previous node on the path before u, which we denote by p.

Now let's look at S, p, and u. What we know is that p is not a broken node, so its dist value is the same as the distance from the origin to p. We also know that the distance from S to u is actually equal to the distance from S to p plus the weight of the edge from p to u. Why is that? Because the part of the path from S to u is optimal, since it is part of an optimal path from S to v, and the part of the path from S to p is also optimal. So the first equality holds: the distance from S to u equals the distance from S to p plus the weight of the edge from p to u. But we also know that the distance from S to p is equal to the dist value of p, so the second equality holds as well: the distance from S to u equals dist[p] plus the weight of the edge from p to u.
But we also know that node u is broken: the dist value of u is strictly bigger than the correct distance from s to u, which in turn is equal to dist[p] plus the weight of the edge from p to u. And this inequality is exactly the property we check to determine whether the edge from p to u can be relaxed. So on one hand we know that this edge can be relaxed, and on the other hand we know that no more edges in our graph can be relaxed, because our naive algorithm stopped; this is a contradiction. So we have proved by contradiction that our naive algorithm returns correct distances from the origin to all the nodes in the graph. We won't analyze the running time of this algorithm here, because in the next video we'll improve this algorithm and then estimate the running time of the improved version.

Dijkstra's Algorithm: Intuition and Example


Hi, in this video, you will learn Dijkstra's Algorithm, which is an efficient algorithm to find shortest paths from the origin node to all the nodes in a graph with weighted edges where all the weights are non-negative. First, let's build our intuition. Say we are at the node S, and this is the origin node. What we know for sure is that the distance from this node to itself is 0. We don't need to see anything else to decide that, because all the edge weights are non-negative, so it's best just to stay at S, and this is the shortest path from S to itself.
Now let's look at all the edges outgoing from node S, and let's relax all of them. We see that there are edges of length 5 and 10, going to nodes A and B. So we can determine that the dist value of A is 5 and the dist value of B is 10 after relaxation. Now, if there are no more edges outgoing from S, we already know that the distance from S to A is exactly 5, because we cannot go around it and spend less.

Ppt slides very important


So we're sure about the distance to A. Now let's relax all the edges from A. We will see that the distance to B improves: it becomes 5 plus 3, which is 8. We also discover two more nodes, C and D, and we relax the edges from A to them. C gets an estimate of 12, because it's 5 plus 7, and D gets an estimate of 6, which is 5 plus 1, where 1 is the weight of the edge from A to D. So the question is: what is the next node for which we already know the correct distance for sure?
And this node is D, because it has the smallest dist value estimate, which means there is no way to go around it. We can reach it from S through A at distance 6, but if we go through any of the other nodes, it will take us at least 8 to get there, plus some non-negative amount after that. So we cannot improve this dist value estimate of 6: we now know that the distance to D is exactly 6, and we can continue by relaxing edges from D. For B and C, on the other hand, it is still possible that their dist values are larger than the actual distances. For example, if there is an edge from D to B of length 1, then the dist value of B will improve further, to 7. And if there is an edge from D to C of length 1, then the distance to C will also become just 7, which is much less than 12.
So now we have the intuition that, at any moment, if we have relaxed all the edges outgoing from some set of nodes for which the distances are already known correctly, then the node with the smallest dist value estimate outside that set is also a node for which we know the distance correctly. The main idea of Dijkstra's Algorithm is that we maintain a set R of vertices for which the dist value is already set correctly; we call this set R the known region. Initially the known region contains only the node S, because for this node we definitely know that the distance to it is 0. Then, on each iteration, we take the node with the minimum dist value among all nodes outside this region, add it to the known region, and relax all its outgoing edges. After a number of iterations equal to the number of nodes in the graph, we will know the correct distances from the origin to all the nodes in the graph.
Ppt slides
Let's see how it works in an example. Here we have six nodes and some edges, and the edge weights are written next to the corresponding edges. On top of each node (and under the bottom nodes) we have the current dist value. We start with infinity in all the nodes except the origin node on the left, which has a dist value of 0. Now we add this node to the known region, and we now know for sure that the distance from the origin to it is 0. So we color the dist value green, because we are sure this is the correct distance from the origin to this node, then we color the node black and start processing it. We color nodes black when we start processing them, in the same way we did in breadth-first search. Now we relax all the outgoing edges from this node. First we relax the edge of length 3 and improve the dist value from infinity to 3. Next, we traverse the edge of length 10 and improve the dist value from infinity to 10. Now what do we know? The node with the minimum dist value outside of the known black region is the node with dist value 3. So this node actually has distance 3; we color the distance green and the node black, and start processing it. We process the outgoing edges: first, the edge of length 8, which goes to the node with distance 10, but the new estimate is 3 plus 8, which is 11, bigger than 10.
So nothing changes because of this edge. There is also an edge of length 3, and we improve the dist value from infinity to 6. And there is an edge of length 5, which improves the dist value from infinity to 8. Now we see that the node with the minimum dist value is the node in the top right corner, which has a dist value of 6. So we color it black and the distance green and process it. One edge gives us an improvement from infinity to 8, another gives an improvement from 8 to 7, and another gives an improvement from 10 to 9. Now the best node is at the bottom right and has distance 7. It improves the distance estimate of the rightmost node from 8 to 7, and notice that there is an edge of weight exactly 0. This is allowed because 0 is non-negative: we only forbid edges from being negative, and non-negative weights, including 0, are okay.
Now we don't have any more outgoing edges from the nodes to the right and at the bottom, so the next node to process is the rightmost node, with an estimate of 7. But there are no outgoing edges from it either, so we process the last remaining node, which has distance 9. We can try to relax its edges, but none of them works, because this is the farthest node from the start. So now we know all the distances from the origin to all the nodes in the graph, and this is how Dijkstra's algorithm works on this example.

Dijkstra's Algorithm: Implementation


Now let's implement this algorithm. The procedure Dijkstra again takes a graph G and an origin node S as input. It will also use dist values and prev values: it initializes the dist values with infinity, except for the origin node S, whose dist value is 0, and it initializes the prev values with pointers to nowhere.
It will also use some data structure, which can be an array or a priority queue, depending on your choice; the running time will depend heavily on that choice. Either way, this is a data structure for which we need just three operations: create it from a set of dist values, extract the minimum value from it (to get the node outside the known region with the minimum dist value), and change the dist value of some node which is currently in the data structure. We will talk later about which data structure to choose, but for example we can use an array here: just create an array from all the dist values and then work with it. While this data structure is not empty, we take out of it the node with the minimum dist value. The data structure contains only those nodes which are not yet in the known region: initially it contains node s and all the other nodes, but after the first iteration we extract node s from this data structure H, because node s has a dist value of 0 while all other nodes have dist value infinity, so node s has the minimum dist value. So we extract the minimum from it, which means we take the node with the minimum dist value and remove it from the data structure. Then we process this node u: we take all the outgoing edges from u in the graph and try to relax them. To relax an edge, we check whether relaxation is possible, and if it is, we update the dist value of the node v at the end of this edge, update the prev value of v, and, most importantly, change the priority of node v in our data structure H. What this means is that we improved the dist value for node v, and this node could now potentially become the node with the minimum dist value, so we need to account for that in the data structure. For example, if we store everything in an array, we just decrease the value in the cell of the array corresponding to node v. Then, when we do ExtractMin with an array, we go through all the values of the array and find the minimum value and the node corresponding to it.
All of this happens while our data structure is not empty, which means it happens at most a number-of-nodes times: the data structure starts with all the nodes, we extract one node on each iteration, and its size decreases by one each time.
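A minimal sketch of the procedure in Python, using a plain set with a linear scan as the array-like data structure H (the graph representation with adj and w is an assumption, not the lecturer's exact pseudocode):

```python
import math

def dijkstra(adj, w, s):
    # adj: dict mapping each node to a list of its neighbors
    # w: dict mapping each edge (u, v) to its non-negative weight
    dist = {u: math.inf for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    H = set(adj)  # nodes not yet in the known region
    while H:
        u = min(H, key=lambda node: dist[node])  # ExtractMin by linear scan
        H.remove(u)
        for v in adj[u]:
            if dist[v] > dist[u] + w[(u, v)]:  # relax edge (u, v)
                dist[v] = dist[u] + w[(u, v)]
                prev[v] = u  # ChangePriority is implicit: dist is re-read on every scan
    return dist, prev
```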

Dijkstra's Algorithm: Proof of Correctness


Now let's prove that Dijkstra's algorithm actually finds correct distances from the origin node to all the nodes in the graph. The lemma states that when a node is selected as the node with the minimum dist value among all nodes outside the known region, its dist value is actually equal to the correct distance from the origin to this node. Let's prove that.
Again, by contradiction: assume this rule is broken at some point, and select the first such moment. At this moment we have some known region R, which contains node S and maybe some other nodes colored black. All the nodes colored white are outside the known region: C, D, E, F, and G in this case. They have some dist values, written in blue near the corresponding nodes. What will happen next is that node C is selected as the node with the minimum dist value among all nodes outside the known region, and we suppose that its dist value of six is wrong. What we know about dist values is that they are upper bounds on the correct distances: if we have some dist value, we can also exhibit a path of exactly that length, so the shortest path to the node is less than or equal to this value. So there can be only two cases: either the dist value equals the correct distance, which we assumed it does not, or it is strictly bigger than the correct distance. We therefore know that the dist value of six is strictly bigger than the length of the shortest path from the origin to node C. Now let's think about this shortest path. It starts at node S, which is inside the known region, and it ends at node C, which is outside of it. So somewhere on this shortest path there is an edge which goes from inside the known region to outside of it. Let's consider this edge: it can be, for example, an edge from B to F, or it can go directly into C, that doesn't matter. What matters is what happens next. After the path leaves the known region, it continues along some path of non-negative length, or maybe that remaining path is empty, when the edge from the known region goes directly into C; either way, the rest of the path is non-negative. This means that if we consider just the first part of the path, which ends at the endpoint of the edge leaving the known region, its length is also strictly less than the dist value of C, strictly less than six. So we know there is a path which starts at S, goes somewhere inside the known region, then leaves it, and whose length is strictly less than six.
But this means that this edge could be relaxed in the current situation. We know for sure that the distance estimate for node B is equal to the shortest path from S to B, because the first moment when the rule breaks is with node C; it was not broken with node B. So the distance estimate of node B is exactly equal to the shortest path length from S to B. When we add the length of the edge from B to F to it, we get the same value as when we follow the shortest path from S to C but take only the part of it which ends at node F. So if we add the dist value of B and the length of the edge from B to F, we get a value which is strictly less than six, and hence strictly less than the dist value of F, because the dist value of F is greater than or equal to the dist value of C, the selected minimum. So we could relax this edge, but we didn't, even though we know that we have relaxed all the edges from the known region to the outside. This contradicts our algorithm, and the contradiction proves that the rule is never broken: when we select the node with the minimum dist value outside the known region, its dist value is actually equal to the correct distance from the origin to this node. And this proves that Dijkstra's algorithm finds correct distances from the origin to all the nodes in the graph.

Dijkstra's Algorithm: Running Time


Now let's estimate the running time of Dijkstra's algorithm. In the beginning it just initializes dist values and prev values, which takes time proportional to the number of nodes; our final estimate will be bigger than that, so we can ignore this part. Other than that, it creates a data structure that stores dist values and supports extracting minimums. It extracts the minimum node from this data structure V times, where V is the number of nodes in the graph, and it also examines the edges. Each edge is examined at most once, when the start of this edge is processed. During processing, it updates the dist value and the prev value and changes the priority in the data structure. Updating the dist value and the prev value takes constant time, so the main cost is in the ChangePriority part.
So this running time estimate depends on how you actually implement the data structure for which we do ExtractMin and ChangePriority, and which we build with MakeQueue. One way to implement it is just using an array. Actually, we'll need two arrays. The first array has size V, and in cell number i you store the dist value of node number i. But you also need to remove nodes from this array, so you use a second, boolean array, where you store a flag indicating whether the node is still in the data structure or not. To build such arrays you need time proportional to the number of nodes, because you just write down the dist values for all the nodes and set all the flags to true. Then each ExtractMin operation takes time proportional to the number of nodes: you go through the whole array, check in the second array whether each node is still in the data structure, and if it is, update the running minimum. After you have found the minimum, you mark its flag as false, so it is no longer in the data structure, and you take the minimum value from the array. So each ExtractMin operation works in time proportional to V, and we do it V times, which gives V squared time. The ChangePriority operation is very easy in the case of an array: to change a priority you just need to change the dist value of one node, so you take the number of that node and change the corresponding value in the first array, which takes constant time. So in this case the total complexity is V + V² + E. V² is obviously bigger than V, and V² is also bigger than or equal to E, because there can be at most one edge between each pair of nodes. So V² is the leading term, and our running time is O(V²) for the array implementation.

There is another way to implement this data structure: using a binary heap, or a priority queue built on top of a binary heap. We know that building a binary heap from an array of size V takes time proportional to V. We know that ExtractMin works in logarithmic time, so the total time for ExtractMin will be V log V. The tricky part is ChangePriority. A ChangePriority operation can be implemented in a binary heap, but it is a little bit tricky, so instead of implementing an additional operation in the binary heap, we can cheat: we just insert a new element into the heap each time we need to decrease the dist value of some node. We need to improve a dist value, and to improve it means to decrease it. What we can do is simply insert another element, a pair of a node and its new dist value estimate, into the priority queue. Then, when the time comes to extract that node, the pair with the minimum dist value will be selected, and the node will be extracted with the minimum dist value found by that time. And if at some later point you also try to extract
the same node but with a different dist value, you can just ignore it: you write down that you have already processed this node, extract the stale pair from the queue, and don't do anything with it. This will increase the size of the heap, so heap operations will be a little slower, but not much slower. How many elements can there be in this priority queue? No more than the number of edges, because each ChangePriority examines some edge, and each edge is examined at most once. So the total number of ChangePriority operations is at most the number of edges, and you will add at most E new elements to the heap,
in addition to the values which were there initially. And we know that E is at most V², so the logarithm of the heap size is essentially the same as log V: it is at most 2 log V, because log(V²) is just 2 log V. So the total time for ChangePriority is proportional to E log V, and the final estimate is V log V + E log V, which is O((V + E) log V). This is much better than V² when there are few edges. If there are very many edges, on the order of V², then E log V is on the order of V² log V, which is worse than the array implementation. But if the number of edges is much less than V², that is, the graph is far from being a full graph with all the edges, then this can be much, much faster than the array-based implementation.
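A sketch of the heap-based variant with this lazy "insert instead of decrease" trick, using Python's standard heapq module (graph representation as before is an assumption):

```python
import heapq
import math

def dijkstra_heap(adj, w, s):
    dist = {u: math.inf for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    processed = set()
    heap = [(0, s)]  # pairs (dist estimate, node)
    while heap:
        d, u = heapq.heappop(heap)
        if u in processed:
            continue  # stale pair left over from an earlier priority change
        processed.add(u)
        for v in adj[u]:
            if dist[v] > dist[u] + w[(u, v)]:
                dist[v] = dist[u] + w[(u, v)]
                prev[v] = u
                # "ChangePriority": push a new pair instead of decreasing a key
                heapq.heappush(heap, (dist[v], v))
    return dist, prev
```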

In conclusion, you now know an algorithm to find the minimum time to get from work to home, for example. And you also know an algorithm to find the fastest route itself, because the path reconstruction is the same as in breadth-first search: you have the same prev values, and you can just reconstruct the path from them. You know that this algorithm works for any graph with non-negative edge weights, so you can find not only fastest routes but also shortest routes, and routes which are minimal in some other sense. And you know that it works either in time quadratic in the number of nodes or in time (V + E) log V, depending on whether you choose the array-based or the binary-heap-based implementation.
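For completeness, a minimal path-reconstruction sketch from the prev values, the same idea as in breadth-first search:

```python
def reconstruct_path(s, u, prev):
    # Walk back from u to s along prev pointers, then reverse.
    path = []
    while u is not None:
        path.append(u)
        u = prev[u]
    path.reverse()
    return path if path and path[0] == s else []  # empty list if u is unreachable
```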

Slides and External References

Slides
10_shortest_paths_in_graphs_2_dijkstra.pdf (PDF file)

Reading
Sections 4.3 and 4.4 in [DPV]
Visualizations
Dijkstra's algorithm by David Galles

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Currency Exchange
Hi, in the previous lesson you learned to find shortest paths in graphs with non-negative edge weights. In this lesson, you will learn to find shortest paths even in graphs where some of the edge weights can be negative. We will explore that using a problem about currency exchange, which doesn't seem to have anything to do with shortest paths, but we will soon find out it does.
So the problem about currency exchange is that you are given some possibilities to convert one currency into another: for example, US dollars to Russian rubles,
or rubles to Euros, or Euros to Norwegian crowns. 
And what you want to do is take your initial $1,000 and 
convert it into Russian rubles. 
Potentially doing many conversions on the way, 
such that you get as many Russian rubles as you can in the end. 
So one question is what is the maximum amount of rubles you can get? 
And another question is actually can you get as many as you want? 
And the same question about US dollars. 
Can you get as many US dollars as you want? 
It may seem that if you can get as many rubles as you want, 
then you will also be able to get as many dollars as you want. 
But that's not always the case, it depends on your restrictions. 
For example, if you lived in the USSR, you might be able to get some rubles for 
dollars you got from your foreign country trip. 
But it's very unlikely you could buy some dollars inside USSR. 
So in this problem we assume that you have some restrictions on which 
directions you can convert currencies and what are the rates. 

This is an illustration from the Wikipedia article called Triangular Arbitrage, which illustrates the fact that sometimes it is possible to make three trades, exchanging first dollars to euros, then euros to British pounds, and then pounds back to US dollars, so that you generate some profit. In this case, given $5 million, you generate a profit of $25,000 just by doing those three trades. These triangular arbitrages exist very rarely and only due to some market inefficiencies, but in theory this is possible. This example has some numbers on the edges: the rate of conversion from dollars to euros is 0.81, you get 1.19 euros per British pound, and you get 1.46 dollars per British pound. These conversion rates should also account for commissions and any other money that you don't actually get. Given those numbers, you can determine exactly how many dollars you will get from the initial 5 million if you do all the trades shown in the picture. So we will be discussing this problem: how much you can get of each currency, and whether it is actually possible to get more of the currency you had initially by doing some trades. So how does this conversion work?
If we have some path, converting first from dollars to euros, then euros to pounds, then some other conversions, until we come to Norwegian crowns and finally convert them to rubles, then how many rubles do we get per 1 US dollar? It is determined by the product of the conversion rates written on the edges. For example, if for $1 we get 0.88 euros, for 1 euro we get 0.84 pounds, and so on, and for 1 Norwegian crown we get 8.08 rubles, then for 1 US dollar we will get the product of all those numbers in rubles. That's how conversion rates work, if you again account for commissions and other payments of each trade inside the number written on the edge.
So what are the possibilities for exchanges? It may look like there is only one way: convert directly from dollars to rubles, or maybe use one intermediate currency. But actually there is an infinite number of possibilities. For example, in this particular case, you could exchange dollars to euros, euros to pounds, pounds back to dollars, go through this cycle as many times as you want, and then convert to rubles, or convert to euros and then to rubles. So there are many, many possibilities, and the result of each such path through the graph is the product of the numbers written on the edges of this path.
The problem, formulated in mathematical terms, is the following. You are given a currency exchange graph with weighted directed edges e_i between some pairs of currencies: maybe you can convert some pairs of currencies but not others. This graph doesn't have to be symmetric or undirected, so it is possible that you can convert dollars to rubles but not vice versa. Each edge has a conversion rate corresponding to its pair of currencies, written as the weight r_ei of the edge.
What you want to do is maximize the product of the weights of the edges over all possible paths, of which there are potentially infinitely many, from the node corresponding to US dollars to the node corresponding to Russian rubles in this graph.
Okay, and we could substitute these two currencies with any other pair; it doesn't matter.

Currency Exchange: Reduction to Shortest Paths


So now I want to reduce this problem to the problem about shortest paths, and we will do that using two standard tricks. First, we don't know what to do with the product, so instead of products of weights we want sums of weights, like in shortest path problems; we will replace products with sums by taking logarithms of the weights, as I will show in a minute. The other problem is that we need to maximize something here, while shortest paths are all about minimizing, so we will also have to negate the weights to solve a minimization problem instead of a maximization one.



First, let's talk about taking logarithms. It is a known rule that x is the same as 2 to the power of log x, where log is the base-2 logarithm. So we can take any product of two numbers x and y and rewrite x as 2 to the power of log x and y as 2 to the power of log y. Then xy is equal to 2 to the power of log x times 2 to the power of log y, and by the rule of summing powers, this is equal to 2 to the power of (log x + log y).
So if we want to maximize the product of x and y, this is actually the same as maximizing the sum of log x and log y, because if the sum becomes bigger, then 2 to the power of the sum becomes bigger, and if the sum becomes smaller, then 2 to the power of the sum also becomes smaller. This is true not only for 2 numbers. For example, take the 3 specific numbers 4, 1, and 1/2, which we want to multiply. On one hand, we get 2, which is 2 to the power of 1. On the other hand, if we sum up the logarithms of 4, 1, and one-half, we get the sum of 2, 0, and -1. Logarithms can be both positive and negative: they are positive when the number is bigger than 1, negative when the number is smaller than 1, and 0 when the number is equal to 1. So in this case we get a sum of 1, which is the same as the power to which we raise our 2. So you see that it works not only for 2 numbers but for several numbers. In general, to maximize the product of k numbers r_ej, it is the same as to maximize the corresponding sum of the logarithms of these numbers.
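Written compactly (base-2 logarithms, all r_ej assumed positive):

```latex
x \cdot y = 2^{\log_2 x} \cdot 2^{\log_2 y} = 2^{\log_2 x + \log_2 y},
\qquad
\arg\max \prod_{j=1}^{k} r_{e_j} \;=\; \arg\max \sum_{j=1}^{k} \log_2 r_{e_j}
```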
Note that this only works if all these numbers are positive, because we cannot take logarithms of negative values, and we also cannot take the logarithm of 0. But all the exchange rates are, happily, positive numbers, so we can just take logarithms, and we reduce our problem of maximizing a product of numbers to maximizing a sum of numbers.
Now we want to go from maximization to minimization, but that is easy: maximizing the sum of logarithms is the same as minimizing minus the sum. And since we want to work with a plain sum, not with minus a sum, we can push the minus inside the sum. So finally, maximizing the sum of logarithms is the same as minimizing the sum of minus logarithms.
Combining those two ideas, we get the following reduction: we replace each initial edge weight, the conversion rate r_ei, by minus its logarithm, -log r_ei, and we find the shortest path between the node corresponding to USD and the node corresponding to RUR in this graph.
And this is equivalent to the initial problem of how many rubles you can get from $1,000. So now it looks like we have solved the problem: we can create the currency exchange graph with the conversion rates, replace those rates with minus logarithms, find the shortest path from USD to RUR using Dijkstra's algorithm, which we learned in the previous lesson, and then perform the exchanges corresponding to the shortest path found in the graph. However, that doesn't quite work.
This is because Dijkstra's algorithm relies heavily on the fact that the shortest path from s to t goes only through vertices that are closer to s than t. This holds when the edge weights are positive, but if edge weights can be negative, it is no longer the case, as the example below shows. If we run Dijkstra's algorithm and it sees only two edges from s, one to A with weight five and one to B with weight ten, it decides that the shortest path from S to A is exactly five, because it supposedly cannot be improved. But in this example we can improve it: we go from S to B, then from B to A, and the path length is already minus ten, which is much less than five. So Dijkstra's algorithm doesn't work in such cases. Such an example is also possible in the currency exchange problem. Here is a graph with realistic conversion rates between rubles, euros, and US dollars, and our goal is to convert rubles into US dollars in the most profitable way. It turns out that if we take minus logarithms of these conversion rates, then although the number on the edge from rubles to US dollars is less than the number on the edge from rubles to euros, it is still beneficial to go through euros to US dollars, because of the negative edge between euros and US dollars. Indeed, if you multiply the conversion rate between rubles and euros by the rate between euros and dollars, the result is slightly bigger than if you convert directly from rubles to dollars.
Ppt slides
So, all problems in graphs with negative weights come from negative weight cycles. For example, in this graph we have a negative cycle A, B, C.
What it means is that we can go from A to B, then from B to C, then from C to A, and if we add those weights we get -1, so the sum of the edge weights on the cycle is negative. Because of that, if we want to find the shortest path from S to A, this is not possible. We can go from S to A using a path of length 4, but then we can go around the cycle A B C, A B C, A B C as many times as we want, and the distance will only decrease. So the distance from S to node A is actually minus infinity; the shortest path is not well defined, since you can make the path as short as you want. The same is true about nodes B and C, of course, because they are on the cycle, so you can do the same thing with them. And the same is also true about node D, because it is reachable from the cycle: we can go to the cycle, make many round trips around it, and then go to node D from either B or C. So for all these nodes the shortest path from S is minus infinity. It turns out that in the currency exchange problem, a cycle can potentially make you a billionaire, if you are lucky and have enough time. You will learn how to do that a little later in this lesson.

Bellman-Ford Algorithm
Hi, in this video you will learn the Bellman-Ford algorithm, which is an algorithm for finding shortest paths in graphs where edges can have negative weights. Actually, do you remember the naive algorithm from the previous lesson about Dijkstra's algorithm? It turns out it's not so naive, and the Bellman-Ford algorithm is almost the same as that naive algorithm. The naive algorithm just relaxed edges while dist values changed, and at some point it stopped. We didn't estimate the running time of that algorithm, but it turns out it has a benefit over Dijkstra's algorithm: it works even for negative edge weights. It is a little bit slower than Dijkstra's algorithm, but it works in graphs with any edge weights.
So here is the Bellman-Ford algorithm. It takes as input a graph and an origin node, and it will find shortest paths from this origin to all the nodes in the graph. There is an additional comment that this algorithm assumes there are no negative weight cycles in G. Otherwise it will still run, but it won't return correct distances for some of the nodes. So if you know that there are no negative weight cycles in G, this algorithm will give you the answers you need; if there are negative weight cycles, we'll discuss later what to do in that case.
This algorithm uses the same dist values and prev values as BFS and Dijkstra's algorithm, and it initializes them the same way: infinity for all distances apart from the origin node s, and prev values just pointing nowhere. Then we repeat exactly V - 1 times, where V is the number of nodes in the graph: we relax all the edges in the graph, in order.
So this is the Bellman-Ford algorithm. Actually, this "repeat V - 1 times" is excessive: we could just return to the naive algorithm's interpretation, repeating relaxation of all the edges until no relaxation is possible. This can be done, and it will even work faster than Bellman-Ford in the case when there are no negative weight cycles. However, we write the pseudocode in this form just because it is easier to prove the algorithm's correctness this way. But you should know that if on some of these V - 1 iterations nothing changed, no edge was actually relaxed, we can just stop there, and the distances will already be correct.
Now let's estimate the running time of this algorithm. I state that it is proportional to the product of the number of nodes and the number of edges. That is longer than Dijkstra's algorithm, which was V squared with the array-based implementation and even (V + E) log V with the binary heap implementation. So this is slower, but it works with negative edge weights, which is good. Initially we just initialize dist and prev values in time proportional to the number of nodes, and then we do V - 1 iterations. Each iteration takes time proportional to the number of edges, because relaxation is a constant-time procedure. So in total we get time proportional to VE.
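A minimal Bellman-Ford sketch in Python under these assumptions (edges is a list of (u, v, weight) triples; no negative weight cycles), including the early exit mentioned above:

```python
import math

def bellman_ford(edges, nodes, s):
    dist = {u: math.inf for u in nodes}
    prev = {u: None for u in nodes}
    dist[s] = 0
    for _ in range(len(nodes) - 1):  # repeat V - 1 times
        changed = False
        for u, v, w in edges:  # relax every edge
            if dist[v] > dist[u] + w:
                dist[v] = dist[u] + w
                prev[v] = u
                changed = True
        if not changed:  # nothing relaxed: distances are already final
            break
    return dist, prev
```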
PPT Slides
Now we'll look at an example of how the Bellman-Ford algorithm works on a real graph. The origin is, again, s; the numbers 0 and infinity in blue are the dist values, and the numbers near the edges are the edge weights.
What we do is take all the edges in order, starting from the edge with weight 4, and try to relax each one. We improve the dist value of A from infinity to 4. We take the next edge, with length 3, and improve the dist value of B from infinity to 3. We take the edge with weight -2 and further improve the dist value of B from 3 to 2. Then we consider the edge from A to C and improve infinity to 8, then the edge from B to C and improve it further from 8 to -1, then the edge from B to D and improve infinity to 3, and finally the edge from C to D and improve 3 to 1.
This is just one iteration of the Bellman-Ford algorithm. Now let's see what happens on the next iteration. We consider the edge from S to A: it doesn't improve anything. The next edge doesn't improve anything, and neither does the one after it.
This edge also doesn't improve anything, and neither do the remaining ones. So already on the second iteration, nothing can be improved, which means we can actually stop the algorithm after just one more iteration. Instead of on the order of VE operations, that takes only about E extra operations, because we made just one more pass over all the edges. Often it will work just this way in practice, but for some graphs, of course, we'll need many more iterations. So now we know the final distances that the Bellman-Ford algorithm returns, and in the next video we will prove that this algorithm returns correct distances in the absence of negative weight cycles.

Bellman-Ford Algorithm: Proof of Correctness


Hi, in this video we will prove that the Bellman-Ford algorithm returns correct distances from the origin node to all the nodes in the graph, in the absence of negative weight cycles.
First, we need the following lemma: after k iterations of relaxations inside the Bellman-Ford algorithm, for any k and any node u, dist[u] is equal to the shortest path length from S to u among all the paths that contain at most k edges. So not all possible paths, just paths with 0, 1, 2, or at most k edges. For example, after one iteration, we claim that the dist value of every node equals the length of the best path from S to it consisting of 0 or 1 edges. We'll prove this lemma by mathematical induction. The base case is after 0 iterations: all dist values are infinity and dist[S] = 0. This is correct, because for S there is a path of 0 edges, which has length 0; the correct distance from S to S is 0, and the dist value is also 0. For all other nodes there is no path from S that contains 0 edges, so the shortest path length over paths with 0 edges is infinity.
Now the induction step: having proved the claim for paths with at most k edges, we need to prove it for paths with at most k + 1 edges.

We know that before iteration number k + 1, dist[u] is the smallest length of a path from S to u which contains at most k edges. What happens on the (k + 1)-th iteration? Each path from S to u enters u through one of its incoming edges, say from some node v into node u. In particular, paths with k + 1 edges go this way: they use k edges to reach some node v, and then follow the edge from v to u. So when we try relaxing the edge from v to u, we compare the current dist value, which is the smallest length among paths with at most k edges, with the smallest length of a path from S to u which contains at most k + 1 edges and goes through v. All paths to u which contain at most k + 1 edges go through one of the incoming edges (v, u), so over all such edges we compare with all possible paths from S to u that contain at most k + 1 edges. Thus, if before the iteration we had the best paths with at most k edges, after the new iteration we have the best paths with at most k + 1 edges, because we add one last edge from v to u to the best path with at most k edges from S to v.
So the lemma is proved, and now we have two corollaries. First, if a graph doesn't have any negative weight cycles, then the Bellman-Ford algorithm correctly finds all distances from the starting node S. Why is that? Because it does V - 1 iterations, and after V - 1 iterations, for each node, its dist value contains the length of the shortest path among all paths with at most V - 1 edges. But if there are no negative cycles, then every shortest path contains at most V - 1 edges: if a path contains at least V edges, then there is a cycle inside it, and that cycle is non-negative, so we can just remove it from the path, and the path will improve or stay the same. So any shortest path contains V - 1 edges or fewer, and the Bellman-Ford algorithm correctly finds the shortest paths for each node. The second corollary is a bit harder:
even if there is a negative cycle in the graph, that doesn't mean that there is no correct distance estimate from the origin to some particular node, because that node may not be reachable from any of the negative weight cycles. So if there is a node u such that no negative weight cycle can reach it, then the Bellman-Ford algorithm will correctly find the distance to this node, for the same reason: if we have any shortest path from S to u,
it cannot contain negative cycles, because u is not reachable from any negative weight cycle. So any cycle on a path from S to u must be non-negative, and we can just remove it from the path, which will improve or stay the same. So some shortest path from S to u contains no cycles, and that means it has at most V - 1 edges. This means the Bellman-Ford algorithm will return the correct value for the distance from S to u. In the next video, we will look into what to do with negative cycles when they are present in the graph, and how to use them for your own profit, potentially making yourself a billionaire through currency exchange.

Negative Cycles
Hi, in this video you will learn to deal with negative cycles in graphs: to detect whether there is a negative cycle in a graph, and to find a negative cycle when you know there is one. You will use that in the next video to achieve infinite arbitrage. But before that, you really need to learn to detect negative cycles, and this lemma will help you.
It basically says that there is a negative weight cycle in the graph if and only if, when you make one additional iteration of relaxation in the Bellman-Ford algorithm, some edge is relaxed.
First, let's prove that if there is some change on this last step of the Bellman-Ford algorithm, then there is a negative cycle. Assume for the sake of contradiction that there are no negative cycles. Then all shortest paths from S to any node in the graph contain at most V - 1 edges; we've already discussed that: if a path contains at least V edges, then there is a cycle on it, and this cycle is non-negative, so it can be removed, and the path will either improve or remain a shortest path. So no dist value can actually be updated on iteration number V, because we have already found the optimal shortest path lengths among paths with at most V - 1 edges, and we cannot improve anything after that. This contradicts the fact that the last iteration actually relaxes some edge. So if it does, there is a negative cycle in the graph.
Now let's prove the other direction: if there is a negative weight cycle, then there will be a relaxation on iteration number V. Suppose again, for the sake of contradiction, that there is a negative weight cycle, for example a cycle of length three, a, b, c, a, but there are no relaxations on iteration number V.
Now let's look at the following three inequalities, written out below. They state that the edges (a, b), (b, c), and (c, a) cannot be relaxed on the V-th iteration. If we sum these inequalities, on the left side we get the sum of the dist values of a, b, and c, and on the right side we get the sum of the dist values of a, b, and c plus the sum of the weights of the edges of the cycle. From this it follows that the sum of the edge weights is non-negative. But we know that the cycle is a negative weight cycle, so this is a contradiction, and we have proved by contradiction that there will be some relaxation on iteration number V. So we proved our lemma, and now we know how to detect negative weight cycles in the graph: we just do one additional iteration of the Bellman-Ford algorithm, and it doesn't increase its asymptotic running time.
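Written out, the three no-relaxation conditions and what their sum gives (dist denotes the distance estimates, w the edge weights):

```latex
\begin{aligned}
\operatorname{dist}[b] &\le \operatorname{dist}[a] + w(a, b)\\
\operatorname{dist}[c] &\le \operatorname{dist}[b] + w(b, c)\\
\operatorname{dist}[a] &\le \operatorname{dist}[c] + w(c, a)
\end{aligned}
\quad\Longrightarrow\quad
0 \le w(a, b) + w(b, c) + w(c, a)
```

which contradicts the assumption that the cycle a, b, c, a has negative total weight.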
And now we can determine whether the graph is good or bad, whether it contains negative cycles or
not.
Now, we want not only to detect whether there is a negative cycle, but also to find some negative cycle itself, at least one; there may be many of them, but we want to find one. First, we still need to run exactly V iterations of Bellman-Ford relaxations. As we know, if there is a negative cycle, then some node v will be relaxed, its dist value decreased, on the last iteration, so we save this node. I claim that this node is definitely reachable from a negative cycle: if it were not reachable from a negative cycle, then any shortest path to it would contain at most V - 1 edges, and it would not be improved on iteration number V.
So v is reachable from a negative cycle, and moreover it is reachable from a negative cycle within at most V steps. Indeed, take a shortest path from s to v which contains a negative cycle inside; if there were more than V edges from this negative weight cycle to v, then there would be another cycle on that part of the path, and it would also have to be negative, because otherwise we could remove it from the shortest path. So v is definitely reachable from a negative cycle within at most V steps.
We also remember, in the prev values, the previous node from which each node was last updated. So if we follow the prev links from v at least V times, we are certain to be on the negative weight cycle, because we return to the cycle within at most V steps, and after that we just keep going around the cycle. So after going back V times, we are on the negative cycle. Then the only thing we need to do is save the node where we are, and then go around the cycle once, until we come back to the same node; the nodes we saw during this round trip are the nodes of the negative cycle. This is the algorithm, and it runs in the time of the Bellman-Ford algorithm plus time proportional to the number of nodes, so basically the same running time as Bellman-Ford.
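A sketch of this detect-and-recover procedure in Python, building on the bellman_ford edge-list representation above (it assumes, as argued in the text, that when an edge is relaxed on the V-th iteration the saved node really leads back to a negative cycle via prev links):

```python
import math

def find_negative_cycle(edges, nodes, s):
    dist = {u: math.inf for u in nodes}
    prev = {u: None for u in nodes}
    dist[s] = 0
    relaxed = None
    for _ in range(len(nodes)):  # V iterations: one more than plain Bellman-Ford
        relaxed = None
        for u, v, w in edges:
            if dist[v] > dist[u] + w:
                dist[v] = dist[u] + w
                prev[v] = u
                relaxed = v
    if relaxed is None:
        return None  # no negative cycle reachable from s
    y = relaxed
    for _ in range(len(nodes)):  # walk back V times to land on the cycle
        y = prev[y]
    cycle = [y]  # go around the cycle once, collecting its nodes
    x = prev[y]
    while x != y:
        cycle.append(x)
        x = prev[x]
    cycle.reverse()
    return cycle
```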



Now, one more question: if you have a negative cycle in your graph of minus logarithms, does it mean that it is always possible to get as many rubles as you want from $1,000? Unfortunately, that's not always the case. For example, in this graph there is a negative cycle between euros, British pounds, and Norwegian crowns. You can check that if you multiply the conversion rates on the edges of this triangle, you get more than one, which means that for one euro you can get more than one euro. This in turn means you can achieve infinite arbitrage just by making trades along the edges of this triangle as many times as you want. But in this particular example there is no possibility to exchange euros for dollars. So you cannot exchange rubles for euros, make many loops around this negative weight cycle, and then exchange the euros for dollars. The negative cycle alone is not sufficient: in this particular case you cannot actually get any dollars from rubles. In the next video, you will learn how to determine whether infinite arbitrage from rubles to dollars, or from any currency to any other currency, is possible, and not only to detect that, but to actually find a way to implement it when it is possible.

Infinite Arbitrage

Hi! In this video, you will finally learn how to detect whether infinite arbitrage from some source currency to some target currency is possible, and not only that, you will also learn how to actually implement it when it is. But first, let's establish a criterion for the existence of infinite arbitrage. The lemma states the following: consider a source currency S, which we want to exchange into some target currency u, getting as much of u as we want from a unit of S. This is possible if and only if the node u, corresponding to the currency u in the currency graph, is reachable from some node w which was relaxed on iteration number V, where V is the number of nodes in the currency graph, that is, the number of different currencies we consider.
So we run the Bellman-Ford algorithm for exactly V iterations, and if some node w was relaxed on the last iteration and some node u is reachable from it, then it is possible to get as much as you want of u from the source currency S. And vice versa: if it is possible to get as much as you want of currency u, then u must be reachable from some node w which is relaxed on the last iteration of the Bellman-Ford algorithm.



Let's prove this lemma. First, we prove that if currency u is reachable from a node w which was relaxed on iteration number V, then infinite arbitrage is possible. To see that, note that since w was relaxed on iteration number V, we know from the previous video that w is reachable from some negative weight cycle, which in turn is reachable from the node S.
It corresponds to the picture on the slide.
So we can get as much as we want of currency w from currency S: we go from S to a node x on the negative cycle which is reachable from S, then go around the negative cycle as many times as we want, returning to x each time, and then go from x to w. This way we get as much as we want of currency w. And we know that u is reachable from w, w is reachable from the negative cycle, and the negative cycle is reachable from S. So we can use this whole route: from S to the negative cycle, around the cycle as many times as we want, then to w, then to u. This way we get as much as we want of currency u, so infinite arbitrage is indeed possible.
Now let's prove the other direction: if we can get as much as we want of currency u from currency S, then u must be reachable from some node w which is relaxed on the last iteration of the Bellman-Ford algorithm.
First, let L be the length of the shortest path from S to u with at most V - 1 edges.
Then we know that after V - 1 iterations of the Bellman-Ford algorithm, dist[u] is equal to exactly L.
We also know that infinite arbitrage from S to u exists, so there exists some path strictly shorter than L, because we can find a path of arbitrarily small length; in particular, there is a path shorter than L. So dist[u] will be decreased at some point and become less than L. But we know that by iteration number V - 1 it is still equal to L, not less. So it will be decreased further on some iteration after iteration V - 1: either on iteration number V or on one of the later iterations.
Now note that if at some iteration some edge from node x to node y was not relaxed, and node x was
not relaxed itself, then the edge (x, y) will not be relaxed on the next iteration, because nothing has
changed: dist[x] is the same, dist[y] is still not bigger than dist[x] plus the length of the edge (x, y),
and none of this changed from this iteration to the next. So this edge won't be relaxed on the next
iteration. This means that if node y was relaxed at some iteration, then there is some node x with an
edge to y which was relaxed at some previous iteration. So, basically, only those nodes which are
reachable from nodes relaxed on previous iterations can be relaxed at the current iteration. We know
that the node u will be relaxed at some iteration starting from iteration V, and this means that u is
reachable from some node, maybe the same node, which was relaxed on iteration number V. Note
that u itself can be this node: for example, u can be relaxed exactly at iteration number V, and as we
consider a node to be reachable from itself, this is a particular case in which u was relaxed on
iteration number V and u is reachable from itself. In any case, we have proved that if infinite
arbitrage from S to u is possible, then u is reachable from at least some node, maybe u itself, which
was relaxed on iteration number V.
So now we have a criterion for the existence of infinite arbitrage. How do we apply this criterion?
Here is an algorithm to detect infinite arbitrage. We first do exactly V iterations of the Bellman-Ford
algorithm, and we save all the nodes which were relaxed on the last iteration, iteration number V.
Let this set of nodes be denoted by A.
Now we put all the nodes from A into a queue, and we use this queue for a breadth-first search. So
we start our breadth-first search not from a queue which contains just one source node, but from a
queue which contains all the nodes that were relaxed on iteration number V. This breadth-first search
then finds all the nodes which are reachable from at least one of the nodes of the set A, and those
are exactly the nodes for which infinite arbitrage is possible.
So all those nodes, and only those, can have infinite arbitrage. This is a way to find all target
currencies for which infinite arbitrage from a fixed source currency S is possible. But this is not
enough: if infinite arbitrage is possible, we also want to know how to actually implement it.
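
For concreteness, here is a minimal sketch of this detection procedure in Python. It assumes the currency graph is already given as adjacency lists adj[v] of (neighbor, weight) pairs, with weights of the form -log(exchange rate) as earlier in the lesson; the function and variable names are my own, not from the course starter files.

from collections import deque

def detect_infinite_arbitrage(adj, s):
    # Run exactly |V| iterations of Bellman-Ford from s, remembering
    # which nodes are relaxed on the last iteration (the set A).
    n = len(adj)
    inf = float('inf')
    dist = [inf] * n
    dist[s] = 0
    relaxed_last = set()
    for _ in range(n):
        relaxed_last = set()
        for v in range(n):
            if dist[v] == inf:
                continue
            for u, w in adj[v]:
                if dist[v] + w < dist[u]:
                    dist[u] = dist[v] + w
                    relaxed_last.add(u)
    # Breadth-first search started from all nodes of A at once: every
    # node reachable from A admits infinite arbitrage from s.
    visited = set(relaxed_last)
    q = deque(relaxed_last)
    while q:
        v = q.popleft()
        for u, _ in adj[v]:
            if u not in visited:
                visited.add(u)
                q.append(u)
    return visited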
And here is the next algorithm. Suppose we have already detected that infinite arbitrage to some
target currency u is possible, and we determined that using the breadth-first search from the set A.
We augment this breadth-first search to remember the parent of each visited node: when the
breadth-first search discovers a new node, it discovers it from some previously discovered node, and
that node becomes its parent. This is exactly what allows us to reconstruct the path from the source
node to the end node in the regular breadth-first search algorithm. We do the same thing here, and
this allows us to reconstruct the path to the target currency u from some node w which was relaxed
on iteration number V.
Then we use the algorithm from the previous video to find the negative cycle from which w is
reachable. We know that w was relaxed on iteration number V, so it is reachable from some negative
cycle, which is in turn reachable from S. If we go back by the parent pointers from w, we will find
this negative cycle; so we find both the negative cycle and the path from it to w. Of course, we can
also find a path from the source currency to this negative cycle, for example by launching a regular
breadth-first search from S until we encounter some node of that negative cycle. Combining all that,
we have a path from S to the negative cycle, around which we can then go as many times as we want;
then a path from the negative cycle to w; and then the path from w to u, which we already know. And
that gives us a way to implement infinite arbitrage from the source currency S to the target currency u.
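
The parent-pointer bookkeeping mentioned above can be sketched as follows (again with my own names; recovering the negative cycle itself uses the previous video's algorithm and is not repeated here):

from collections import deque

def bfs_with_parents(adj, sources):
    # BFS started from all nodes in `sources` at once; parent[v] records
    # the node from which v was first discovered.
    parent = {v: None for v in sources}
    q = deque(sources)
    while q:
        v = q.popleft()
        for u, _ in adj[v]:
            if u not in parent:
                parent[u] = v
                q.append(u)
    return parent

def path_from_source_set(parent, u):
    # Walk the parent pointers back from u to some node w of the set A;
    # reversing gives the path from w to u.
    path = []
    while u is not None:
        path.append(u)
        u = parent[u]
    return path[::-1]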
So, in conclusion, we can now find the best possible exchange rate in the case when it exists and
there is no infinite arbitrage. We can determine whether infinite arbitrage is actually possible. And
we can actually implement infinite arbitrage from a given source currency to given target currencies.
In a more general and abstract way, we can now find shortest paths not only in graphs where all the
edge weights are positive, like the graphs of navigation systems with positive times or lengths of the
edges, but also in graphs where negative edge weights are possible, such as the graphs of currency
exchange. So we can find shortest paths correctly in any graph with weighted edges. That's what
we've learned in this lesson.

Slides and External References

Slides
10_shortest_paths_in_graphs_3_bellman_ford.pdf (PDF file)
Reading
Section 4.6 in [DPV]

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Programming Assignment: Programming Assignment 4: Paths in Graphs
Week 5
Algorithms on Graphs


Minimum Spanning Trees

In this module, we study the minimum spanning tree problem. We will cover two elegant greedy
algorithms for this problem: the first one is due to Kruskal and uses the disjoint sets data structure, the
second one is due to Prim and uses the priority queue data structure. In the programming assignment for
this module you will be computing an optimal way of building roads between cities and an optimal way of
partitioning a given set of objects into clusters (a fundamental problem in data mining).
Key Concepts
 Explain what a spanning tree is
 Describe algorithms for computing minimum spanning trees
 Create an efficient program for clustering


Minimum Spanning Trees

Video: Lecture: Building a Network (9 min)
Video: Lecture: Greedy Algorithms (4 min)
Video: Lecture: Cut Property (9 min)
Video: Lecture: Kruskal's Algorithm (15 min)
Video: Lecture: Prim's Algorithm (13 min)
Reading: Slides and External References (10 min)
Programming Assignment: Programming Assignment 5: Minimum Spanning Trees (3h)

Building a Network
PPT Slides
Hello and welcome to the next module, in which we will be talking about minimum spanning trees. To
motivate this topic, consider the following toy example. Assume that we have six machines in our
office and we would like to join them by putting wires between some pairs of them, such that each
machine is reachable from any other machine. Assume further that we can put wires only between
some pairs of our machines, and for each such pair we know the cost of putting a wire between
them. For example, in this case we are allowed to put a wire between these two machines, and its
cost is five; and we are not allowed to join these two machines by a wire.
One of the optimal solutions in this case is shown here on the slide. It is not difficult to check that in
this case any machine is indeed reachable from any other machine. For example, to reach the right
machine from the left machine we would go as follows: first by this edge, then by this edge, and then
by this edge. The total cost of the shown solution is equal to 2 plus 1, which is 3, plus 3, which is 6,
plus 4, which gives us 10, plus 2, which gives us 12. So the total cost is 12, and this is actually not the
only optimal solution in this case, because instead of using this wire, we may use this wire.
The resulting solution is shown for you on the slide, and it is, again, not difficult to check that in this
case any machine is reachable from any other machine and that the total cost is equal to 12. We will
soon learn two algorithms that will allow us to justify that in this example the optimal total cost is
indeed 12. These two algorithms will also allow us to solve, very efficiently in practice, instances
consisting of thousands of machines. In our second example we have a collection of cities, and we
would like to build roads between some pairs of them such that there is a path between any two
cities, and such that the sum of the lengths of all the roads that we are going to build is as small as
possible. In this case the solution looks like this, and again we will soon learn how to find such a
solution efficiently.
Formally, the problem is stated as follows. We are given an undirected graph, and we assume that
this graph is connected. It is given together with positive edge weights. What we are looking for is a
subset of edges E' such that if we leave only these edges in the graph, then the resulting graph is
connected, and the total cost of all the edges in E' is as small as possible. So why is the problem
called minimum spanning tree? Well, minimum corresponds to the fact that we are looking for a
subset of edges of minimum total cost, or minimum total weight. It is called spanning because we are
looking for a subset of edges such that if we leave only these edges, then the resulting graph is still
connected, so it spans all the vertices of our graph. And finally, the word tree corresponds to the fact
that in each optimal solution the set E' is going to form a tree.
We will prove this fact on the next slide.
Before proving that any optimal solution E' for the minimum spanning tree problem forms a tree, let
me remind you of a few useful properties of trees. First of all, just by definition, a tree is an
undirected graph that is connected and acyclic, that is, it contains no cycles.
Recall that we usually draw trees as follows: we draw them level by level. In this case we are actually
talking about rooted trees; this is a rooted tree, and in particular, this is the root of this tree. At the
same time, this graph is also a tree, with the only difference that there is no root in this tree and it is
not drawn level by level. Once again, this graph is connected and there are no cycles in it, so it is a
tree. There is no root in this tree; however, we can take any vertex, declare it the root, and hang the
tree by this vertex, and this will then allow us to draw the graph level by level.
The next property is that a tree with n vertices necessarily has n - 1 edges. Why is that? Let me
illustrate this by drawing something. Initially we have just one vertex and zero edges, so initially the
property is satisfied: the number of edges is equal to the number of vertices minus one. Then we
introduce some new vertex and attach it by an edge to a previous vertex. Then we introduce another
vertex, and another one, and so on. So each time we introduce a new vertex, we also add exactly one
new edge. Initially we had one vertex and zero edges; now we have six vertices and five edges, so the
property is still satisfied. Note that we cannot introduce a new edge without introducing a new
vertex, because a new edge would connect two existing vertices, for example like this, and this would
necessarily produce a cycle, which would give us a graph that is not a tree.
The next property is that any connected graph in which the number of edges is equal to the number
of vertices minus one is necessarily a tree. Let me emphasize that it is important that we are talking
about connected graphs here. If a graph has the property that the number of edges is equal to the
number of vertices minus one, but it is not connected, then it is not necessarily a tree. A
counterexample is the following: assume that we have four vertices, with a cycle on three of them
and one isolated vertex. In this case we have four vertices and three edges, but this is not a tree; at
the same time, this graph is not connected: there is no path, for example, from this vertex to the
isolated vertex.
And the last property says that an undirected graph is a tree if and only if there is a unique path
between any two vertices. This is also not difficult to see. First of all, if we have a tree, then of course
there is a unique path between any two of its vertices. On the other hand, if there are two vertices
with at least two paths between them, then this gives us a cycle, which means that the graph is not a
tree.
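
These properties are also easy to check in code. Here is a small sketch (my own helper, not from the lecture) that tests whether an undirected graph given by an edge list is a tree, using the property that a tree is exactly a connected graph with n - 1 edges:

from collections import deque

def is_tree(n, edges):
    # A graph on n vertices is a tree iff it has exactly n - 1 edges
    # and is connected.
    if len(edges) != n - 1:
        return False
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Check connectivity with a BFS from vertex 0.
    seen = {0}
    q = deque([0])
    while q:
        v = q.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                q.append(u)
    return len(seen) == n

# The counterexample above: a cycle on three vertices plus an isolated
# vertex has 4 vertices and 3 edges, but is not connected, so not a tree.
assert not is_tree(4, [(0, 1), (1, 2), (2, 0)])
assert is_tree(4, [(0, 1), (1, 2), (1, 3)])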
Okay, now let me get back to proving that any optimal solution to the minimum spanning tree
problem is indeed a tree. For this, consider some candidate solution. In this case we have six vertices,
and we consider some way of connecting them. Assume that our solution looks like this. You can see
that this is not a tree, because there is a cycle on these four vertices.
Well, in this cycle, for any two of its vertices there are two paths between them: for example, for this
vertex and this vertex, we can go either this way or this way. And this is true for any cycle: if we have
a cycle and two vertices on this cycle, we can go either one way or the other. This means that there is
some redundancy here. In this particular case, we can remove, for example, this edge from the cycle,
and so from the solution. This will only decrease the total weight of the solution, and the resulting
set of edges will still be connected. This proves that any optimal solution must be acyclic, and since
any solution is required to be connected, any optimal solution is in fact a tree.

Greedy Algorithms
PPT Slides

The goal of this whole lesson is to present two greedy algorithms for the minimum spanning tree
problem. Before going into the details of these algorithms, let me present the high-level ideas of both
of them. The first one is due to Kruskal, and the second one is due to Prim.
The high-level idea of the algorithm by Kruskal is the following: we go through all edges in order of
increasing weight, and we repeatedly add the next lightest edge which doesn't produce a cycle.
Alternatively, the main idea of Prim's algorithm is to grow a tree: initially it contains just one vertex,
then we attach a new vertex to it, then another one, and we always attach the new vertex to the
current tree by the lightest available edge. Let me illustrate this on a toy example. For Kruskal's
algorithm, again, we repeatedly add the next lightest edge that doesn't produce a cycle. Initially, our
solution is empty, so we just add the lightest edge; in this case, the lightest edge has weight 1, so we
put this edge into our solution. There is also another edge of weight 1, so we add it to our solution as
well. The next one has weight 2; we add it. The next lightest available edge has weight 3. However, if
we added it to our current solution, this would produce a cycle, so we skip this edge. The next
lightest available edge has weight 4; we add it because it doesn't produce a cycle.
Then, again, we try to add the edge of weight 5, because it is the next lightest available edge.
However, it produces a cycle, so we skip this edge, and instead we add the edge of weight 6. This
gives us a solution, and we will soon justify that this method indeed gives an optimal solution.
Now, to Prim's algorithm. It works in a different way: it repeatedly grows just one tree. For this, it
first selects a root for this tree; assume that this highlighted vertex is going to be the root of the tree
that we are going to construct. At each iteration we are going to attach a new node to this tree, and
we would like to do this by using the lightest possible edge. For this vertex, we have four edges going
out of it: one of weight 4, one of weight 5, one of weight 6, and one of weight 8. In this case, we
select, of course, the edge of weight 4, so we attach it. Now our tree contains two nodes, and we
would like to attach a new node to this tree by the lightest edge.
In this case, this is the vertex in the middle; the corresponding edge has weight 1, so we attach it. The
next one has weight 2, the next one has weight 6, and finally, the last node is attached by an edge of
weight 1. So, in the next parts of this lesson, we will present the implementation details of both these
algorithms, but first we will prove that both these algorithms are correct, namely, that they produce
an optimal solution.

Cut Property
Both of the considered algorithms construct the solution iteratively. Namely, they start with an
empty solution, an empty set of edges, and at each iteration they expand the current set of edges by
one edge. But they use different strategies for selecting the next edge: Kruskal's algorithm selects the
next lightest edge that doesn't produce a cycle, while Prim's algorithm selects the next lightest edge
that attaches a new vertex to the current tree. To show that both of these algorithms are optimal, we
need to prove that these strategies are, in some sense, safe. Namely, if at the current step we have a
subset of edges E' which is a subset of some optimal solution, then adding an edge according to one
of these two strategies gives us a set of edges which is also a part of some optimal solution. This will
justify that in the end, when we have a tree, this tree is also a part of some optimal solution, which
means that it is just optimal.
So in this video, we are going to prove a lemma called the cut property, which will justify that both
these strategies are indeed safe. The formal statement of the lemma is the following. Assume that we
have a graph G with a set of vertices V and a set of edges E.
Together with this graph, we are given a subset of its edges that we call X. X is a subset of the set of
edges of our graph for which we know that it is a part of some optimal solution, a part of some
minimum spanning tree. Assume also that the set of vertices is partitioned into two parts: one part is
a subset of vertices S, and the other part is all the remaining vertices, V - S. This set of vertices S
satisfies the following property: no edge from the set X joins two vertices such that one lies in S and
the other lies in V - S. In other words, no edge from the set X crosses between S and V - S. Finally,
assume that the edge e of the initial graph satisfies the following property: it is a lightest edge that
joins a vertex from S with a vertex from V - S, that is, a lightest edge across this partition.
Then what the lemma states is that if we add the edge e to our current set X, then what we get will
also be a subset of some minimum spanning tree. In other words, adding e to X in this case is a safe
move.
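
In symbols (my notation, following the definitions above), the lemma reads:

\text{If } X \subseteq T \text{ for some MST } T,\ \text{no edge of } X \text{ crosses } (S, V \setminus S),\ \text{and } e \text{ is a lightest edge across } (S, V \setminus S),
\text{then } X \cup \{e\} \subseteq T' \text{ for some MST } T'.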
Since this is a long statement, let me repeat it once again using a small example.
Assume that this is our graph G; so we are given a graph G. Together with this graph G we are given
some subset of its edges, shown here on the slide in blue: this is the set X.
It is just a subset of the set of edges of our graph, and assume that we know that it is a part of some
minimum spanning tree. Then consider some partition of the set of vertices into two parts: this is one
part, and the remaining vertices form the second part. This partition is required to satisfy the
following property: no edge from X (on our picture, no blue edge) joins two vertices from different
parts. In this case it is satisfied; indeed, any blue edge here joins two vertices that lie in the same
part.
Next, we consider the lightest edge of our initial graph that joins two vertices from different parts.
Assume that this edge is e, shown here in red. Then what the lemma states is that if we add this edge
to our set X, then the resulting set of edges will also be a part of some minimum spanning tree.
So this tells us that adding e to our current subset, which we grow repeatedly one edge at a time, is a
safe move. Let us now prove the cut property.
This is our graph G, and this is the subset of edges X, shown here in blue; we assume that this subset
X is a part of some minimum spanning tree which we denote by T.
In other words, X as a set of edges is a subset of some minimum spanning tree T. We also have a
partition of the set of vertices into two parts: the set of vertices S and the set of vertices V - S, all the
rest. e is a lightest edge of the initial graph that joins two vertices from different parts of the
partition; that is, e joins a vertex from S with a vertex from V - S. What we need to prove is that if we
add the edge e to our current set X, then what we get is also a part of some minimum spanning tree.
Once again: we assume that X is a part of some minimum spanning tree T,
and that e joins two vertices from different parts of the partition. And what we need to prove is that
X with e added is also a part of some, possibly different, minimum spanning tree; the minimum
spanning tree which contains X plus e is not necessarily the same as T. To prove this, we consider
two cases.

PPT Slides
First, if e happens to be a part of the minimum spanning tree T, then there is nothing to prove. Once
again, we assume that X is a part of a minimum spanning tree that we denote by T; if e is also a part
of T, then X plus e is a part of the minimum spanning tree T, and in this case we are just done.
So assume that e is not a part of the minimum spanning tree T.
Let's then consider the whole tree T. It contains X, so now the edges shown in blue show the whole
tree T.
Note that if we add the edge e to the tree T, then it produces a cycle,
because in the tree T there is a path between the endpoints of e, and when we add the edge e, this
path closes into a cycle. In our example, this is the cycle. So this is a cycle, and the edge e in the cycle
joins two vertices from different parts; say it goes from the left part to the right part. Since this is a
cycle, it must eventually, at some point, go back from the right part to the left part, which means that
there must be another edge in the tree T which also joins the two parts. We denote this edge by e',
and in our case, this is this edge.
Now, what I claim is that if we replace the edge e' by the edge e in the current tree, then what we get
is a minimum spanning tree. Why is that? First of all, the resulting tree is still connected, because we
just removed the edge e' from a cycle: the graph was connected before, and removing an edge from
a cycle cannot disconnect it. At the same time, the weight of the edge e' is at least the weight of the
edge e, because e is the lightest edge that joins the different parts of the partition. This proves that
the resulting set of edges is a tree of minimum possible weight, which in turn proves the cut
property.
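
The final exchange step can be written as a one-line calculation (my notation): the new tree is T' = (T \setminus \{e'\}) \cup \{e\}, and since w(e) \le w(e'),

w(T') = w(T) - w(e') + w(e) \le w(T),

so T' is a spanning tree of weight at most that of T, hence also a minimum spanning tree, and it contains X together with e.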

Kruskal's Algorithm

PPT Slides
We're now ready to present all the details of Kruskal's algorithm. Namely, we will prove that the
Kruskal strategy is optimal, that is, it produces a spanning tree of minimum total weight, and we will
also present the implementation details of this algorithm. Recall that the idea of this algorithm is the
following: we start with an empty set X, and we repeatedly add to this set the next lightest edge that
doesn't produce a cycle. It is not difficult to see that at any point of time the set of edges X forms a
forest, that is, a collection of trees. Let me illustrate this. Assume that we have some set of vertices.
Initially, the set X is empty, which means that each of our vertices forms a tree of size 1, namely a
tree that contains one vertex and no edges; initially, each vertex is isolated in our set X. Now, we
start adding edges: probably this is the first one, then we add this edge, then this edge, then this
edge, then this edge, for example. At this point of time, our set X consists of three trees: this is the
first tree, T1; this is the second tree, T2; and this is the third tree, T3. In particular, the tree T3
contains just one vertex; it is an isolated vertex.
Assume now that the next edge that Kruskal's algorithm is going to add is the following one. It is the
next lightest edge that doesn't produce a cycle. The first thing to note is that this edge e must join
two vertices that belong to different trees, because if they were in the same tree, this would produce
a cycle.
Okay, now we need to show that adding e is a safe move. For this, we need to use the cut property,
and in turn, to use the cut property, we need to show a partition of the set of vertices such that e is
the lightest edge in our graph that joins vertices from different parts of the partition.
To construct such a cut, let's just take all the vertices of one of these trees as one part; namely, this is
going to be the set S, one part of our partition, and all the other vertices form the other part. In this
case, we see that e is the lightest edge that joins two vertices from different parts, which in turn
means that the cut property justifies that adding e in this case is indeed a safe move. In other words,
if our current set X is a part of some minimum spanning tree, then X with e added is also a part of
some minimum spanning tree.
Once again, initially the set X in Kruskal's algorithm is empty, which means that each vertex of our
initial graph forms a separate connected component. So this is how the set X looks initially: each
vertex lies in a separate connected component. Then we start adding edges to X; this creates a forest.
In this forest, we currently have three trees: this is the first tree, this is the second one, and this is the
third one. Assume now that the next lightest edge that Kruskal's algorithm considers is the following
one.
First of all, we need to be able to check whether it joins two vertices that lie in the same tree, in
other words, that lie in the same connected component. In this case, they lie in the same connected
component, so Kruskal's algorithm will not add it to the set X, because otherwise it would produce a
cycle in our set X. Now, assume that the next edge that Kruskal's algorithm tries is the following one.
Again, we need to check whether the corresponding two endpoints lie in the same connected
component. In this case, they do not: they lie in different connected components. So we add this
edge, and at this point, we should update the data structures that we use, to indicate that we have
now merged the trees T1 and T2. So what we need to be able to do is to check whether two given
vertices lie in the same connected component, and, if they lie in different connected components, to
merge the corresponding two trees. The perfect choice of a data structure for this algorithm is, of
course, the disjoint sets data structure. Once again, to check whether two given vertices lie in
different connected components, we just check whether Find of one endpoint is equal to Find of the
other endpoint of this edge; if they are different, then the endpoints lie in different connected
components. And when adding an edge to the set X, we need to merge the corresponding two trees,
which is done by calling the method Union on the corresponding two endpoints. We will now
illustrate this on a toy example.
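
Before the toy example, here is a minimal disjoint sets (union-find) sketch with the union by rank and path compression heuristics discussed later in this lesson; the class shape and names are my own, not the course starter code:

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))  # MakeSet for each of the n vertices
        self.rank = [0] * n

    def find(self, v):
        # Path compression: point v and its ancestors directly at the root.
        if self.parent[v] != v:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

    def union(self, u, v):
        # Union by rank: attach the shallower tree under the deeper one.
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1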
PPT Slides
This is a toy example where we have six vertices; let's call them A, B, C, D, E, F, and let's assume that
we have a disjoint sets data structure; let me show its contents. Initially, each vertex lies in a separate
set. Now we start processing edges in order of non-decreasing weight. The first lightest edge is AE.
We check whether A and E, at this point, lie in different connected components; for this, we call Find
of A and Find of E. This gives us different IDs, because they lie in different sets. So we add this edge
to our current solution, and we also notify our data structure that A and E now lie in the same
connected component; so now it looks like this. The next lightest edge is the edge CF. Again, we ask
our data structure whether C and F belong to the same set, and it replies that they do not, because
Find of C is not equal to Find of F, so it is safe to add this edge to our solution. We also notify our
data structure that C and F now lie in the same set by calling Union of C and F. So now C and F lie in
the same set.
The next edge is AD, and we see that A and D lie in different connected components, so we just add
this edge to the solution and also notify our data structure that we need to merge the sets that
contain the vertex A and the vertex D. So now we have three different disjoint sets in our data
structure, which actually correspond to the vertices of our three trees: the first tree contains the
vertices A, E, D; the second one contains the vertex B; and the last one contains the vertices C and F.
Now, the next lightest edge is DE, of weight 3. However, we see that D and E belong to the same
connected component. This, in turn, means that if we added the edge DE to our current solution, this
would produce a cycle. So we just skip the edge DE, and we continue to the next lightest edge. The
next one is AB, and we see that A and B lie in different connected components, so it is safe to add the
edge AB to the current solution. We also need to merge the corresponding two sets.
After this merge, our sets look as follows. Now, the lightest remaining edge is the edge BE, of weight
5; however, B and E belong to the same set, so we skip it. And the last edge that we actually add to
our solution is the edge BF.
It is of weight 8, and at this point, we also merge two sets. Now, all our vertices lie in the same
connected component, which means that we have constructed an optimal spanning tree, that is, a
spanning tree of minimum total weight.

The pseudocode of the Kruskal algorithm looks as follows. First, for each vertex in our graph, we
create a separate disjoint set; we do this by calling the MakeSet method of the disjoint sets data
structure. Then we initialize the set of edges X by the empty set. The next step is to sort all the edges
of our graph by weight. Then we process the edges in order of non-decreasing weight; this is done in
this for loop. For each such edge, we need to check whether adding it to X is safe or not, namely,
whether it produces a cycle or not. To do this, we just check whether u and v belong to different
connected components, that is, whether Find(u) is equal to Find(v) or not. If they are different, then
u and v lie in different connected components; in this case, it is safe to add the edge {u, v} to the set
X, and this is done in this line. In this case, we also need to notify our data structure that all the
vertices that before lay in the connected components of u and of v now lie in the same connected
component, because we have just joined these two trees; this is done by calling Union of u and v.
Finally, in the end, we just return the resulting set X.
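
This pseudocode translates almost line by line into Python; here is a sketch using the DisjointSet class above (the edge-list input format is my assumption, not the course starter files):

def kruskal(n, edges):
    # edges: list of (weight, u, v) triples for a connected graph on the
    # vertices 0..n-1; returns the MST as a list of (u, v) pairs.
    ds = DisjointSet(n)               # MakeSet(v) for every vertex v
    x = []                            # X <- empty set
    for w, u, v in sorted(edges):     # edges in non-decreasing weight
        if ds.find(u) != ds.find(v):  # adding {u, v} produces no cycle
            x.append((u, v))
            ds.union(u, v)            # merge the two trees
    return x

# Example: a 4-cycle with one diagonal; the three lightest edges that
# don't close a cycle are chosen.
print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]))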
It remains to estimate the running time of Kruskal's algorithm. We start by sorting the edges, which
requires O(E log E) time. This can in turn be rewritten as O(E log V^2), just because a simple graph
has at most V^2 edges. This, in turn, can be rewritten as O(E * 2 log V), because log of V squared is
equal to 2 log V, and asymptotically this is just O(E log V).
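
Written out (standard asymptotics, not specific to this lecture):

E \le V^2 \;\Longrightarrow\; \log E \le \log V^2 = 2 \log V \;\Longrightarrow\; O(E \log E) = O(E \log V).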
So this is an upper bound on the running time of the first step. Then we need to process all the edges.
For this, we make 2E calls to Find. Why 2E? Just because we process all the edges, and for each edge
we make two calls to the Find procedure: one for one endpoint and one for the other.
We also make at most V - 1 calls to the Union procedure. Why V - 1? Just because initially, when the
set X is empty, each vertex of our graph forms a separate connected component, so we have V
connected components. Each time we call Union, we merge two connected components, and at the
end of the run of our algorithm, we have just one connected component: all the vertices lie in the
same tree. So we go from V connected components down to 1, and each call to Union reduces the
number of connected components by 1, which means that we make exactly V - 1 calls to the Union
procedure. Okay, so we have roughly E calls to Find and roughly V calls to Union. Now recall that if
we implement the disjoint sets data structure as a forest, a collection of disjoint trees, and we use the
union by rank heuristic, then the upper bound on the running time of each operation is just O(log V),
which gives us the bound O((E + V) log V).
Recall also that in our case, the graph is connected, which means that E is at least V - 1, which in turn
means that E + V is O(E). This allows us to rewrite the bound as follows: the upper bound on the
running time of the second step is actually the same as for the first step, O(E log V), which gives us
that the upper bound on the running time of the whole algorithm is O(E log V). Now recall that if, in
our implementation of the disjoint sets data structure, we use both the union by rank heuristic and
the path compression heuristic, then we can state a stronger upper bound: instead of log V here, we
actually have log* V, the iterated logarithm. This gives us a stronger upper bound for the second step.
Unfortunately, this doesn't improve the total running time, because the upper bound for sorting all
the edges still dominates the upper bound for the second step. However, in some applications, the
edges are given to us already in sorted order, which means that, in this case, we do not need to spend
O(E log V) time for sorting the edges. The total running time in this case becomes O(E log* V), which
makes Kruskal's algorithm even more efficient: the running time is upper bounded by E times log* of
V, which is nearly linear.

Prim's Algorithm

In this video, we consider Prim's algorithm. Recall that in this algorithm, the set X always forms a
tree, and we repeatedly grow it: at each iteration we attach a new vertex, which is not currently in
the tree, to the current tree by the lightest possible edge. This is in fact very similar to Dijkstra's
algorithm; recall that Dijkstra's algorithm finds the shortest paths from a given source node by
constructing the tree of shortest paths from that node. We will now illustrate this on a toy example.
Consider the following toy example that we've already seen. We are now going to grow a tree which
will eventually become a minimum spanning tree. At each iteration, we are going to select a vertex
that can be attached to the current tree by the lightest possible edge. To select such a vertex, we will
use the priority queue data structure. So initially, we know nothing about the graph, so the cost of
attaching each vertex to the current tree, which serves as its priority, is just equal to infinity.
Here we show the priorities of all the vertices in the priority queue data structure that we are going
to use. Then we declare some randomly picked vertex as the root of the tree that we are going to
grow. Assume that we pick this vertex as the root. Now we change its priority to 0: it costs us nothing
to attach this vertex to the current tree, because it is the root. Next, we need to process all the edges
going out of this tree. Namely, we change the priority of this vertex to 1, because we see that there is
an edge of cost 1 that attaches this vertex to our current tree; and for this vertex, we change its
priority to 8. Okay, now we have five vertices in our priority queue: all the vertices except for the
root.
PPT Slides
The priority of one of them is equal to 1, the priority of the second one is equal to 8, and the
priorities of the remaining three vertices are equal to plus infinity. So we select the vertex with the
smallest priority, which is 1, and attach it to our current tree. So our current tree looks as follows.
Now we need to process all the edges that go out of this tree: when we add a new vertex, we need to
check whether other vertices can now be attached more cheaply through it.
Doing this, we first see that there is an edge of weight 6, so we need to change the priority of this
vertex to 6, because it can now be attached to the current tree by the edge of weight 6.
We also change the priority of this vertex to 9. Okay, now we have four vertices in our priority queue
data structure, and we select the vertex with the minimum priority, which is the vertex with priority
6. So now this vertex is also in our tree, joined by this edge to our current tree. Now we need to
process all the edges going out of this vertex to the vertices that are not currently in our tree.
When processing the edges, we change the priority of this vertex to 4,
and we change the priority of this vertex to 5: we have just found a cheaper way of connecting this
vertex to our current tree. Okay, now we have three vertices in our priority queue,
so we extract the minimum value. This gives us the following vertex, so we include it in our tree,
attaching it by this edge.
Then we need to process all the edges going out of this vertex to the vertices that are currently not in
the tree.
We see that we need to change the priority of this vertex to 1, because there is an edge of weight 1
that connects this vertex to a vertex which is currently in the tree. And we need to change this
priority to 2.
Okay, great, now we have two vertices in our priority queue, one of priority 1 and one of priority 2.
So we select this vertex. Now it is in our tree, and it is connected by this edge.
When processing all the edges going out of this vertex, we see that there is an edge of weight 3.
However, it doesn't decrease the cost of attaching the remaining vertex to the tree, so we keep the
priority of this vertex unchanged. In the last step, we extract the only remaining vertex, which is this
one. And we see that the cost of attaching this vertex to the current tree is equal to 2, and this is
done by this edge.
So, now, let's compute the total weight of the resulting tree: it is 2 plus 1 plus 4, which gives us 7,
plus 6, which is 13, plus 1, so this gives us a tree of total weight 14.

We now provide the full pseudocode of Prim's algorithm. As we've discussed before, it is very similar
to Dijkstra's algorithm.
So, once again, Prim's algorithm gradually grows a tree which eventually turns into a minimum
spanning tree. We use the priority queue data structure to keep, for each vertex, the minimum cost
of attaching it to the current tree. At each iteration, we use the priority queue to quickly find the
vertex which can be attached to the current tree by the lightest edge. Specifically, we do the
following. Initially, for each vertex, we do not know the cost of attaching it to the current tree, so we
just put infinity into the array cost. We are also going to keep an array parent, which actually defines
the tree that we grow; initially, for each vertex of our graph, we do not know its parent. Then we
select any vertex of our graph, u0, and declare this vertex the root of the tree which we are going to
grow.
We then update cost[u0] to be equal to 0: we say that the cost of attaching this vertex to the current
tree is 0, because it is already in our tree. We then create a priority queue, using the costs as
priorities. At this point, all the vertices lie in the priority queue, and the priorities of all of them
except for u0 are equal to plus infinity, while the priority of u0 is equal to 0. Then we do the
following. At each iteration, we extract the vertex v with the minimal priority out of our priority
queue. This means that we are actually attaching this vertex to the current tree. When we do this, we
also need to go through all the neighbors of the vertex v and check whether the edge that leads from
v to this neighbor has cost less than the current cost, that is, the current priority, of this neighbor.
Specifically, we do the following: when we add the vertex v to our current tree, we iterate through all
its neighbors; this is done in this for loop. So we iterate through all neighbors z of the vertex v, and
we check if z is still in the priority queue, namely, if z is not yet in our current tree, and if the current
cost of z is greater than the weight of the edge from v to z.
If so, this means that we have just found a cheaper way to attach the vertex z to the current tree. In
this case, we just update the value of cost[z]: we assign cost[z] to be equal to the weight of the edge
(v, z). We also record that if we attach the vertex z to the current tree, then this will be done through
the edge (v, z), which means that v is going to be the parent of z. Finally, we notify our priority queue
that we need to change the priority of the vertex z to the updated value cost[z].
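
Here is a runnable sketch of this pseudocode in Python, using heapq as the priority queue. Since heapq has no ChangePriority operation, the sketch pushes a new entry and skips stale ones, a common and equivalent workaround; the adjacency-list input format is my assumption:

import heapq

def prim(adj, u0=0):
    # adj[v]: list of (z, weight) pairs of an undirected connected graph.
    # Returns (total weight of the MST, parent array defining the tree).
    n = len(adj)
    inf = float('inf')
    cost = [inf] * n
    parent = [None] * n
    in_tree = [False] * n
    cost[u0] = 0
    heap = [(0, u0)]                  # priorities are attachment costs
    total = 0
    while heap:
        c, v = heapq.heappop(heap)    # ExtractMin
        if in_tree[v]:
            continue                  # stale entry: a cheaper one was used
        in_tree[v] = True
        total += c
        for z, w in adj[v]:
            # Found a cheaper way to attach z to the current tree?
            if not in_tree[z] and cost[z] > w:
                cost[z] = w
                parent[z] = v
                heapq.heappush(heap, (w, z))  # ChangePriority via re-push
    return total, parent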
Since Prim's algorithm is very similar to Dijkstra's algorithm, its running time is also similar.
Essentially, what we are doing here is the following: we make V calls to the ExtractMin procedure,
and we also make at most E calls to the ChangePriority method.
The total running time depends on the implementation of the priority queue data structure. If we
just use an array to store all the priorities, then changing a priority is very cheap, because we just
change the corresponding value in our array. However, to find the minimum value, we need to scan
the whole array. In this case, we get a V squared upper bound on the running time. Why is that? Well,
once again, because ExtractMin in this case has running time O(V),
so the V calls cost V times O(V), while ChangePriority has running time O(1): it is a constant-time
operation. Since the number of edges in the graph is at most V squared, this gives us O(V squared). If,
on the other hand, we implement our priority queue data structure as a binary heap, then both these
operations have running time O(log V). In this case, we have the running time O((V + E) log V), and
since, in our case, the graph is connected, E is at least V - 1. So this allows us to rewrite the whole
expression as O(E log V). So once again, the running time depends on the implementation: if we use
an array to implement our priority queue, this gives us V squared; if we use a binary heap, this gives
us E log V. Which one is better depends on whether our graph is dense or not.
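
As a rule of thumb (standard analysis, not from the lecture), the binary heap wins on sparse graphs and the array on dense ones, since

E \log V \le V^2 \quad\text{whenever}\quad E \le \frac{V^2}{\log V}.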
We are now ready to summarize. In this module, we considered two greedy algorithms for the
minimum spanning tree problem.
The first one is due to Kruskal and uses the following idea: at each iteration, we add the next lightest
edge to our current solution if this edge doesn't produce a cycle. To check whether the current edge
produces a cycle or not, we use the disjoint sets data structure. Namely, for the current edge {u, v},
we just check whether u and v belong to the same connected component;
if they do, we just skip the current edge. The second algorithm is due to Prim. It uses a slightly
different, but still greedy, strategy: we greedily grow a tree, and at each iteration we attach a new
vertex to the current tree by a lightest possible edge. To find such a vertex quickly, we use the
priority queue data structure.

Slides and External References

Slides
11_1_minimum_spanning_trees.pdf (PDF file)

Reading
Section 5.1 in [DPV]
Visualizations
Kruskal's algorithm by David Galles

References
[DPV] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition).
McGraw-Hill Higher Education. 2008.

Programming Assignment: Programming Assignment 5: Minimum Spanning Trees

Welcome to the third (and last) programming assignment of the Algorithms on Graphs class! In this
programming assignment you will practice implementing algorithms that compute minimum
spanning trees.

(The instructions and starter files can be found in the first week programming assignment archive
file.)

How to submit
When you're ready to submit, you can upload files for each part of the assignment on the "My
submission" tab.
Week 6
Algorithms on Graphs

Advanced Shortest Paths Project (Optional)

In this module, you will learn advanced shortest paths algorithms that in practice work thousands of
times (up to 25,000 times) faster than the classical Dijkstra's algorithm on real-world road networks
and social network graphs. You will work on a programming project based on these algorithms: you
will find the shortest paths on real maps of parts of the US and the shortest paths connecting people
in social networks. We encourage you not only to use the ideas from this module's lectures in your
implementations, but also to come up with your own ideas for speeding up the algorithm! We
encourage you to compete on the forums to see whose implementation is the fastest one :)
Key Concepts
 Develop an algorithm to find distances in the graphs of social networks such as Facebook
and internet graphs much faster than with the classical approaches
 Develop an algorithm to find distances in the real road networks faster
 Develop Bidirectional Dijkstra, A* (A-star) and Contraction Hierarchies algorithms
 Develop a solution of the central problem of delivery companies - delivery truck route
optimization on real-world road network
 Develop an algorithm to find distances in the real-world road networks thousands of times
faster than with the classical approaches


Bidirectional Dijkstra

Video: Lecture: Programming Project: Introduction (1 min)
Video: Lecture: Bidirectional Search (10 min)
Video: Lecture: Six Handshakes (6 min)
Video: Lecture: Bidirectional Dijkstra (5 min)
Video: Lecture: Finding Shortest Path after Meeting in the Middle (9 min)
Video: Lecture: Computing the Distance (2 min)
Reading: Slides and External References (10 min)

A-star Algorithm (A*)

Video: Lecture: A* Algorithm (11 min)
Video: Lecture: Performance of A* (2 min)
Video: Lecture: Bidirectional A* (6 min)
Video: Lecture: Potential Functions and Lower Bounds (5 min)
Video: Lecture: Landmarks (Optional) (10 min)
Reading: Slides and External References (10 min)

Contraction Hierarchies

Video: Lecture: Highway Hierarchies and Node Importance (7 min)
Video: Lecture: Preprocessing (7 min)
Video: Lecture: Witness Search (10 min)
Video: Lecture: Query (8 min)
Video: Lecture: Proof of Correctness (9 min)
Video: Lecture: Node Ordering (14 min)
Reading: Slides and External References (10 min)

Practice Quiz: Bidirectional Dijkstra, A* and Contraction Hierarchies (10 questions)

Programming Project

Practice Programming Assignment: Advanced Shortest Paths (30h)
