Machine Learning Week 8 Homework
Machine Learning Week 8 Homework
INT3401E 57
Solution
Step 1: Initial Setup
The initial cluster centers are:
Cluster 1: C1 = A1 = (2, 10)
Cluster 2: C2 = A4 = (5, 8)
Cluster 3: C3 = A7 = (1, 2)
1
2. Point A2 = (2, 5)
p √
d(A2 , C1 ) = (2 − 2)2 + (5 − 10)2 = 25 = 5
p √ √
d(A2 , C2 ) = (5 − 2)2 + (8 − 5)2 = 9 + 9 = 18
p √ √
d(A2 , C3 ) = (1 − 2)2 + (2 − 5)2 = 1 + 9 = 10
Assign A2 to Cluster 3
3. Point A3 = (8, 4)
p √ √
d(A3 , C1 ) = (8 − 2)2 + (4 − 10)2 = 36 + 36 = 72
p √ √
d(A3 , C2 ) = (5 − 8)2 + (8 − 4)2 = 9 + 16 = 25 = 5
p √ √
d(A3 , C3 ) = (1 − 8)2 + (2 − 4)2 = 49 + 4 = 53
Assign A3 to Cluster 2
4. Point A4 = (5, 8)
p √ √
d(A4 , C1 ) = (5 − 2)2 + (8 − 10)2 = 9 + 4 = 13
5. Point A5 = (7, 5)
p √ √
d(A5 , C1 ) = (7 − 2)2 + (5 − 10)2 = 25 + 25 = 50
p √ √
d(A5 , C2 ) = (5 − 7)2 + (8 − 5)2 = 4 + 9 = 13
p √ √
d(A5 , C3 ) = (1 − 7)2 + (2 − 5)2 = 36 + 9 = 45
Assign A5 to Cluster 2
6. Point A6 = (6, 4)
p √ √
d(A6 , C1 ) = (6 − 2)2 + (4 − 10)2 = 16 + 36 = 52
p √ √
d(A6 , C2 ) = (5 − 6)2 + (8 − 4)2 = 1 + 16 = 17
p √ √
d(A6 , C3 ) = (1 − 6)2 + (2 − 4)2 = 25 + 4 = 29
Assign A6 to Cluster 2
2
7. Point A7 = (1, 2)
p √ √
d(A7 , C1 ) = (1 − 2)2 + (2 − 10)2 = 1 + 64 = 65
p √ √
d(A7 , C2 ) = (5 − 1)2 + (8 − 2)2 = 16 + 36 = 52
d(A7 , C3 ) = 0 (since A7 is the initial center of C3 )
Assign A7 to Cluster 3
8. Point A8 = (4, 9)
p √ √
d(A8 , C1 ) = (4 − 2)2 + (9 − 10)2 = 4 + 1 = 5
p √ √
d(A8 , C2 ) = (5 − 4)2 + (8 − 9)2 = 1 + 1 = 2
p √ √
d(A8 , C3 ) = (1 − 4)2 + (2 − 9)2 = 9 + 49 = 58
Assign A8 to Cluster 2
C1 = A1 = (2, 10)
Conclusion
After one epoch, the clusters and their new centers are:
• Cluster 1: Points {A1 }, New Center: (2, 10)
• Cluster 2: Points {A3 , A4 , A5 , A6 , A8 }, New Center: (6, 6)
• Cluster 3: Points {A2 , A7 }, New Center: (1.5, 3.5)
3
Exercise 2 [DBScan].
If ϵ = 2 and min sample is 2, what are the clusters that DBScan√ would discover
with the 8 data points in Exercise 1 ? What if ϵ is increased to 10?
Solution
Step 1: Determine Core Points with ϵ = 2
A point is considered a core point if it has at least min samples points (in-
cluding itself) within a distance of ϵ = 2. We calculate the distances between
each pair of points to determine the core points.
Distance Matrix
We first compute the distance between each pair of points using the Euclidean
distance formula: p
d = (x2 − x1 )2 + (y2 − y1 )2
The distance matrix for the points is as follows (rounded to two decimal places):
A1 A2 A3 A4 A5 A6 A7 A8
A1 0 5 7.21 3.61 6.71 7.21 8.25 2.24
A2 5 0 6.32 5 5.39 5.10 3.16 4.47
A3 7.21 6.32 0 4.24 1.41 2 7.07 5
A4 3.61 5 4.24 0 3.61 4.12 6.71 1.41
A5 6.71 5.39 1.41 3.61 0 1 6.08 4.47
A6 7.21 5.10 2 4.12 1 0 5.39 5
A7 8.25 3.16 7.07 6.71 6.08 5.39 0 7.28
A8 2.24 4.47 5 1.41 4.47 5 7.28 0
4
2. Remaining points A1 , A2 , A4 , A7 , and A8 do not have enough neighbors
within ϵ = 2 to form a cluster. These points are labeled as noise.
Result for ϵ = 2
With ϵ = 2 and min samples = 2: - We have one cluster: {A3 , A5 , A6 } - The
points A1 , A2 , A4 , A7 , and A8 are noise.
√
Step 4: Increase ϵ to 10
√
Now, we increase ϵ to 10 ≈ 3.16 and re-evaluate.
√
Identify Core Points with ϵ = 10
√
With ϵ = 10 and min samples = 2, we identify the core points again.
- A1 : Has A4 and A8 within ϵ. - A3 : Has A5 and A6 within ϵ. - A4 : Has A1
and A8 within ϵ. - A5 : Has A3 and A6 within ϵ. - A6 : Has A3 and A5 within
ϵ. - A8 : Has A1 and A4 within ϵ.
Thus, the core points are A1 , A3 , A4 , A5 , A6 , and A8 .
√
Form Clusters with ϵ = 10
Using the core points, we form clusters as follows:
1. Start with core point A1 : - A4 and A8 are within distance ϵ. - Form a
cluster: {A1 , A4 , A8 }.
2. Start with core point A3 : - A5 and A6 are within distance ϵ. - Form
another cluster: {A3 , A5 , A6 }.
3. Points A2 and A7 are not connected to any cluster and remain as noise.
√
Result for ϵ = 10
√
With ϵ = 10 and min samples = 2: - We have two clusters: - Cluster 1:
{A1 , A4 , A8 } - Cluster 2: {A3 , A5 , A6 } - The points A2 and A7 are noise.