Assignment 2.1

Uploaded by

ARJU Zerin

Assignment 2.1 (Due Date: 21/10/2021)

Hierarchical clustering:

1. We have 2-dimensional iris data for 20 flowers here, that is, 20 2-dimensional points.
a. Use hierarchical clustering to form 3 clusters: stop merging as soon as three clusters
remain. (hand calculation)
b. Draw the tree with pen and paper, showing how you combine the points to create the three clusters.
(example: fig 7.6, page 262, MMDS book)
2. Write a small Python program that does Q1.
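
One possible shape for the program in Q2 is sketched below. It is only an illustration under stated assumptions: the handout does not say which linkage to use, so the sketch merges the two clusters whose centroids are closest in Euclidean distance, and the points in `demo_points` are made-up placeholders, not the 20 iris points. The function names are hypothetical.

```python
import math

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def hierarchical(points, k=3):
    """Merge the two clusters with the closest centroids until k remain.

    Assumption: centroid distance is the merge rule (not given in the handout).
    Returns (clusters, merges); merges lists each combination in order,
    which is the information needed to draw the tree by hand.
    """
    clusters = [[p] for p in points]  # start: every point is its own cluster
    merges = []
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        del clusters[j]  # delete j first (j > i) so index i stays valid
        del clusters[i]
        clusters.append(merged)
    return clusters, merges

# Placeholder data, NOT the assignment's iris points.
demo_points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 5.2),
               (9.0, 1.0), (9.2, 1.1), (1.1, 1.3)]
demo_clusters, demo_merges = hierarchical(demo_points, k=3)
for left, right, dist in demo_merges:
    print(f"merge {left} + {right} at centroid distance {dist:.3f}")
```

Printing the merges in order gives the sequence of combinations to draw in the tree for Q1b; a different linkage rule (single, complete, average) would only change how `d` is computed.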

Kmeans clustering

Download dataset from here. Use data0.txt for your data file.

Data Description:

1. Each line contains one data point.

2. The first column is the point index. Point indices are not used in the calculation; they are for
identification only.
3. After the index, each line provides 50 features (50-dimensional data) as comma-separated
values.
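
The format described above could be read with a small loader like the following. This is a sketch, not the provided skeleton: the function names are hypothetical, and it assumes the index is separated from the first feature by a comma just like the features are.

```python
def parse_points(lines, n_features=50):
    """Split each line into a point index and a list of float features."""
    indices, features = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        values = [float(v) for v in fields[1:]]
        if len(values) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(values)}")
        indices.append(fields[0].strip())  # kept for identification only
        features.append(values)
    return indices, features

def load_data(path="data0.txt", n_features=50):
    """Read the data file named in the handout."""
    with open(path) as fh:
        return parse_points(fh, n_features)
```

Keeping the indices in a separate list makes it easy to leave them out of every distance calculation while still writing them next to the cluster numbers later.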

Task: TODO

1. You have to implement the k-means clustering algorithm on these data.

2. A sample implementation file is provided. Please download it. You can use a notebook or a plain
Python script to complete the TODOs. The implementation file is a skeleton: you need to fill in
your own code blocks where asked.
3. First, the data are loaded.
4. A fixed number of points are sampled randomly from the above data.
a. Inside the kmeans function, use initialize_centroids_simple() to initialize your centroids.
This is the simple initialization function you need to implement: randomly select K points
from the sampled data and use them as the initial centroids.
b. Then, in the kmeans function, write your own code to count the number of
points assigned to each cluster and store the counts in the defined structure.
c. Then write your own code to terminate the process based on the termination criterion
discussed in class. We evaluate the quality of the clustering using the clustering
objective

$$J = \frac{1}{N} \sum_{i=1}^{N} \min_{k} \lVert x_i - z_k \rVert^2$$

where $N$ is the total number of sampled points, $x_i$ is the $i$-th data point, and $z_k$ is the centroid
of the $k$-th cluster. The algorithm terminates when $J$ is nearly equal in two successive
iterations (e.g., we terminate when $|J - J_{\text{prev}}| \le 10^{-5} J$, where $J_{\text{prev}}$ is the value of $J$ after
the previous iteration).

d. In the main function, save your outputs.

i. Write all final centroids to the out1 file, one line per centroid, with features
separated by commas.
ii. Write the cluster assignment for each point, one per line, as: point index,
cluster number
5. Now use initialize_centroids() to initialize the centroids in the following way.
a. Calculate the max and min feature value for each dimension.
b. Compute diff = max - min for each dimension.
c. For each centroid j and each dimension i, assign centroids[j][i] = min_feature_val + diff *
random.uniform(1e-5, 1)

6. Then do the same thing as 4b, 4c, and 4d.

7. Compare the outputs for 4d and 5.
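
Steps 4 to 6 can be sketched roughly as follows. This is not the provided skeleton, whose structure is not shown here: the signatures, the `init`, `tol` and `max_iter` parameters, and the two `format_*` helpers are hypothetical. The sketch assumes J is the mean squared distance from each point to its nearest centroid, and that a cluster with no assigned points keeps its previous centroid; neither choice is stated in the handout.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def initialize_centroids_simple(data, k):
    """Step 4a: use k randomly chosen sampled points as initial centroids."""
    return [list(p) for p in random.sample(data, k)]

def initialize_centroids(data, k):
    """Step 5: centroids[j][i] = min_i + diff_i * random.uniform(1e-5, 1)."""
    dims = len(data[0])
    mins = [min(p[i] for p in data) for i in range(dims)]
    diffs = [max(p[i] for p in data) - mins[i] for i in range(dims)]
    return [[mins[i] + diffs[i] * random.uniform(1e-5, 1) for i in range(dims)]
            for _ in range(k)]

def kmeans(data, k, init=initialize_centroids_simple, tol=1e-5, max_iter=100):
    centroids = init(data, k)
    j_prev = None
    for _ in range(max_iter):
        labels, counts, total = [], [0] * k, 0.0
        for p in data:
            dists = [sq_dist(p, z) for z in centroids]
            c = dists.index(min(dists))  # nearest centroid
            labels.append(c)
            counts[c] += 1               # step 4b: points per cluster
            total += dists[c]
        j = total / len(data)            # assumed objective J
        for c in range(k):
            if counts[c]:                # empty cluster: keep old centroid
                members = [p for p, lab in zip(data, labels) if lab == c]
                centroids[c] = [sum(col) / counts[c] for col in zip(*members)]
        # step 4c: stop when J barely changes between iterations
        if j_prev is not None and abs(j - j_prev) <= tol * j:
            break
        j_prev = j
    return centroids, labels, counts, j

def format_centroids(centroids):
    """Step 4d.i: one comma-separated line per centroid."""
    return [",".join(str(v) for v in z) for z in centroids]

def format_assignments(indices, labels):
    """Step 4d.ii: one 'point index,cluster number' line per point."""
    return [f"{idx},{c}" for idx, c in zip(indices, labels)]
```

Calling `kmeans(data, k)` and then `kmeans(data, k, init=initialize_centroids)` on the same sample gives the two results to compare in step 7; the lines returned by `format_centroids` can be written to the out1 file with `"\n".join(...)`.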
