Machine Learning Lab Manual
Course Objectives:
1. To introduce students to the basic concepts and techniques of Machine Learning.
2. To develop skills in using recent machine learning software for solving practical problems.
Course Outcomes:
List of Experiments:
1. Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training data
from a .CSV file.
2. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of the set
of all hypotheses consistent with the training examples.
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge
to classify a new sample.
6. Write a program to implement the naïve Bayesian classifier for a sample training
data set stored as a .CSV file.
7. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to write
the program. Calculate the accuracy, precision, and recall for your data set.
8. Write a program to construct a Bayesian network considering medical data.
Use this model to demonstrate the diagnosis of heart patients using standard
Heart Disease Data Set. You can use Java/Python ML library classes/API.
10. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions.
11. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs.
Exercise-1: Implement and demonstrate the FIND-S algorithm for finding the most
specific hypothesis based on a given set of training data samples. Read the training data
from a .CSV file.
Program Code:
import csv

with open('tennis.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)

# the most specific hypothesis: every attribute fully constrained ('0')
h = [['0', '0', '0', '0', '0', '0']]

for i in your_list:
    print(i)
    if i[-1] == "True":  # consider only positive examples
        j = 0
        for x in i:
            if x != "True":
                if x != h[0][j] and h[0][j] == '0':
                    h[0][j] = x   # first positive example: copy the attribute value
                elif x != h[0][j] and h[0][j] != '0':
                    h[0][j] = '?' # conflicting values: generalize this attribute
            j = j + 1

print("Most specific hypothesis is")
print(h)
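The program expects a tennis.csv file in the working directory. A minimal example, using the EnjoySport data that also appears in Exercise-2, with the target value True/False in the last column (illustrative contents; any file with the same layout works):

sunny,warm,normal,strong,warm,same,True
sunny,warm,high,strong,warm,same,True
rainy,cold,high,strong,warm,change,False
sunny,warm,high,strong,cool,change,True

On this file FIND-S should report [['sunny', 'warm', '?', 'strong', '?', '?']] as the most specific hypothesis.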
Output
Exercise-2: For a given set of training data examples stored in a .CSV file, implement
and demonstrate the Candidate-Elimination algorithm to output a description of the set
of all hypotheses consistent with the training examples.
Program Code:
class Holder:
    factors = {}     # maps each attribute to the tuple of values it can take
    attributes = ()  # attribute names

    '''
    Constructor of class Holder holding two parameters,
    self refers to the instance of the class
    '''
    def __init__(self, attr):
        self.attributes = attr
        for i in attr:
            self.factors[i] = []

    def add_values(self, factor, values):
        self.factors[factor] = values


class CandidateElimination:
    Positive = {}  # Initialize positive empty dictionary
    Negative = {}  # Initialize negative empty dictionary

    def __init__(self, data, fact):
        self.num_factors = len(data[0][0])
        self.factors = fact.factors
        self.attr = fact.attributes
        self.dataset = data

    def run_algorithm(self):
        '''
        Iterate over the dataset, keeping the specific boundary S and the
        general boundary G consistent with every trial_set seen so far
        '''
        G = self.initializeG()
        S = self.initializeS()
        for trial_set in self.dataset:
            if self.is_positive(trial_set):  # if the trial_set is a positive example
                G = self.remove_inconsistent_G(G, trial_set[0])  # remove inconsistent data from the general boundary
                S_new = S[:]  # copy of the specific boundary
                print(S_new)
                for s in S:
                    if not self.consistent(s, trial_set[0]):
                        S_new.remove(s)
                        generalization = self.generalize_inconsistent_S(s, trial_set[0])
                        generalization = self.get_general(generalization, G)
                        if generalization:
                            S_new.append(generalization)
                S = S_new[:]
                S = self.remove_more_general(S)
                print(S)
            else:  # if it is negative (reconstructed to mirror the positive case)
                S = self.remove_inconsistent_S(S, trial_set[0])  # remove inconsistent data from the specific boundary
                G_new = G[:]  # copy of the general boundary
                print(G_new)
                for g in G:
                    if self.consistent(g, trial_set[0]):
                        G_new.remove(g)
                        specializations = self.specialize_inconsistent_G(g, trial_set[0])
                        specializations = self.get_specific(specializations, S)
                        if specializations != []:
                            G_new += specializations
                G = G_new[:]
                G = self.remove_more_specific(G)
                print(G)
    def initializeS(self):
        ''' Initialize the specific boundary '''
        S = tuple(['-' for factor in range(self.num_factors)])  # 6 constraints in the vector
        return [S]

    def initializeG(self):
        ''' Initialize the general boundary '''
        G = tuple(['?' for factor in range(self.num_factors)])  # 6 constraints in the vector
        return [G]

    def is_positive(self, trial_set):
        ''' Check if a given training trial_set is positive '''
        if trial_set[1] == 'Y':
            return True
        elif trial_set[1] == 'N':
            return False
        else:
            raise TypeError("invalid target value")
    def match_factor(self, value1, value2):
        ''' Check whether two factor values match; needed while checking the
        consistency of a training trial_set with a hypothesis '''
        if value1 == '?' or value2 == '?':
            return True
        elif value1 == value2:
            return True
        return False

    def consistent(self, hypothesis, instance):
        ''' Check whether the instance is part of the hypothesis '''
        for i, factor in enumerate(hypothesis):
            if not self.match_factor(factor, instance[i]):
                return False
        return True
    def remove_inconsistent_G(self, hypotheses, instance):
        ''' For a positive trial_set, the hypotheses in G
        inconsistent with it should be removed '''
        G_new = hypotheses[:]
        for g in hypotheses:
            if not self.consistent(g, instance):
                G_new.remove(g)
        return G_new

    def remove_inconsistent_S(self, hypotheses, instance):
        ''' For a negative trial_set, the hypotheses in S
        consistent with it should be removed '''
        S_new = hypotheses[:]
        for s in hypotheses:
            if self.consistent(s, instance):
                S_new.remove(s)
        return S_new

    def remove_more_general(self, hypotheses):
        ''' After generalizing S for a positive trial_set, any hypothesis in S
        more general than another in S should be removed '''
        S_new = hypotheses[:]
        for old in hypotheses:
            for new in S_new:
                if old != new and self.more_general(new, old):
                    S_new.remove(new)
        return S_new

    def remove_more_specific(self, hypotheses):
        ''' After specializing G for a negative trial_set, any hypothesis in G
        more specific than another in G should be removed '''
        G_new = hypotheses[:]
        for old in hypotheses:
            for new in G_new:
                if old != new and self.more_specific(new, old):
                    G_new.remove(new)
        return G_new
    def generalize_inconsistent_S(self, hypothesis, instance):
        ''' When an inconsistent hypothesis for a positive trial_set is seen in the
        specific boundary S, it should be generalized to be consistent with the
        trial_set ... we will get one hypothesis '''
        hypo = list(hypothesis)  # convert tuple to list for mutability
        for i, factor in enumerate(hypo):
            if factor == '-':
                hypo[i] = instance[i]
            elif not self.match_factor(factor, instance[i]):
                hypo[i] = '?'
        generalization = tuple(hypo)  # convert list back to tuple for immutability
        return generalization

    def specialize_inconsistent_G(self, hypothesis, instance):
        ''' When an inconsistent hypothesis for a negative trial_set is seen in the
        general boundary G, it should be specialized to be consistent with the
        trial_set ... we will get a set of hypotheses '''
        specializations = []
        hypo = list(hypothesis)  # convert tuple to list for mutability
        for i, factor in enumerate(hypo):
            if factor == '?':
                values = self.factors[self.attr[i]]
                for j in values:
                    if instance[i] != j:
                        hyp = hypo[:]
                        hyp[i] = j
                        hyp = tuple(hyp)  # convert list back to tuple for immutability
                        specializations.append(hyp)
        return specializations
    def get_general(self, generalization, G):
        ''' Checks if there is a more general hypothesis in G for a
        generalization of an inconsistent hypothesis in S in case of a
        positive trial_set, and returns the valid generalization '''
        for g in G:
            if self.more_general(g, generalization):
                return generalization
        return None

    def get_specific(self, specializations, S):
        ''' Checks if there is a more specific hypothesis in S for each
        hypothesis in the specializations of an inconsistent hypothesis in G
        in case of a negative trial_set, and returns the valid specializations '''
        valid_specializations = []
        for hypo in specializations:
            for s in S:
                if self.more_specific(s, hypo) or s == self.initializeS()[0]:
                    valid_specializations.append(hypo)
        return valid_specializations

    def exists_general(self, hypothesis, G):
        ''' Check if there exists a more general hypothesis in the
        general boundary of the version space '''
        for g in G:
            if self.more_general(g, hypothesis):
                return True
        return False

    def exists_specific(self, hypothesis, S):
        ''' Check if there exists a more specific hypothesis in the
        specific boundary of the version space '''
        for s in S:
            if self.more_specific(s, hypothesis):
                return True
        return False
    def more_general(self, hyp1, hyp2):
        ''' Check whether hyp1 is more general than hyp2 '''
        for i, j in zip(hyp1, hyp2):
            if i == '?':
                continue
            elif j == '?':
                return False
            elif i != j:
                return False
        return True

    def more_specific(self, hyp1, hyp2):
        ''' Check whether hyp1 is more specific than hyp2 '''
        return self.more_general(hyp2, hyp1)
dataset = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'), 'Y'),
    (('sunny', 'warm', 'high', 'strong', 'warm', 'same'), 'Y'),
    (('rainy', 'cold', 'high', 'strong', 'warm', 'change'), 'N'),
    (('sunny', 'warm', 'high', 'strong', 'cool', 'change'), 'Y')
]

attributes = ('Sky', 'Temp', 'Humidity', 'Wind', 'Water', 'Forecast')
f = Holder(attributes)
f.add_values('Sky', ('sunny', 'rainy', 'cloudy'))  # Sky can be sunny, rainy or cloudy
f.add_values('Temp', ('cold', 'warm'))             # Temp can be cold or warm
f.add_values('Humidity', ('normal', 'high'))       # Humidity can be normal or high
f.add_values('Wind', ('weak', 'strong'))           # Wind can be weak or strong
f.add_values('Water', ('warm', 'cold'))            # Water can be warm or cold
f.add_values('Forecast', ('same', 'change'))       # Forecast can be same or change

a = CandidateElimination(dataset, f)  # pass the dataset and attribute values to the algorithm class
a.run_algorithm()                     # run the Candidate-Elimination algorithm
Output
[('sunny', 'warm', 'normal', 'strong', 'warm', 'same')]
[('sunny', 'warm', 'normal', 'strong', 'warm','same')]
[('sunny', 'warm', '?', 'strong', 'warm', 'same')]
[('?', '?', '?', '?', '?', '?')]
[('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?'), ('?', '?', '?', '?', '?', 'same')]
[('sunny', 'warm', '?', 'strong', 'warm', 'same')]
[('sunny', 'warm', '?', 'strong', '?', '?')]
[('sunny', 'warm', '?', 'strong', '?', '?')]
[('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]
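Reading the trace: the earlier lines show the specific boundary S being generalized as the positive examples arrive, the middle lines show the general boundary G being specialized by the negative example, and the last two lines give the learned version space, bounded by S = [('sunny', 'warm', '?', 'strong', '?', '?')] and G = [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]. Every hypothesis between these boundaries is consistent with all four training examples.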
Exercise-3: Write a program to demonstrate the working of the decision tree based
ID3 algorithm. Use an appropriate data set for building the decision tree and apply this
knowledge to classify a new sample.
Program Code:
import numpy as np
import math
from data_loader import read_data

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    ''' Split the data into one sub-table per distinct value of the given column '''
    dic = {}
    items = np.unique(data[:, col])
    for x in range(items.shape[0]):
        count = np.sum(data[:, col] == items[x])
        dic[items[x]] = np.empty((int(count), data.shape[1]), dtype="|S32")
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dic[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dic[items[x]] = np.delete(dic[items[x]], col, 1)
    return items, dic

def entropy(S):
    ''' Entropy(S) = -sum over classes of p * log2(p) '''
    items = np.unique(S)
    if items.size == 1:  # pure node
        return 0
    sums = 0
    for x in range(items.shape[0]):
        ratio = np.sum(S == items[x]) / (S.size * 1.0)
        sums += -ratio * math.log(ratio, 2)
    return sums

def gain_ratio(data, col):
    ''' Information gain of the column divided by its intrinsic value '''
    items, dic = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros((items.shape[0], 1))
    intrinsic = np.zeros((items.shape[0], 1))
    for x in range(items.shape[0]):
        ratio = dic[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(dic[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1]) - np.sum(entropies)
    iv = -1 * np.sum(intrinsic)
    return total_entropy / iv

def create_node(data, metadata):
    if (np.unique(data[:, -1])).shape[0] == 1:  # all examples share one class
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node
    gains = np.zeros((data.shape[1] - 1, 1))
    for col in range(data.shape[1] - 1):
        gains[col] = gain_ratio(data, col)
    split = np.argmax(gains)  # attribute with the highest gain ratio
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, dic = subtables(data, split, delete=True)
    for x in range(items.shape[0]):
        child = create_node(dic[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    ''' Indentation string used when printing the tree '''
    s = ""
    for x in range(size):
        s += "   "
    return s
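The listing defines empty() for indentation but omits the driver that builds the tree and prints the output shown below; a minimal sketch, assuming the training file is saved as tennis.csv and read with the read_data helper from Data_loader.py:

def print_tree(node, level):
    # a leaf stores its class label in node.answer
    if node.answer != "":
        print(empty(level) + str(node.answer))
        return
    print(empty(level) + str(node.attribute))
    for value, child in node.children:
        print(empty(level + 1) + str(value))
        print_tree(child, level + 2)

metadata, traindata = read_data("tennis.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)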
Data_loader.py
import csv

def read_data(filename):
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)
    return (metadata, traindata)
Tennis.csv
outlook,temperature,humidity,wind,answer
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no
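As a quick sanity check on the entropy function: this data set has 9 yes and 5 no examples, so the entropy at the root is -(9/14)log2(9/14) - (5/14)log2(5/14) ≈ 0.940 bits.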
Output
outlook
overcast
b'yes'
rain
wind
b'strong'
b'no'
b'weak'
b'yes'
sunny
humidity
b'high'
b'no'
b'normal'
b'yes'
Exercise-4: Build an Artificial Neural Network by implementing the Backpropagation
algorithm and test the same using appropriate data sets.
Program Code:
import numpy as np
#Sigmoid Function
def sigmoid (x):
return 1/(1 + np.exp(-x))
#Derivative of Sigmoid Function (x is assumed to already be a sigmoid activation)
def derivatives_sigmoid(x):
    return x * (1 - x)
#Training data from the table below: (sleep, study) hours and expected exam score
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)  # normalize each input feature to [0, 1]
y = y / 100                 # scale the target to [0, 1]
#Variable initialization
epoch=5 #Setting training iterations
lr=0.1 #Setting learning rate
inputlayer_neurons = 2 #number of features in data set
hiddenlayer_neurons = 3 #number of neurons in the hidden layer
output_neurons = 1 #number of neurons at output layer
#weight and bias initialization
wh=np.random.uniform(size=(inputlayer_neurons,hiddenlayer_neurons))
bh=np.random.uniform(size=(1,hiddenlayer_neurons))
wout=np.random.uniform(size=(hiddenlayer_neurons,output_neurons))
bout=np.random.uniform(size=(1,output_neurons))
for i in range(epoch):
    print("-----------Epoch-", i + 1, "Starts----------")
    #Forward Propagation
    hinp = np.dot(X, wh) + bh
    hlayer_act = sigmoid(hinp)           # hidden layer activations
    outinp = np.dot(hlayer_act, wout) + bout
    output = sigmoid(outinp)             # predicted output
    #Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act) #how much hidden layer wts contributed to error
    d_hiddenlayer = EH * hiddengrad
    #Weight updates
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    print("Input:\n" + str(X))
    print("Actual Output:\n" + str(y))
    print("Predicted Output:\n", output)
    print("-----------Epoch-", i + 1, "Ends----------\n")
Training data (the inputs are normalized before training):

Example   Sleep   Study   Expected % in Exams
1         2       9       92
2         1       5       86
3         3       6       89
Output:
-----------Epoch- 1 Starts----------
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.81951208]
[0.8007242 ]
[0.82485744]]
-----------Epoch- 1 Ends----------
Exercise-6: Write a program to implement the naïve Bayesian classifier for a sample
training data set stored as a .CSV file.
Program Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

if __name__ == "__main__":
    # Change the file path to the location of your CSV file
    file_path = "your_dataset.csv"
    data = pd.read_csv(file_path)
    # the last column is assumed to hold the class label, the rest numeric features
    X, y = data.iloc[:, :-1], data.iloc[:, -1]
    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.2, random_state=42)
    classifier = GaussianNB().fit(Xtrain, ytrain)  # Gaussian naive Bayes (assumed)
    accuracy = accuracy_score(ytest, classifier.predict(Xtest))
    print(f"Accuracy: {accuracy}")
Output:
Accuracy: 0.85
Exercise-7: Assuming a set of documents that need to be classified, use the naïve
Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to
write the program. Calculate the accuracy, precision, and recall for your data set.
Program Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics
msg = pd.read_csv('naivetext.csv', names=['message', 'label'])
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})  # assuming 'pos'/'neg' labels in the file
X = msg.message
y = msg.labelnum
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)

# turn each document into a bag-of-words count vector
count_vect = CountVectorizer()
Xtrain_dtm = count_vect.fit_transform(Xtrain)
Xtest_dtm = count_vect.transform(Xtest)

clf = MultinomialNB().fit(Xtrain_dtm, ytrain)
predicted = clf.predict(Xtest_dtm)

print('Accuracy:', metrics.accuracy_score(ytest, predicted))
print('Precision:', metrics.precision_score(ytest, predicted))
print('Recall:', metrics.recall_score(ytest, predicted))
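The program expects a naivetext.csv with one document per row and its label in the second column; a small illustrative example (hypothetical contents; the labels must be 'pos' or 'neg' for the mapping above):

I love this sandwich,pos
This is an amazing place,pos
I am tired of this stuff,neg
I can't deal with this,neg
He is my sworn enemy,neg
This is an awesome view,pos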
data = pd.read_csv("ds4.csv")
heart_disease = pd.DataFrame(data)
print(heart_disease)
model = BayesianModel([
    ('age', 'Lifestyle'),
    ('Gender', 'Lifestyle'),
    ('Family', 'heartdisease'),
    ('Lifestyle', 'diet'),
    ('diet', 'cholestrol'),
    ('cholestrol', 'heartdisease')
])
model.fit(heart_disease, estimator=MaximumLikelihoodEstimator)
HeartDisease_infer = VariableElimination(model)
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={
'age': int(input('Enter Age: ')),
'Gender': int(input('Enter Gender: ')),
'Family': int(input('Enter Family History: ')),
'diet': int(input('Enter Diet: ')),
'Lifestyle': int(input('Enter Lifestyle: ')),
'cholestrol': int(input('Enter Cholestrol: '))
})
print(q)
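Note that the query reads every evidence value as an integer code (for example, Gender entered as 0 or 1), so the values typed at the prompts must follow the same numeric encoding used in ds4.csv.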
Output:
Exercise-10: Write a program to implement k-Nearest Neighbour algorithm to classify
the iris data set. Print both correct and wrong predictions.
Program Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# load the iris data (assuming an iris.csv without a header row)
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv("iris.csv", names=names)
X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]
print(X.head())

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.10)
classifier = KNeighborsClassifier(n_neighbors=5).fit(Xtrain, ytrain)  # k = 5 (assumed)
ypred = classifier.predict(Xtest)
i=0
print ("\n-------------------------------------------------------------------------")
print ('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print ("-------------------------------------------------------------------------")
for label in ytest:
print ('%-25s %-25s' % (label, ypred[i]), end="")
if (label == ypred[i]):
print (' %-25s' % ('Correct'))
else:
print (' %-25s' % ('Wrong'))
i=i+1
print ("-------------------------------------------------------------------------")
print("\nConfusion Matrix:\n",metrics.confusion_matrix(ytest, ypred))
print ("-------------------------------------------------------------------------")
print("\nClassification Report:\n",metrics.classification_report(ytest, ypred))
print ("-------------------------------------------------------------------------")
print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
print ("-------------------------------------------------------------------------")
Output
sepal-length sepal-width petal-length petal-width
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
-------------------------------------------------------------------------
Original Label Predicted Label Correct/Wrong
-------------------------------------------------------------------------
Iris-versicolor Iris-versicolor Correct
Iris-virginica Iris-versicolor Wrong
Iris-virginica Iris-virginica Correct
Iris-versicolor Iris-versicolor Correct
Iris-setosa Iris-setosa Correct
Iris-versicolor Iris-versicolor Correct
Iris-setosa Iris-setosa Correct
Iris-setosa Iris-setosa Correct
Iris-virginica Iris-virginica Correct
Iris-virginica Iris-versicolor Wrong
Iris-virginica Iris-virginica Correct
Iris-setosa Iris-setosa Correct
Iris-virginica Iris-virginica Correct
Iris-virginica Iris-virginica Correct
Iris-versicolor Iris-versicolor Correct
-------------------------------------------------------------------------
Confusion Matrix:
[[4 0 0]
[0 4 0]
[0 2 5]]
-------------------------------------------------------------------------
Classification Report:
precision recall f1-score support
-------------------------------------------------------------------------
Accuracy of the classifier is 0.87
-------------------------------------------------------------------------
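Reading the output: the confusion matrix rows correspond to the actual classes (4 setosa, 4 versicolor and 7 virginica samples in the test split); the only errors are the 2 virginica samples predicted as versicolor, so 13 of the 15 predictions are correct, which matches the reported accuracy of 0.87.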