Assignment 2
Science________________________________________________
Session: Spring-2024 Course Instructor: ___Shoukat Ali____________________
Subject: __Machine Learning__ Course Code: __________ Max. Marks: ___5__
Class/Sec.: 8-C Submission Date: 06/15/24 Time Duration: () From: ____ to ______
Assignment 02
Apply the following machine learning classifiers to the Pima Indians Diabetes dataset to predict whether
the patients in the dataset have diabetes.
1. Logistic regression
2. Decision tree
3. Random forest
4. Naive Bayes
5. KNN
6. SVM
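import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
data = pd.read_csv('diabetes.csv')
X = data.drop('Outcome', axis=1)
y = data['Outcome']
# Split data (the 80/20 split and random_state are assumed; the original omits them)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# One pipeline per classifier; only the first is shown (assumed settings), add the other five the same way
pipelines = {'Logistic Regression': make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))}
for name, pipeline in pipelines.items():
    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    print(f'\n{name}: accuracy = {(y_pred == y_test).mean():.3f}')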
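A fuller sketch of the comparison, covering all six classifiers listed in the assignment, could look like the following. It is an illustration under assumptions, not the required solution: the helper names `build_pipelines` and `evaluate_classifiers`, the stratified 80/20 split, the seed, and every hyperparameter are choices made here. A synthetic dataset from `make_classification` stands in for `diabetes.csv` so that the block runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_pipelines(seed=42):
    """Return one scikit-learn pipeline per classifier (hyperparameters are assumptions)."""
    return {
        # Scale-sensitive models get a StandardScaler step; tree-based models do not need one.
        'Logistic Regression': make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        'Decision Tree': make_pipeline(DecisionTreeClassifier(random_state=seed)),
        'Random Forest': make_pipeline(RandomForestClassifier(n_estimators=100, random_state=seed)),
        'Naive Bayes': make_pipeline(GaussianNB()),
        'KNN': make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        'SVM': make_pipeline(StandardScaler(), SVC(kernel='rbf')),
    }


def evaluate_classifiers(X, y, seed=42):
    """Fit every pipeline on a stratified 80/20 split and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    scores = {}
    for name, pipeline in build_pipelines(seed).items():
        pipeline.fit(X_train, y_train)
        y_pred = pipeline.predict(X_test)
        scores[name] = accuracy_score(y_test, y_pred)
    return scores


# Synthetic stand-in with the same shape as the Pima data (768 rows, 8 features).
X_demo, y_demo = make_classification(n_samples=768, n_features=8, random_state=0)
demo_scores = evaluate_classifiers(X_demo, y_demo)
for name, score in demo_scores.items():
    print(f'\n{name}:')
    print(f'Accuracy: {score:.3f}')
```

To run it on the assignment data, pass the `X` and `y` loaded from `diabetes.csv` to `evaluate_classifiers` in place of the synthetic arrays.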
Study:
| Characteristic | Logistic Regression | Decision Tree | Random Forest | Naive Bayes | K-Nearest Neighbors | Support Vector Machine |
|---|---|---|---|---|---|---|
| Winning Qualities | Simple, interpretable, handles linearly separable data, efficient with large datasets, benefits from feature scaling/outlier handling | Interpretable, visualizes decision rules, handles non-linear relationships, useful for feature selection | Powerful, accurate, less prone to overfitting, handles non-linear relationships, works well with standardized features | Simple, fast, handles high-dimensional data, good for categorical features | Non-parametric, simple, can learn complex decision boundaries, works well with standardized features | Effective in high-dimensional spaces, flexible kernel choice, works well with standardized features |
| Areas for Improvement | Assumes linear relationships, less accurate with complex decision boundaries | Prone to overfitting, sensitive to small data changes, may not generalize well | Less interpretable, computationally demanding | Assumes feature independence, sensitive to data distribution | Computationally expensive for large datasets, requires careful tuning of k, sensitive to irrelevant features | Sensitive to hyperparameters, less interpretable, computationally demanding with large datasets |
| Performance on Pima Indians | 75-80% | 70-75% | 75-82% | 70-75% | 72-78% | 75-82% |
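The table notes that KNN requires careful tuning of k. One way to explore this is to compare the mean cross-validated accuracy for several candidate values, as in the sketch below. The helper name `knn_accuracy_by_k`, the candidate values of k, and the 5-fold setting are assumptions made for illustration, and the synthetic data from `make_classification` again stands in for the Pima data, so the printed numbers say nothing about the ranges in the table.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def knn_accuracy_by_k(X, y, k_values=(1, 3, 5, 7, 9, 11, 15), folds=5):
    """Return the mean cross-validated accuracy for each candidate k."""
    results = {}
    for k in k_values:
        # Scaling sits inside the pipeline so each fold is standardized on its own training part.
        model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
        fold_scores = cross_val_score(model, X, y, cv=folds, scoring='accuracy')
        results[k] = fold_scores.mean()
    return results


# Synthetic stand-in with the same shape as the Pima data (768 rows, 8 features).
X_demo, y_demo = make_classification(n_samples=768, n_features=8, random_state=0)
k_scores = knn_accuracy_by_k(X_demo, y_demo)
best_k = max(k_scores, key=k_scores.get)
for k, score in k_scores.items():
    print(f'k={k:2d}: mean accuracy {score:.3f}')
print(f'Best k on this synthetic data: {best_k}')
```

Keeping the scaler inside the pipeline avoids leaking information from the validation fold into the standardization step.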
_____________________________________________________________________________________
BEST OF LUCK