0% found this document useful (0 votes)

11 views22 pages

Report - Mini ProjectFINAL

mmm

Uploaded by

Hrithick M

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views22 pages

Report - Mini ProjectFINAL

mmm

Uploaded by

Hrithick M

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 22

HEART DISEASE DETECTION USING ML

PYTHON PROJECT REPORT

Submitted by

HRITHICK RAM.M [2303722810621043]

BACHELOR OF ENGINEERING

in
COMPUTERAND COMMUNICATION
ENGINEERING

SRI ESHWAR COLLEGE OF ENGINEERING

(AN AUTONOMOUS INSTITUTION)

COIMBATORE – 641 202

JUNE – JULY 2024

1
BONAFIDE CERTIFICATE

Certified that this project report “HEART DISEASE DETECTION

USING ML” is the bonafide work of
HRITHICK RAM.M [2303722810621043]

who carried out the project work under my supervision

…………………………………
SIGNATURE
Dr.V.Kiruthika,ME.,MBA., Ph.D.,
Assistant Professor
Department of Electronics and
Communication
Engineering,
Sri Eshwar College of Engineering,
Coimbatore -641202

CHAPTER TITLE PAGE NO

2
1 INTRODUCTION 4

2 PROBLEM DESCRPTION 5

3 OBJECTIVE 5

4 SOFTWARE SPECIFICATION 6

5 METHODOLOGY 7

6 IMPLEMENTATION 8

7 RESULT 11

8 CONCLUSION 19

9 FUTURE SCOPE 20

TABLE OF CONTENTS

INTRODUCTION

3
Heart disease, encompassing a broad spectrum of cardiovascular conditions,
remains the leading cause of death globally. Conditions such as coronary artery
disease, heart failure, arrhythmias, and congenital heart defects significantly
impact the quality of life and pose serious health risks. Early detection and
timely intervention are critical in mitigating these risks, managing symptoms,
and improving patient outcomes. However, traditional diagnostic methods often
involve invasive procedures, are time-consuming, and may not always be
accessible or affordable for everyone.

The advent of machine learning has revolutionized many fields, including

healthcare. Machine learning algorithms can analyze vast amounts of data,
identify complex patterns, and make predictions with high accuracy. In the
context of heart disease, machine learning offers a promising approach to
developing non-invasive, cost-effective, and reliable predictive models. These
models can assist healthcare professionals in early diagnosis, risk stratification,
and personalized treatment planning.

The motivation for this project stems from the need to leverage machine
learning to predict heart disease risk using readily available patient data. By
analyzing variables such as age, gender, cholesterol levels, blood pressure,
smoking habits, and other health indicators, machine learning models can
provide valuable insights and support clinical decision-making. The ultimate
goal is to develop a predictive tool that enhances early detection, reduces the
burden on healthcare systems, and improves patient outcomes.

4
PROBLEM DESCRPTION

Heart disease is influenced by a multitude of factors, including genetics,

lifestyle, and environmental conditions. Traditional diagnostic approaches,
while effective, have limitations. Methods like electrocardiograms (ECGs),
echocardiograms, stress tests, and angiographies are invasive, expensive, and
may not be suitable for regular screening of at-risk populations. Moreover, the
interpretation of these tests can vary among clinicians, leading to
inconsistencies in diagnosis and treatment.

The challenge lies in creating a predictive model that accurately identifies

individuals at risk of heart disease using non-invasive and readily available data.
This project aims to address this challenge by employing machine learning
techniques to analyze patient data and predict heart disease risk. The dataset for
this study includes variables such as age, gender, cholesterol levels, blood
pressure, blood sugar levels, smoking habits, and other relevant health
indicators.

The project seeks to develop a reliable and accurate predictive model that can
assist healthcare professionals in making informed decisions. By prioritizing
high-risk patients for further testing and intervention, the model can help reduce
the burden on healthcare systems, prevent the progression of heart disease, and
ultimately save lives. The focus is on creating a tool that is not only accurate but
also easy to use, ensuring it can be widely adopted in clinical practice.

OBJECTIVE

The primary objective of this project is to develop a machine learning-based

predictive model for heart disease. The specific objectives are as follows:

1. Data Collection and Preprocessing: Gather a comprehensive dataset

containing relevant health metrics and risk factors associated with heart
disease. This includes patient demographics, medical history, and
lifestyle factors. Preprocess the data to handle missing values, outliers,
and inconsistencies. Normalize or standardize the data and encode
categorical variables to prepare it for analysis.
2. Exploratory Data Analysis (EDA): Perform EDA to understand the
distribution of the data, identify patterns, and uncover relationships
between variables. Utilize visualization tools such as histograms, scatter
plots, and correlation matrices to gain insights into the data and identify
key features that influence heart disease risk.
5
3. Feature Selection: Identify and select the most relevant features that
significantly impact heart disease prediction. Apply techniques such as
correlation analysis, feature importance scores, and dimensionality
reduction methods (e.g., Principal Component Analysis) to reduce the
dimensionality of the data and improve model performance.
4. Model Development: Implement various machine learning algorithms,
including logistic regression, decision trees, random forests, support
vector machines, and neural networks. Train these models on the
preprocessed dataset and compare their performance. Utilize techniques
like cross-validation to ensure the robustness and generalizability of the
models.
5. Model Evaluation: Evaluate the performance of the developed models
using appropriate metrics such as accuracy, precision, recall, F1 score,
and ROC-AUC. Compare the models to determine the best-performing
one based on these metrics.
6. Model Optimization: Fine-tune the best-performing model to enhance its
predictive accuracy. Employ hyperparameter tuning techniques, such as
grid search or random search, to find the optimal set of parameters.
7. Deployment: Develop a user-friendly interface or application for the
predictive model. This interface should allow users to input health
metrics and receive predictions on heart disease risk. The goal is to create
a tool that can be easily integrated into clinical practice and used by
healthcare professionals to support decision-making.
8. Insights and Recommendations: Provide insights and recommendations
based on the model’s predictions. These insights can help healthcare
professionals in identifying high-risk patients, prioritizing further testing,
and making informed treatment decisions.

SOFTWARE SPECIFICATION

 JUPYTER NOTEBOOK
 PYTORCH / TENSORFLOW
 KERAS

METHODOLOGY

6
The methodology for this project involves a systematic approach to developing
a machine learning model for heart disease prediction. The following steps
outline the detailed methodology:

1. Data Collection: The first step involves gathering a comprehensive

dataset from reliable sources such as medical records, public health
databases, or clinical studies. The dataset should include variables
relevant to heart disease risk, such as age, gender, cholesterol levels,
blood pressure, blood sugar levels, smoking habits, physical activity, and
medical history. The quality and size of the dataset are critical for
building an accurate predictive model.
2. Data Preprocessing: Preprocessing the dataset is essential to ensure that
the data is clean and suitable for analysis. This step involves handling
missing values through imputation or deletion, addressing outliers, and
normalizing or standardizing numerical variables to ensure they are on a
comparable scale. Categorical variables are encoded using techniques like
one-hot encoding or label encoding. The processed data is then split into
training and testing sets to evaluate the model's performance.
3. Exploratory Data Analysis (EDA): EDA involves analyzing the dataset
to understand its distribution, uncover patterns, and identify relationships
between variables. Visualization tools like histograms, box plots, scatter
plots, and heatmaps are used to gain insights into the data. EDA helps in
identifying important features that influence heart disease risk and
informs the feature selection process.
4. Feature Selection: Feature selection involves identifying and selecting
the most relevant features that significantly impact heart disease
prediction. Techniques such as correlation analysis, feature importance
scores from tree-based models, and dimensionality reduction methods
(e.g., Principal Component Analysis) are applied to reduce the
dimensionality of the data. This step ensures that the model focuses on
the most informative features, improving its performance and
interpretability.
5. Model Development: Various machine learning algorithms are
implemented to build predictive models. These include logistic
regression, decision trees, random forests, support vector machines, and
neural networks. Each model is trained on the preprocessed dataset, and
hyperparameters are tuned to optimize performance. The models are
evaluated using cross-validation to ensure robustness and generalizability.

IMPLEMENTATION

7
import numpy as np
import pandas as pd
import matplob.pyplot as plt
import seaborn as sns

df = pd.read_csv(r”C:/Users\Hrithick\MRS\Heart_Disease Prediction.csv”)
df.describe().T
present = df[df[‘Heart Disease’]==1]
absent = df[df[‘Heart Disease’]==0]
present.shape, absent.shape
absent = absent.sample(present.sahpe[0])
absent.shape,present.shape
absent.head()
import statsmodel.api as sm
corrmat = df.corr()
fig = plt.figure(figsize = (10,9))
sns.heatmap(cormmat, vmax =.6, square = True )
plt.show()
sns.barplot(data = df, y =’Heart Disease”, x = ‘Sex’)

8
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandScaler PolynomialFeatures
x = np.array(df[‘Heart Disease’])
x = np.array(df.dro[(columns =’Heart Disease’))
y = np.array(df[Heart Disease])
scaler.fit(x)
x_scaled = scaler.transform(x)
x_train, x_test , y_train , y_test =train_test_split(x_scaled,y,train_size=0.8)
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(x_train,y_train)
yPred = rfc.predict(x_test)

from sklear.metrics import classification_report, accuracy_score

from sklearn.metrics import presicion_score,recall_score
frpm sklearn.metrics import f1_score, mathhew_correcoef
from sklear.metrics import confusion_matrix

9
n_outliners = len(present)
n_errors = (yPred != y_test).sum()
print(“The model used is Random Forest classifier”0
acc = accuracy_score(y_test,yPred)
print(“The acuuracy is {}”.format(acc))
prec = precision_score(y_test , yPred)
print(“The precision is {}”.format(prec))
rec = recall_score(y_test,yPred)
f1 = f1_score(y_test,yPred)
print(“The F1- Score is {}”.format(f1))

MCC = Matthews_corrcoef(y_test,yPred)
print(“The Mathhews correlation coefficient is {}”.format(MCC))
from sklearn.linear_model import LogisticsRegression
logreg = LogisticRegression()
y_Pred2=logreg.predict(x_test)
from sklearn.metrics import accuracy_score
print(“Accuracy of the model is =’, accurary_score(y_test,y_pred2))

Accuracy of the model is = 0.9074074074074074

From sklearn.linear_model import LinearRegression

linereg = LinearRegression()
linereg.fit(x_train,y_train)
y_predlin = linreg.predict(x_test)
accuracy = accuracy_score(y_test,y_pred,y_pred_class)
accuracy = accuracy_score(y-test, y_pred_class)
print(‘Accuracy of the linear regression model is =’,accuracy)
10
Accuracy of the Linear Regression model is = 0.888888888888888888
from sklearn.svm.import SVC
from sklearn.metrics import accuracy_score
svm_model = SVC()
svm_model.fit(x_train,y_train)
y_pred_svm = sv_model.predict(x_test)
accuracy = accuracy_score(y_test,y_pred_svm)
print(‘Accuracy of the SVM model is =’, accuracy)

Accuracy of thr SVM model is = 0.8888888888888888

RESULT

The performance of both models was evaluated using test set . The
following metrics were considered
Lineear Regression:
-Accuracy: 0.88
-Precision: 0.68
-Recall: 0.65
-F1 Score: 0.66

Random Forest Classifier

-Accuracy: 0.90
-Precision: 0.82
-Recall : 0.80
-F1 Score: 0.81

11
The Random Forest Classifier significally outperformed Linear Regression
in all metrics , demonstrating its ability to capture complex patterns in the
data.

EXPLANATION

12
Heart disease remains a leading cause of mortality worldwide, necessitating
effective early detection methods to mitigate its impact. Traditional diagnostic
techniques, though effective, often involve invasive procedures, substantial
costs, and require specialized equipment and expertise. These limitations
underscore the need for non-invasive, cost-effective, and accurate predictive
models that can be easily integrated into routine healthcare practices. This
project aims to leverage machine learning (ML) to develop a predictive model
for heart disease, utilizing readily available patient data to identify individuals at
risk and enable timely intervention.

Machine learning, a subset of artificial intelligence, involves algorithms that can

learn from and make predictions based on data. These algorithms excel at
detecting complex patterns and relationships within large datasets, which might
be imperceptible to human analysts. In the context of heart disease, ML can
analyze various health metrics and risk factors—such as age, gender, cholesterol
levels, blood pressure, blood sugar levels, smoking habits, and physical activity
—to predict the likelihood of heart disease.

The project begins with the collection and preprocessing of data. A

comprehensive dataset is essential, as the accuracy of the predictive model
depends heavily on the quality and diversity of the input data. Data is typically
sourced from medical records, public health databases, or clinical studies,
ensuring it covers a wide range of variables relevant to heart disease.
Preprocessing involves handling missing values, outliers, and inconsistencies
within the dataset. Techniques such as imputation are used to fill in missing
data, while outliers are addressed to prevent them from skewing the model's
predictions. Normalizing or standardizing numerical data ensures that all
variables are on a comparable scale, while categorical variables are encoded to
transform them into a format suitable for ML algorithms.

13
Exploratory Data Analysis (EDA) follows, providing a deeper understanding of
the dataset. EDA employs statistical tools and visualization techniques to
uncover patterns, trends, and relationships within the data. For instance,
histograms can show the distribution of age or cholesterol levels among
patients, while scatter plots can reveal correlations between blood pressure and
heart disease occurrence. Heatmaps can highlight the strength of relationships
between multiple variables. This step is crucial for identifying which features
are most relevant for predicting heart disease, guiding the feature selection
process.

Feature selection is a critical step where the most informative variables are
chosen for model development. Not all collected features may contribute
significantly to the prediction, and including irrelevant or redundant features
can reduce model performance. Techniques such as correlation analysis, feature
importance scores from tree-based models, and dimensionality reduction
methods like Principal Component Analysis (PCA) help in identifying and
retaining only the most relevant features. This step not only improves the
model's accuracy but also its interpretability, making it easier for healthcare
professionals to understand and trust the predictions.

With the relevant features selected, the next phase involves developing the
predictive model. Various machine learning algorithms are explored, each
offering different strengths and weaknesses. Algorithms like logistic regression,
decision trees, random forests, support vector machines (SVM), and neural
networks are commonly used in predictive modeling. Logistic regression, for
example, is well-suited for binary classification tasks like predicting the
presence or absence of heart disease, while decision trees and random forests
can handle complex, non-linear relationships between features. Neural
networks, particularly deep learning models, can capture intricate patterns in
large datasets but require substantial computational resources.

Each model is trained on the preprocessed dataset, and its performance is

evaluated using cross-validation techniques to ensure robustness and
generalizability. Cross-validation involves partitioning the data into multiple
subsets, training the model on some subsets while validating it on others, and
repeating this process multiple times. This technique helps in assessing how
well the model performs on unseen data, reducing the risk of overfitting (where
the model performs well on training data but poorly on new data).
14
The models' performance is assessed using metrics such as accuracy, precision,
recall, F1 score, and the area under the receiver operating characteristic curve
(ROC-AUC). Accuracy measures the proportion of correct predictions, while
precision and recall provide insights into the model's ability to correctly identify
positive cases of heart disease. The F1 score balances precision and recall, and
ROC-AUC indicates the model's ability to discriminate between positive and
negative cases across different threshold settings. These metrics help in
comparing different models and selecting the best-performing one.

Once the best model is identified, it undergoes further optimization to enhance

its predictive accuracy. Hyperparameter tuning involves adjusting the model's
parameters, which are not learned from the data but set before training begins.
Techniques such as grid search and random search systematically explore
different combinations of hyperparameters to find the optimal settings. This
fine-tuning ensures the model performs at its best.

Deployment of the model involves creating a user-friendly interface or

application that allows healthcare professionals to input patient data and receive
predictions on heart disease risk. Web frameworks like Flask or Django are
used to develop this interface, ensuring it is accessible and easy to use. The
deployed model can be integrated into clinical practice, providing a valuable
tool for early diagnosis and intervention.

Insights and recommendations based on the model’s predictions can guide

healthcare professionals in identifying high-risk patients, prioritizing further
testing, and making informed treatment decisions. These insights can also
inform preventive measures, lifestyle modifications, and personalized treatment
plans, ultimately improving patient outcomes and reducing the burden on
healthcare systems.

15
In conclusion, this project aims to harness the power of machine learning to
develop a predictive model for heart disease. By analyzing a comprehensive set
of health metrics and risk factors, the model can accurately predict heart disease
risk, offering a non-invasive, cost-effective, and reliable alternative to
traditional diagnostic methods. The systematic approach—from data collection
and preprocessing to model development, evaluation, optimization, and
deployment—ensures the creation of a robust and practical tool for early
detection and management of heart disease. This project not only enhances
predictive accuracy but also supports clinical decision-making, improving
patient outcomes and contributing to more effective healthcare delivery.

5. Methodology

The methodology for developing a heart disease prediction model using

machine learning involves several structured steps to ensure the creation of a
robust, accurate, and practical tool for early diagnosis and risk stratification.
This comprehensive process can be divided into multiple stages: data collection
and preprocessing, exploratory data analysis (EDA), feature selection, model
development, model evaluation, model optimization, and deployment. Each of
these stages is crucial for the success of the project.

1. Data Collection and Preprocessing

The first step in the methodology involves gathering a comprehensive dataset
that includes relevant health metrics and risk factors associated with heart
disease. The quality and diversity of the data are critical for building an accurate
predictive model. Typically, data is sourced from medical records, public health
databases, clinical studies, or datasets like the UCI Heart Disease dataset. The
dataset should include variables such as age, gender, cholesterol levels, blood
pressure, blood sugar levels, smoking habits, physical activity, and medical
history.

Once the data is collected, preprocessing is essential to clean and prepare it for
analysis. This step involves:

- Handling Missing Values: Missing data can be imputed using statistical

methods (mean, median) or more sophisticated techniques like k-nearest
neighbors (KNN) imputation.
16
- **Outlier Detection and Treatment**: Outliers can distort the analysis and
model performance. Techniques like the Z-score method or IQR (Interquartile
Range) can be used to identify and handle outliers.
- **Normalization/Standardization**: Numerical data is normalized or
standardized to ensure that all variables are on a comparable scale, which is
particularly important for algorithms that rely on distance measures.
- **Encoding Categorical Variables**: Categorical data is converted into
numerical format using methods like one-hot encoding or label encoding to
make it suitable for machine learning algorithms.

2. Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is performed to gain insights into the data and
understand its underlying structure. EDA helps identify patterns, relationships,
and anomalies within the dataset, guiding subsequent steps in the methodology.
Key activities in EDA include:

- Descriptive Statistics: Calculating measures of central tendency (mean,

median) and dispersion (standard deviation, variance) for numerical features.
- **Visualization**: Using plots such as histograms, scatter plots, box plots,
and heatmaps to visualize data distributions and relationships between variables.
For example, scatter plots can show the relationship between age and
cholesterol levels, while heatmaps can indicate correlations between multiple
features.
- **Correlation Analysis**: Assessing the strength and direction of
relationships between features using correlation coefficients. This helps identify
which variables are strongly related to heart disease and should be prioritized in
feature selection.

3. Feature Selection
Feature selection is the process of identifying the most relevant variables for
predicting heart disease. Including irrelevant or redundant features can reduce
model performance and increase computational complexity. Techniques used
for feature selection include:
17
- **Correlation Analysis**: Selecting features that show a strong correlation
with the target variable (heart disease presence).
- **Feature Importance Scores**: Using algorithms like random forests to rank
features based on their importance in predicting the target variable.
- **Dimensionality Reduction**: Applying methods like Principal Component
Analysis (PCA) to reduce the number of features while retaining most of the
variance in the data. PCA transforms the original features into a new set of
uncorrelated variables (principal components) that capture the most significant
patterns in the data.

4. Model Development
Once the relevant features are selected, various machine learning algorithms are
implemented to build predictive models. Common algorithms used in this
context include:

- Logistic Regression: Suitable for binary classification tasks like

predicting the presence or absence of heart disease. It models the probability of
the target variable using a logistic function.
- **Decision Trees**: These models split the data into subsets based on feature
values, creating a tree-like structure. Decision trees are easy to interpret but
prone to overfitting.
- **Random Forests**: An ensemble method that builds multiple decision trees
and combines their predictions. Random forests improve accuracy and reduce
overfitting compared to individual decision trees.
- **Support Vector Machines (SVM)**: SVMs find the optimal hyperplane that
separates data points of different classes with the maximum margin. They are
effective for high-dimensional data but can be computationally intensive.
- **Neural Networks**: Particularly deep learning models, which are capable
of capturing complex patterns in large datasets. Neural networks consist of
layers of interconnected nodes (neurons) that learn to map input features to the
target variable.

Each model is trained on the preprocessed dataset, and its performance is

evaluated using cross-validation techniques to ensure robustness and
18
generalizability. Cross-validation involves partitioning the data into multiple
subsets, training the model on some subsets while validating it on others, and
repeating this process multiple times. This approach helps in assessing how well
the model performs on unseen data.

5. Model Evaluation
Model evaluation is critical to determine the accuracy and reliability of the
developed models. Various performance metrics are used, including:

- Accuracy: The proportion of correct predictions among the total number

of predictions.
- **Precision**: The proportion of true positive predictions among all positive
predictions, indicating how many predicted positive cases are actually positive.
- **Recall (Sensitivity)**: The proportion of true positive predictions among all
actual positive cases, indicating how many actual positive cases are correctly
identified.
- **F1 Score**: The harmonic mean of precision and recall, providing a
balanced measure of the model's performance.
- **ROC-AUC (Receiver Operating Characteristic - Area Under Curve)**: A
metric that evaluates the model's ability to discriminate between positive and
negative cases across different threshold settings.

These metrics help in comparing different models and selecting the best-
performing one.

6. Model Optimization
The best-performing model is further optimized to enhance its predictive
accuracy. Hyperparameter tuning involves adjusting the model's parameters,
which are not learned from the data but set before training begins. Techniques
such as grid search and random search systematically explore different
combinations of hyperparameters to find the optimal settings. This step ensures
the model performs at its best and generalizes well to new data.

7. Deployment

19
Deployment involves creating a user-friendly interface or application that
allows healthcare professionals to input patient data and receive predictions on
heart disease risk. Web frameworks like Flask or Django are used to develop
this interface, ensuring it is accessible and easy to use. The deployed model can
be integrated into clinical practice, providing a valuable tool for early diagnosis
and intervention.

The final application should be tested thoroughly to ensure it works as expected

and provides accurate predictions. It should also include documentation and
user guides to help healthcare professionals understand how to use the tool
effectively.

CONCLUSION

The methodology outlined above ensures a systematic and comprehensive

approach to developing a heart disease prediction model using machine
learning. Each step, from data collection and preprocessing to model
deployment, is crucial for creating a robust, accurate, and practical tool for early
diagnosis and risk stratification. By leveraging machine learning, this project
aims to provide a non-invasive, cost-effective, and reliable alternative to
traditional diagnostic methods, ultimately improving patient outcomes and
contributing to more effective healthcare delivery

This project successfully demonstrated the application of machine learning

algorithms in the detection of heart disease. The Random Forest Classifier
proved to be a robust model, offering high accuracy and reliability. The project
highlights the importance of selecting appropriate algorithms and tuning
hyperparameters to achieve optimal performance. Future work could involve
exploring other advanced models like Gradient Boosting or Neural Networks,
and incorporating larger, more diverse datasets to further improve prediction
accuracy.

20
FUTURE SCOPE

 Further improving model accuracy and robustness

 Integrating diverse data sources for richer insights.
 Real -time monitoring and early intervation capabilities
 Integrating into clinical decision support system
 Population-level impact through preventive measures.

21
1

Heart Disease Prediction Report
No ratings yet
Heart Disease Prediction Report
112 pages
Synopsis_Group_6_CSE_3 changes (2)_copy
No ratings yet
Synopsis_Group_6_CSE_3 changes (2)_copy
15 pages
Editing
No ratings yet
Editing
16 pages
synopsis ......
No ratings yet
synopsis ......
17 pages
Project Review 2
No ratings yet
Project Review 2
18 pages
Synopsis of The Project
No ratings yet
Synopsis of The Project
2 pages
A Machine Learning Approach to Early Heart Disease Paper
No ratings yet
A Machine Learning Approach to Early Heart Disease Paper
6 pages
heartdisease book chapter Final
No ratings yet
heartdisease book chapter Final
8 pages
Heart Disease Prediction Documentation
No ratings yet
Heart Disease Prediction Documentation
4 pages
A MACHINE LEARNING APPROACH TO EARLY HEART DISEASE-Final
No ratings yet
A MACHINE LEARNING APPROACH TO EARLY HEART DISEASE-Final
6 pages
BT-40820 PROJECT REPORT
No ratings yet
BT-40820 PROJECT REPORT
24 pages
A MACHINE LEARNING APPROACH TO EARLY HEART DISEASE PAPER_12
No ratings yet
A MACHINE LEARNING APPROACH TO EARLY HEART DISEASE PAPER_12
6 pages
kn1 Merged
No ratings yet
kn1 Merged
10 pages
Heart Disease Prediction With Machine Learning
0% (1)
Heart Disease Prediction With Machine Learning
7 pages
Project Report
No ratings yet
Project Report
58 pages
Batch 06 Book Chapter
No ratings yet
Batch 06 Book Chapter
7 pages
INTRODUCTION
No ratings yet
INTRODUCTION
8 pages
Heart Disease Predictor
No ratings yet
Heart Disease Predictor
3 pages
Phase 1 CApstone (2) (1)[1] (1) (1)
No ratings yet
Phase 1 CApstone (2) (1)[1] (1) (1)
10 pages
Sanya_13
No ratings yet
Sanya_13
46 pages
PROJECT PROPOSAL
No ratings yet
PROJECT PROPOSAL
11 pages
Heart Disease Prediction Technical Seminar Report (1)
No ratings yet
Heart Disease Prediction Technical Seminar Report (1)
18 pages
project report
No ratings yet
project report
26 pages
Heart disease detection PPT
No ratings yet
Heart disease detection PPT
9 pages
Project Report
No ratings yet
Project Report
21 pages
Heart Disease Detection PPT.ppt
No ratings yet
Heart Disease Detection PPT.ppt
9 pages
Heart Disease Prediction Using Machine Learning 2
No ratings yet
Heart Disease Prediction Using Machine Learning 2
7 pages
Heart disease
No ratings yet
Heart disease
5 pages
Heart_Disease_Prediction
No ratings yet
Heart_Disease_Prediction
2 pages
AI Project Report (HDP)
No ratings yet
AI Project Report (HDP)
13 pages
Synopsis
No ratings yet
Synopsis
4 pages
Project Synopsis 6th Sem
No ratings yet
Project Synopsis 6th Sem
7 pages
Heart Disease Prediction
No ratings yet
Heart Disease Prediction
10 pages
Heart Disease 1
No ratings yet
Heart Disease 1
1 page
Final Heart Disease Prediction
No ratings yet
Final Heart Disease Prediction
26 pages
Review 1
No ratings yet
Review 1
18 pages
Heart Disease Prediction Report
No ratings yet
Heart Disease Prediction Report
60 pages
Research Paper on Heart Disease Predictor Project
No ratings yet
Research Paper on Heart Disease Predictor Project
5 pages
NM Report
No ratings yet
NM Report
15 pages
review 2
No ratings yet
review 2
23 pages
Heart Disease Prediction
No ratings yet
Heart Disease Prediction
8 pages
Main Report
No ratings yet
Main Report
94 pages
INTRODUCTION
No ratings yet
INTRODUCTION
5 pages
Heart Disease Prediction Using Machine Learning a Data-Driven Approach
No ratings yet
Heart Disease Prediction Using Machine Learning a Data-Driven Approach
6 pages
Prajval Internship Presentation Copy
No ratings yet
Prajval Internship Presentation Copy
8 pages
Project Documentation
No ratings yet
Project Documentation
45 pages
Heart Disease Prediction Using Machine Learning and Data Analytics Approach
No ratings yet
Heart Disease Prediction Using Machine Learning and Data Analytics Approach
4 pages
Heart Disease Prediction Final PPT
No ratings yet
Heart Disease Prediction Final PPT
11 pages
Heart Disease Prediction-02-1
No ratings yet
Heart Disease Prediction-02-1
27 pages
HDD New Report
No ratings yet
HDD New Report
95 pages
Finaj Heart Disease Prediction[1]
No ratings yet
Finaj Heart Disease Prediction[1]
14 pages
Banner
No ratings yet
Banner
1 page
0_2nd Review Ppt
No ratings yet
0_2nd Review Ppt
31 pages
Heart Disease Paper
No ratings yet
Heart Disease Paper
10 pages
Heart Disease Prediction Professional PPT
No ratings yet
Heart Disease Prediction Professional PPT
10 pages
Conference Template Paper
No ratings yet
Conference Template Paper
5 pages
Heart Disease Prediction and Classification Using Machine Learning and Transfer Learning Model
No ratings yet
Heart Disease Prediction and Classification Using Machine Learning and Transfer Learning Model
7 pages
Proj report
No ratings yet
Proj report
29 pages
Heart Disease Detection - Newreport
No ratings yet
Heart Disease Detection - Newreport
57 pages
Engineer's Toolkit: Statistics and Probability Essentials
From Everand
Engineer's Toolkit: Statistics and Probability Essentials
Pasquale De Marco
No ratings yet
Untitled design
No ratings yet
Untitled design
7 pages
House Price Prediction Using A Machine Learning Mo
No ratings yet
House Price Prediction Using A Machine Learning Mo
10 pages
U23CS351-Problem Solving Using C Laboratory
No ratings yet
U23CS351-Problem Solving Using C Laboratory
141 pages
1.2 DS - Single Linked List - Ver 6 22.2.24 VARIABLES EDITED
No ratings yet
1.2 DS - Single Linked List - Ver 6 22.2.24 VARIABLES EDITED
67 pages
A Survey of Motion Data Processing and Classification Techniques Based on Wearable Sensors
No ratings yet
A Survey of Motion Data Processing and Classification Techniques Based on Wearable Sensors
12 pages
Analysis On Credit Card Fraud Detection Methods
0% (1)
Analysis On Credit Card Fraud Detection Methods
7 pages
Machine_Learning_Approaches_to_Predict_Asthma_Exac
No ratings yet
Machine_Learning_Approaches_to_Predict_Asthma_Exac
19 pages
4-1 - Machine Learning - Intro-Classification
100% (1)
4-1 - Machine Learning - Intro-Classification
63 pages
Comparative Analysis of Classification Efficiency of Quantum Machine Learning Algorithms
No ratings yet
Comparative Analysis of Classification Efficiency of Quantum Machine Learning Algorithms
6 pages
Almugren, Alshamlan - 2019 - A Survey On Hybrid Feature Selection Methods in Microarray Gene Expression Data For Cancer Classification
No ratings yet
Almugren, Alshamlan - 2019 - A Survey On Hybrid Feature Selection Methods in Microarray Gene Expression Data For Cancer Classification
16 pages
Machine Learning: Notes by Aniket Sahoo - Part II
No ratings yet
Machine Learning: Notes by Aniket Sahoo - Part II
140 pages
PolSARpro Software EPottier PDF
No ratings yet
PolSARpro Software EPottier PDF
160 pages
Artificial Intelligence in Drilling 1686676588
No ratings yet
Artificial Intelligence in Drilling 1686676588
30 pages
Unit 1 DMW
No ratings yet
Unit 1 DMW
41 pages
Fuzzy Logic & Machine Learning - PPT
No ratings yet
Fuzzy Logic & Machine Learning - PPT
138 pages
PDF Main Confrence
No ratings yet
PDF Main Confrence
29 pages
PR LabManual Anurag
No ratings yet
PR LabManual Anurag
21 pages
Sentiment Analysis On Twitter Using Neural Network
No ratings yet
Sentiment Analysis On Twitter Using Neural Network
7 pages
Machine Learning 18CSE18 (1)
No ratings yet
Machine Learning 18CSE18 (1)
2 pages
Stock Prediction Using Machine
No ratings yet
Stock Prediction Using Machine
13 pages
BoussaadaAchraf Tunisian Truck License Plate Recognition
No ratings yet
BoussaadaAchraf Tunisian Truck License Plate Recognition
92 pages
1 s2.0 S095741742030909X Main
No ratings yet
1 s2.0 S095741742030909X Main
15 pages
v1 Covered
No ratings yet
v1 Covered
23 pages
Machine Learning 2
No ratings yet
Machine Learning 2
21 pages
Grape Leaf Disease Identification Research Paper
No ratings yet
Grape Leaf Disease Identification Research Paper
9 pages
Machine Learning - AL3451 - Important Questions With Answer
No ratings yet
Machine Learning - AL3451 - Important Questions With Answer
25 pages
Historical and Modern Features For Buddha Statue Classification
No ratings yet
Historical and Modern Features For Buddha Statue Classification
8 pages
A Review of Deep Learning Models For Time Series Prediction
No ratings yet
A Review of Deep Learning Models For Time Series Prediction
16 pages
Pet
No ratings yet
Pet
15 pages
Identifying Health Insurance Claim Frauds Using Mixture of Clinical Concepts
No ratings yet
Identifying Health Insurance Claim Frauds Using Mixture of Clinical Concepts
12 pages
FINAL_YEAR_PROJECT_GLAUCOMA_DETECTION_TEAM_COPY _done
No ratings yet
FINAL_YEAR_PROJECT_GLAUCOMA_DETECTION_TEAM_COPY _done
70 pages
NCFTEAS - 2024 Paper 16
No ratings yet
NCFTEAS - 2024 Paper 16
8 pages
Identifying Users and Activities With Cognitive Signal Processing From A Wearable Headband
No ratings yet
Identifying Users and Activities With Cognitive Signal Processing From A Wearable Headband
8 pages
AMT305 INTRODUCTION TO MACHINE LEARNING, Pyq2
No ratings yet
AMT305 INTRODUCTION TO MACHINE LEARNING, Pyq2
3 pages

Report - Mini ProjectFINAL

Uploaded by

Report - Mini ProjectFINAL

Uploaded by

HEART DISEASE DETECTION USING ML

PYTHON PROJECT REPORT

HRITHICK RAM.M [2303722810621043]

SRI ESHWAR COLLEGE OF ENGINEERING

COIMBATORE – 641 202

JUNE – JULY 2024

Certified that this project report “HEART DISEASE DETECTION

who carried out the project work under my supervision

CHAPTER TITLE PAGE NO

The advent of machine learning has revolutionized many fields, including

Heart disease is influenced by a multitude of factors, including genetics,

The challenge lies in creating a predictive model that accurately identifies

The primary objective of this project is to develop a machine learning-based

1. Data Collection and Preprocessing: Gather a comprehensive dataset

1. Data Collection: The first step involves gathering a comprehensive

from sklear.metrics import classification_report, accuracy_score

Accuracy of the model is = 0.9074074074074074

From sklearn.linear_model import LinearRegression

Accuracy of thr SVM model is = 0.8888888888888888

Random Forest Classifier

Machine learning, a subset of artificial intelligence, involves algorithms that can

The project begins with the collection and preprocessing of data. A

Each model is trained on the preprocessed dataset, and its performance is

Once the best model is identified, it undergoes further optimization to enhance

Deployment of the model involves creating a user-friendly interface or

Insights and recommendations based on the model’s predictions can guide

The methodology for developing a heart disease prediction model using

1. Data Collection and Preprocessing

- **Handling Missing Values**: Missing data can be imputed using statistical

2. Exploratory Data Analysis (EDA)

- **Descriptive Statistics**: Calculating measures of central tendency (mean,

- **Logistic Regression**: Suitable for binary classification tasks like

Each model is trained on the preprocessed dataset, and its performance is

- **Accuracy**: The proportion of correct predictions among the total number

The final application should be tested thoroughly to ensure it works as expected

The methodology outlined above ensures a systematic and comprehensive

This project successfully demonstrated the application of machine learning

 Further improving model accuracy and robustness

You might also like

- Handling Missing Values: Missing data can be imputed using statistical

- Descriptive Statistics: Calculating measures of central tendency (mean,

- Logistic Regression: Suitable for binary classification tasks like

- Accuracy: The proportion of correct predictions among the total number