Dataset

The document outlines a project on diabetes prediction using a dataset from the Pima Indians Diabetes Database. It discusses the importance of diabetes, the objectives of the project, data cleaning, exploration, feature engineering, and predictive modeling techniques employed. The aim is to predict diabetes presence based on various medical measurements and to identify key features indicative of the disease.


1

ARSI UNIVERSITY
COLLEGE OF BUSINESS AND ECONOMICS
DEPARTMENT OF MANAGEMENT INFORMATION SYSTEM
PROJECT TITLE: DATA SCIENCE ON DIABETES PREDICTION
DATASET
Presented by: Group A student
2
INTRODUCTION

• According to the WHO, diabetes is a chronic disease that occurs either when the pancreas does not produce enough insulin or when the body cannot effectively use the insulin it produces.
• Insulin is a hormone that regulates blood sugar.
• Hyperglycemia, or raised blood sugar, is a common effect of uncontrolled diabetes and over time leads to serious damage to many of the body's systems, especially the nerves and blood vessels.
3
INTRODUCTION (CONT.)

• Diabetes is a health condition that affects how the body turns food into energy.
• Most of the food you eat is broken down into sugar (also called glucose) and released into your bloodstream.
• When your blood sugar goes up, it signals your pancreas to release insulin.
• Without ongoing, careful management, diabetes can lead to a buildup of sugars in the blood, which can increase the risk of dangerous complications, including stroke and heart disease.
• For this reason, we decided to predict diabetes using machine learning in Python.
4
Problem Statement / Business Understanding

• The goal of the diabetes dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset.
• Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.
• We want to understand the impact of Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, and Diabetes Pedigree Function on the outcome, based on the available data.
• Using regression analysis, we model the relationship between the dependent variable (diabetes) and the independent variables (pregnancies, glucose, age, BMI, ...).
5
Objectives

• Predict whether a person is a diabetes patient or not.
• Experiment with different classification methods to see which yields the highest accuracy.
• Classify whether someone has diabetes or not from the given features.
• Determine which features are the most indicative of diabetes.
6
Data Mining

• Data set: the dataset is originally from the Pima Indians Diabetes Database and is available on Kaggle.
• It consists of several medical predictor variables and one target variable.
• The objective of the dataset is to predict whether the patient has diabetes or not.
• In other words, the dataset has several independent variables and one dependent variable.
7
Data Mining (Cont.)

• Independent variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.
• In this project we used the Pima Indians Diabetes Database from Kaggle.
• This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases.
8
Data Cleaning

• On df.head() we saw that some features contain 0. This does not make sense for these measurements and indicates a missing value, so we replace the 0 values with null (NaN).
• This part covers cleaning and preparing the data.
• Here we fix inconsistencies within the data, handle missing values, and check the features for collinearity.
• The dataset has no explicitly missing values; however, features such as Glucose, BloodPressure, Insulin, and SkinThickness contain 0 values, which is not possible.
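The zero-to-null replacement described on this slide could look roughly like the sketch below. It assumes the data sits in a pandas DataFrame named df with the Kaggle column names. The rows are made-up illustration values, not records from the real dataset, and filling the gaps with the column median is an assumption, since the slides do not say which imputation method was used.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only (not taken from the real dataset).
df = pd.DataFrame({
    "Pregnancies":   [6, 1, 8, 1, 0],
    "Glucose":       [148, 85, 0, 89, 137],
    "BloodPressure": [72, 66, 64, 0, 40],
    "SkinThickness": [35, 29, 0, 23, 35],
    "Insulin":       [0, 0, 0, 94, 168],
    "BMI":           [33.6, 26.6, 23.3, 28.1, 43.1],
    "Outcome":       [1, 0, 1, 0, 1],
})

# Columns named on the slide where 0 cannot be a real measurement.
# Pregnancies and Outcome are left alone because 0 is valid there.
zero_as_missing = ["Glucose", "BloodPressure", "SkinThickness", "Insulin"]

# Step 1: mark the impossible zeros as missing (NaN).
df[zero_as_missing] = df[zero_as_missing].replace(0, np.nan)
missing_counts = df[zero_as_missing].isna().sum()
print(missing_counts)

# Step 2 (assumed strategy): fill each gap with that column's median.
df_clean = df.copy()
medians = df_clean[zero_as_missing].median()
df_clean[zero_as_missing] = df_clean[zero_as_missing].fillna(medians)
print(df_clean.head())
```

Counting the NaN values before imputing shows how much of each column was really missing, which helps decide whether a column is still worth keeping.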
9
Data Exploration

• This stage is all about building a model that best solves the problem.
• It begins with a process called data splitting (also referred to as data splicing), where the entire dataset is split into two parts.
• One part is for training the model (training dataset) and the other is for testing the efficiency of the model (testing dataset).
• This is followed by building the model using the training dataset and finally evaluating the model using the test dataset.
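The split into training and testing data can be sketched in plain Python as follows. The function name split_train_test, the 80/20 proportion, and the fixed seed are assumptions for illustration; the slides do not state the proportions used. In practice a library helper such as scikit-learn's train_test_split does the same job.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records of (glucose, bmi, outcome); values are made up.
records = [
    (148, 33.6, 1), (85, 26.6, 0), (183, 23.3, 1), (89, 28.1, 0),
    (137, 43.1, 1), (116, 25.6, 0), (78, 31.0, 0), (197, 30.5, 1),
    (125, 32.0, 1), (110, 37.6, 0),
]

train, test = split_train_test(records, test_fraction=0.2)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters because the rows of a file are often ordered, and an unshuffled split could put one kind of patient mostly in the test set.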
10
Feature Engineering

• Under feature engineering we use feature selection, and we divide the data into training data and test data.
• Next, we add important features to the dataset and identify effective features before fitting machine learning models.
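One simple way to look for effective features is to rank them by how strongly they correlate with the target. The sketch below is an assumption about how this could be done, not the method the project actually used. The rows are made-up illustration values and the 0.5 threshold is arbitrary.

```python
import pandas as pd

# Made-up rows for illustration only (not taken from the real dataset).
df = pd.DataFrame({
    "Glucose": [148, 85, 183, 89, 137, 116, 78, 197],
    "BMI":     [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 30.5],
    "Age":     [50, 31, 32, 21, 33, 30, 26, 53],
    "Outcome": [1, 0, 1, 0, 1, 0, 0, 1],
})

# Pearson correlation of every feature with the target column.
corr_with_target = df.corr()["Outcome"].drop("Outcome")

# Rank by absolute value: a strong negative correlation is informative too.
ranked = corr_with_target.abs().sort_values(ascending=False)
print(ranked)

# Keep the features above an (arbitrary, assumed) threshold.
threshold = 0.5
selected = ranked[ranked > threshold].index.tolist()
print(selected)
```

Correlation only captures linear, one-feature-at-a-time relationships, so a ranking like this is a starting point for selection and not a final answer.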
11
Predictive Modeling

• In our proposed predictive model, we pre-processed the raw data and applied different feature engineering techniques to get better results.
• An algorithm is used for feature selection, as it provides an unbiased selection of important and unimportant features from an information system.
• Training on the data after feature engineering plays a significant role in supervised learning.
• We used highly correlated variables for better outcomes.
• The input data here refers to the test data, which is used for prediction and for the confusion matrix.
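The confusion matrix mentioned in the last point compares the model's predictions on the test data with the true labels. Below is a minimal plain-Python sketch. The labels are made up, and the layout (rows are actual classes, columns are predicted classes, negative class first) is chosen to match the convention scikit-learn uses for binary problems.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 and 1."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # diabetic, predicted diabetic
        elif actual == 0 and predicted == 0:
            tn += 1  # healthy, predicted healthy
        elif actual == 0 and predicted == 1:
            fp += 1  # healthy, predicted diabetic
        else:
            fn += 1  # diabetic, predicted healthy
    return [[tn, fp], [fn, tp]]

def accuracy(matrix):
    """Share of correct predictions among all predictions."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    return (tn + tp) / total if total else 0.0

# Made-up test labels and predictions for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
print(matrix)            # [[3, 1], [1, 3]]
print(accuracy(matrix))  # 0.75
```

For a medical screening task the false negatives (diabetic patients predicted healthy) are usually the costliest cell, so it is worth reading the whole matrix and not only the accuracy.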
12
Data Visualization

• Visualize the data in a Python notebook such as Jupyter, using interactive libraries and plotting different graphs.
13

THANK YOU FOR YOUR ATTENTION!
