
DESIGN AND ANALYSIS OF ALGORITHM

Computer Science & Engineering (Computer Science Eng.)


Govt. Engineering College, Ajmer

(Session 2024-25)

SUBMITTED TO: Ms. Sakshi Jain

SUBMITTED BY: Suryakant Acharya
CB 2
23CS138D

Department of Computer Science & Engineering (CSE)


Govt. Engineering College, Ajmer
What is ML?
Machine Learning (ML) is the field of study that gives computers the
capability to learn without being explicitly programmed. It is one of the
most exciting technologies one could come across. As the name suggests,
it gives computers the ability that makes them more similar to humans:
the ability to learn.

What is Exploratory Data Analysis (EDA)?


Exploratory Data Analysis (EDA) is a crucial initial step in data science
projects. It refers to the method of studying and exploring datasets, by
analyzing and visualizing them, to understand their key characteristics,
uncover patterns, locate outliers, and identify relationships between
variables. EDA is normally carried out as a preliminary step before
undertaking more formal statistical analyses or modeling.
Key aspects of EDA include:
• Distribution of Data: Examining the distribution of data points to
understand their range, central tendencies (mean, median), and dispersion
(variance, standard deviation).
• Graphical Representations: Utilizing charts such as histograms, box plots,
scatter plots, and bar charts to visualize relationships within the data and
distributions of variables.
• Outlier Detection: Identifying unusual values that deviate from other data
points. Outliers can influence statistical analyses and might indicate data
entry errors or unique cases.
• Correlation Analysis: Checking the relationships between variables to
understand how they might affect each other. This includes computing
correlation coefficients and creating correlation matrices (a short sketch of
this check, together with outlier detection, appears after this list).
• Handling Missing Values: Detecting and deciding how to address missing
data points, whether by imputation or removal, depending on their impact
and the amount of missing data.
• Summary Statistics: Calculating key statistics that provide insight into data
trends and nuances.
• Testing Assumptions: Many statistical tests and models assume the data
meet certain conditions (like normality or homoscedasticity). EDA helps
verify these assumptions.
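The following is a minimal Python sketch of two of these checks, correlation
analysis and IQR-based outlier detection. It loads seaborn's bundled copy of
the Iris dataset as a stand-in for the actual file, and the column name
'petal_width' is illustrative rather than taken from the original notebook.

import seaborn as sns

# Stand-in load: seaborn ships a copy of the Iris dataset
df = sns.load_dataset('iris')

# Correlation matrix for the numeric columns
print(df.select_dtypes(include='number').corr())

# IQR-based outlier check for one feature (column name is illustrative)
q1 = df['petal_width'].quantile(0.25)
q3 = df['petal_width'].quantile(0.75)
iqr = q3 - q1
print(df[(df['petal_width'] < q1 - 1.5 * iqr) | (df['petal_width'] > q3 + 1.5 * iqr)])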
IMPLEMENTATION:

• Libraries such as “pandas” and “matplotlib” are imported to use their
built-in functions to work on the dataset.
• Using the drive.mount() function in Google Colab allows code in the
notebook to access files stored in Google Drive.
• The dataset is then read and printed, as sketched below.
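A minimal sketch of these loading steps is shown below; the Drive path and
file name ('/content/drive/MyDrive/Iris.csv') are illustrative assumptions,
not the actual paths used in the notebook.

import pandas as pd
from google.colab import drive

# Mount Google Drive so the notebook can access files stored there
drive.mount('/content/drive')

# Read the dataset from Drive (path and file name are hypothetical) and print it
df = pd.read_csv('/content/drive/MyDrive/Iris.csv')
print(df)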

• df.head(): this method returns the first 5 rows of the DataFrame by
default.
• df.shape: this attribute shows how many observations (rows) and features
(columns) there are in the dataset.
• df.info(): summarizes the data, showing the number of non-null records in
each column, each column’s data type, and the dataset’s memory usage. A
sketch of these calls follows.
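For example, assuming df is the DataFrame loaded in the sketch above:

# First 5 rows of the DataFrame
print(df.head())

# (rows, columns) of the dataset
print(df.shape)

# Non-null counts, column data types, and memory usage
df.info()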

• df.describe(): gives the count, mean, standard deviation, minimum,
quartiles, and maximum for each numerical column, briefly summarizing the
dataset’s central tendencies and spread.
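Continuing the same sketch with the df loaded earlier:

# Count, mean, std, min, quartiles, and max for each numerical column
print(df.describe())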

• df.columns.tolist() converts the column names of the DataFrame ‘df’ into
a Python list, providing a convenient way to access and manipulate column
names.
• df.isnull().sum() checks for missing values in each column of the
DataFrame ‘df’ and returns the number of null values for each column.
• df.nunique() determines how many unique values there are in each column
of the DataFrame ‘df’, offering information about the variety of data that
makes up each feature. These checks are sketched below.
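A sketch of these column-level checks, again using the df from the loading
sketch above:

# Column names as a Python list
print(df.columns.tolist())

# Number of missing values in each column
print(df.isnull().sum())

# Number of unique values in each column
print(df.nunique())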
• The count plot shows the number of observations for each species.
• The kernel density plots show the skewness of each feature. A feature with
skewness of exactly 0 has a symmetrical distribution, while a skewness of 1
or above indicates a positively (right) skewed distribution; in a right-skewed
distribution the tail extends further to the right, indicating some extremely
high values.
• The swarm plot of ‘Petal width’ against ‘Species’ shows where observations
concentrate: regions of higher point density indicate where the majority of
data points cluster, while isolated points far from the clusters are potential
outliers. A plotting sketch for these three figures follows.
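A minimal plotting sketch for the three figures described above, using seaborn
and matplotlib with the df loaded earlier. The column names 'species' and
'petal_width' follow seaborn's bundled Iris dataset and are assumptions; the
actual CSV may use different names (e.g. 'Species', 'PetalWidthCm').

import matplotlib.pyplot as plt
import seaborn as sns

# Count plot: number of observations per species
sns.countplot(x='species', data=df)
plt.show()

# Kernel density plot for each numeric feature, to inspect skewness
for col in df.select_dtypes(include='number').columns:
    sns.kdeplot(data=df, x=col)
    plt.show()

# Swarm plot of petal width per species; isolated points suggest outliers
sns.swarmplot(x='species', y='petal_width', data=df)
plt.show()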
