Data Analysis for Cardio Fitness

This document summarizes the analysis of a cardiovascular fitness dataset containing 180 observations and 9 variables. The author imports the dataset into R, examines its structure and dimensions, and generates descriptive statistics. Graphical explorations include histograms and boxplots of variables like age, education, usage, and income. Key insights are that the majority of cardio product users are between ages 20-25, have an education level of 14-16 years, and an income between $40,000-$60,000.

Uploaded by

rats100

Mini Project - Cardio Good Fitness

Module 1

Submitted By:

Rathin Kukreja
Project Aim:

Minimum Steps for exploration:

1. Importing the dataset into R

2. Understanding the structure of dataset

3. Graphical exploration

4. Descriptive statistics

5. Insights from the dataset


Setting Working Directory:

> setwd("C:/Users/RATHIN KUKREJA/Desktop")

Importing dataset into R

> cardio_data_set<-read.csv("CardioGoodFitness.csv")
> cardio_data_set
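The import step can be sketched so that it runs without the original file. This is a minimal illustration, not the author's code: `csv_path` and `cardio_sample` are hypothetical names, and the three data rows are the first rows visible in the str() output of this report. On R 4.0 and later, read.csv() no longer converts strings to factors by default, so `stringsAsFactors = TRUE` is needed to obtain the Factor columns shown in this report.

```r
# Hypothetical stand-in for CardioGoodFitness.csv: the first three rows
# visible in the str() output, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Product,Age,Gender,Education,MaritalStatus,Usage,Fitness,Income,Miles",
  "TM195,18,Male,14,Single,3,4,29562,112",
  "TM195,19,Male,15,Single,2,3,31836,75",
  "TM195,19,Female,14,Partnered,4,3,30699,66"
), csv_path)

# stringsAsFactors = TRUE turns the text columns into factors,
# which is the default only on R versions before 4.0.
cardio_sample <- read.csv(csv_path, stringsAsFactors = TRUE)

# head() prints only the first rows instead of the whole data frame.
print(head(cardio_sample))
```

Printing with head() keeps the console readable; typing the object name alone, as above, prints all 180 rows.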

Dimensions of Data Set

> dim(cardio_data_set)
[1] 180 9

Summarising the Data Set

> summary(cardio_data_set)

  Product        Age           Gender      Education     MaritalStatus
 TM195:80   Min.   :18.00   Female: 76   Min.   :12.00   Partnered:107
 TM498:60   1st Qu.:24.00   Male  :104   1st Qu.:14.00   Single   : 73
 TM798:40   Median :26.00                Median :16.00
            Mean   :28.79                Mean   :15.57
            3rd Qu.:33.00                3rd Qu.:16.00
            Max.   :50.00                Max.   :21.00
     Usage          Fitness          Income           Miles
 Min.   :2.000   Min.   :1.000   Min.   : 29562   Min.   : 21.0
 1st Qu.:3.000   1st Qu.:3.000   1st Qu.: 44059   1st Qu.: 66.0
 Median :3.000   Median :3.000   Median : 50597   Median : 94.0
 Mean   :3.456   Mean   :3.311   Mean   : 53720   Mean   :103.2
 3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.: 58668   3rd Qu.:114.8
 Max.   :7.000   Max.   :5.000   Max.   :104581   Max.   :360.0
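The figures that summary() reports can also be obtained one at a time: quantile() and mean() for numeric columns, table() for factors. The sketch below assumes only the first ten values of Age and Gender (taken from the str() output of this report), so its numbers differ from the full 180-row summary; the variable names are illustrative.

```r
# First ten values of Age and Gender, as listed in the str() output.
age <- c(18, 19, 19, 19, 20, 20, 21, 21, 21, 21)
gender <- factor(c("Male", "Male", "Female", "Male", "Male",
                   "Female", "Female", "Male", "Male", "Female"))

# Minimum, quartiles and maximum of a numeric column.
age_quartiles <- quantile(age)

# The mean is reported separately from the quantiles.
age_mean <- mean(age)

# For a factor, summary() shows counts per level; table() does the same.
gender_counts <- table(gender)

print(age_quartiles)
print(age_mean)
print(gender_counts)
```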

Structure of Each Feature

> str(cardio_data_set)

'data.frame':   180 obs. of  9 variables:
 $ Product      : Factor w/ 3 levels "TM195","TM498",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Age          : int  18 19 19 19 20 20 21 21 21 21 ...
 $ Gender       : Factor w/ 2 levels "Female","Male": 2 2 1 2 2 1 1 2 2 1 ...
 $ Education    : int  14 15 14 12 13 14 14 13 15 15 ...
 $ MaritalStatus: Factor w/ 2 levels "Partnered","Single": 2 2 1 2 1 1 1 2 2 1 ...
 $ Usage        : int  3 2 4 3 4 3 3 3 5 2 ...
 $ Fitness      : int  4 3 3 3 2 3 3 3 4 3 ...
 $ Income       : int  29562 31836 30699 32973 35247 32973 35247 32973 35247 37521 ...
 $ Miles        : int  112 75 66 85 47 66 75 85 141 85 ...
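The integers that str() prints after each Factor are level codes, not data values. A small sketch, using the first four Gender values from the output above (the variable name `gender` is illustrative), shows how the codes map back to labels:

```r
# First four Gender values from the str() output.
gender <- factor(c("Male", "Male", "Female", "Male"))

# Levels are sorted alphabetically by default, so "Female" is level 1.
gender_levels <- levels(gender)

# The underlying integer codes are what str() displays: 2 2 1 2.
gender_codes <- as.integer(gender)

# Indexing the levels by the codes recovers the original labels.
gender_labels <- gender_levels[gender_codes]

print(gender_levels)
print(gender_codes)
print(gender_labels)
```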
Bar Plotting of Age
> barplot(table(cardio_data_set$Age))
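The bar plot draws one bar per distinct age, with the height given by table(). The sketch below makes the counts visible before plotting. It assumes only the first ten Age values from the str() output, and it writes the plot to a temporary PDF (the name `plot_file` is hypothetical) so that it also runs in a non-interactive session.

```r
# First ten Age values, as listed in the str() output.
age <- c(18, 19, 19, 19, 20, 20, 21, 21, 21, 21)

# table() counts how often each distinct age occurs.
age_counts <- table(age)
print(age_counts)

# Draw the bar plot into a temporary PDF file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
bar_midpoints <- barplot(age_counts, main = "Users per age",
                         xlab = "Age", ylab = "Count")
invisible(dev.off())
```

Unlike hist(), which groups values into bins, this plot keeps every age as its own category.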

Panelling Graphics
> par(mfrow=c(3,3))
> hist(cardio_data_set$Age, main="Age Distribution", xlab="Age", ylab="Frequency", col="blue")
> hist(cardio_data_set$Education, main="Education", xlab="Education", ylab="Frequency", col="blue")
> hist(cardio_data_set$Usage, main="Usage", xlab="Usage", ylab="Frequency", col="blue")
> boxplot(cardio_data_set$Age, horizontal=TRUE, main="Age Distribution", xlab="Age", ylab="Frequency", col="red")
> boxplot(cardio_data_set$Education, horizontal=TRUE, main="Education", xlab="Education", ylab="Frequency", col="red")
> boxplot(cardio_data_set$Usage, horizontal=TRUE, main="Usage", xlab="Usage", ylab="Frequency", col="red")

The graphs indicate that the most common age group among cardio product users is 20-25 years (the median age is 26), and that most users have an education level of 14-16 years.
The graphs also show that most people using cardio fitness products have an income between 40K and 60K units, which is consistent with the income quartiles in the summary (44,059 and 58,668).
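Readings taken from graphs can be checked numerically by computing the share of observations that falls inside a range. The helper `share_between` below is hypothetical and not part of the original analysis. The sample is only the first ten rows from the str() output, which are sorted by age and income, so the resulting shares do not represent the full dataset.

```r
# Hypothetical helper: fraction of values within [lower, upper].
share_between <- function(x, lower, upper) {
  mean(x >= lower & x <= upper, na.rm = TRUE)
}

# First ten values of Age and Income, from the str() output.
age <- c(18, 19, 19, 19, 20, 20, 21, 21, 21, 21)
income <- c(29562, 31836, 30699, 32973, 35247,
            32973, 35247, 32973, 35247, 37521)

# Share of users aged 20 to 25 in the sample.
age_share <- share_between(age, 20, 25)

# Share of users with income between 40K and 60K in the sample.
income_share <- share_between(income, 40000, 60000)

print(age_share)
print(income_share)
```

Applied to the full data frame, the same calls would take cardio_data_set$Age and cardio_data_set$Income as input.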
