Quiz 2 Solution Id 22070144

Uploaded by

Asif Ali

Quiz 2

Katrodiya tapankumar Ashokbhai

2024-08-18

Question 1
Load the data
dataset = read.csv("dataset.csv")

I look at the top few entries using the head function to confirm data are loaded.
head(dataset, n= 10)

## number
## 1 139
## 2 116
## 3 122
## 4 115
## 5 122
## 6 126
## 7 107
## 8 112
## 9 112
## 10 121

To understand the structure of the data
dim(dataset)

## [1] 52 1

nrow(dataset)

## [1] 52

ncol(dataset)

## [1] 1

names(dataset)

## [1] "number"

To visualize data I create a histogram

hist(dataset$number, xlab ="Number", main = "Datasets of the number",
col = "lightgreen" )
To draw a box plot
boxplot(dataset$number, horizontal = TRUE, pch = 16,
main = "Dataset of Number", col = "lightblue", xlab = "Number")
To compute the mean, median, standard deviation and First Quartile (Q1)
I compute the mean
mean(dataset$number)

## [1] 114.3269

So the mean is 114.33

I compute the median
median(dataset$number)

## [1] 121.5

So the median is 121.5

I compute the standard deviation
sd(dataset$number)

## [1] 35.40894

So the standard deviation is 35.41

For the first quartile
quantile(dataset$number, 1/4)
## 25%
## 107

So the first quartile is 107.0

Comment on the shape of the distribution
For the mean:
The mean (114.33) is less than the median (121.5), which suggests that the distribution
may be left-skewed (negatively skewed).
For the standard deviation:
The standard deviation (35.41) is about 31% of the mean, indicating considerable
variability in the data.
For the first quartile (Q1):
The first quartile (Q1) is 107.0, only 14.5 below the median (121.5), so the quarter of the
data between Q1 and the median is tightly packed. The standard deviation (35.41) is much
larger than this gap, which suggests that some values lie far from the centre.
Conclusion
The mean being less than the median suggests that the distribution is left-skewed
(negatively skewed). More than half of the observations lie above the mean, while a smaller
number of low values pull the mean downward. This is only a heuristic; the histogram and
box plot above give a direct check.
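
As a rough numerical cross-check on the mean-versus-median heuristic, the sample skewness can be computed directly. The sketch below is only an illustration: the vector `x` is made up (it is not the quiz data) and `skewness_moment` is a hypothetical helper name. A negative value is consistent with a left skew.

```r
# Hypothetical example data with a long lower tail (NOT the quiz data)
x <- c(20, 45, 90, 105, 110, 115, 118, 120, 122, 125, 128, 130)

# Moment-based sample skewness:
# third central moment divided by (second central moment)^1.5
skewness_moment <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

mean(x) < median(x)   # TRUE when the mean sits below the median
skewness_moment(x)    # negative for a left-skewed sample
```

The same helper could be applied to dataset$number once the file is loaded.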

Question 2

Step 1: Load the data

diabetes = read.csv("diabetes.csv")

I look at the top few entries to confirm data are loaded.

head(diabetes$HDL, n = 10)

## [1] 27.4 51.4 42.1 53.8 57.6 32.5 47.6 25.9 47.3 84.6

To understand the structure of the data
dim(diabetes)

## [1] 87 6

nrow(diabetes)

## [1] 87

ncol(diabetes)
## [1] 6

names(diabetes)

## [1] "sex" "BG" "HbA1c" "LDL" "HDL" "Tri"

Step 2: Define hypothesis

H0: There is no difference in mean HDL levels between females and males.
H1: The mean HDL level is greater in females than in males.

Step 3: pre-checking data

I am interested in the HDL data.
I quickly look at the summary of HDL
summary(diabetes$HDL)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   18.90   39.80   47.10   45.58   51.15   84.60

I look at the frequency of males and females in the data

table(diabetes$sex)

##
## Female Male
## 44 43

barplot(table(diabetes$sex), main = "male and female count",
col = c("lightpink", "lightblue"))
To visualize data I create a histogram
hist(diabetes$HDL, xlab = "HDL", main = "Data of HDL", col =
"lightpink", breaks = 15)
I look at the mean HDL split by the sex variable
aggregate(HDL~sex, data = diabetes, mean)

## sex HDL
## 1 Female 46.76591
## 2 Male 44.35814

To visualize I create a histogram and box plot

library(lattice)
histogram(~HDL|sex, data = diabetes)
boxplot(HDL~sex, data = diabetes, horizontal = TRUE, pch = 16)
Step 4: compute sample statistic
The observed difference in means (Female minus Male) can be computed from the aggregated group means
HDL = -diff(aggregate(HDL~sex,diabetes, mean)$HDL)
HDL

## [1] 2.40777
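
A note on the sign: aggregate() lists the groups alphabetically (Female first), so diff() returns Male minus Female and the leading minus flips it to Female minus Male. The sketch below shows an equivalent, more explicit form using tapply. The data frame `toy` and the names `obs_diff` and `obs_diff_alt` are hypothetical and used only for illustration.

```r
# Toy data (hypothetical values, NOT the diabetes data)
toy <- data.frame(
  sex = c("Female", "Female", "Female", "Male", "Male", "Male"),
  HDL = c(50, 48, 52, 44, 46, 45)
)

# Group means as a named vector with names "Female" and "Male"
means <- tapply(toy$HDL, toy$sex, mean)

# Explicit Female - Male difference, independent of group ordering
obs_diff <- unname(means["Female"] - means["Male"])

# Same quantity via the -diff(aggregate(...)) idiom
obs_diff_alt <- -diff(aggregate(HDL ~ sex, data = toy, mean)$HDL)

obs_diff
obs_diff_alt
```

Indexing the means by name makes the direction of the subtraction visible in the code, which helps when the alternative hypothesis is one-sided.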

To simulate the sex variable under the null hypothesis I shuffle the sex labels with the sample function, which permutes them without replacement.

sex.sim = sample(diabetes$sex)

Step 5: Generate the randomization distribution under the null hypothesis (H0)

To build the distribution I use the replicate function to repeat the shuffle 1000 times,
recomputing the difference in means (Female minus Male) each time
HDL0 = replicate(1000,{
sex.sim = sample(diabetes$sex)
-diff(aggregate(HDL~sex.sim, data = diabetes, mean)$HDL)
})

hist(HDL0, main = "Randomization distribution under H0",
xlab = "Simulated difference in mean HDL (Female - Male)", col = "lightgoldenrod")
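
Because sample() is random, the simulated distribution changes from run to run unless a seed is set. The sketch below wraps the shuffle in a small function and fixes a seed so the result is reproducible. It assumes groups labelled "Female" and "Male"; the helper `perm_diffs`, the seed value, and the simulated data are all hypothetical and are not the diabetes data.

```r
# Hypothetical helper: randomization distribution of the Female - Male
# difference in means, obtained by shuffling the group labels.
perm_diffs <- function(y, group, n_perm = 1000) {
  replicate(n_perm, {
    shuffled <- sample(group)        # permute labels without replacement
    m <- tapply(y, shuffled, mean)   # group means under the shuffle
    unname(m["Female"] - m["Male"])
  })
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated stand-in data (NOT the diabetes data)
y <- rnorm(40, mean = 45, sd = 10)
group <- rep(c("Female", "Male"), each = 20)

null_diffs <- perm_diffs(y, group, n_perm = 500)
length(null_diffs)   # one simulated difference per shuffle
mean(null_diffs)     # expected to be close to 0 under the null
```

Using tapply instead of aggregate inside the loop avoids building a data frame on every shuffle, which is usually faster.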
Step 6: compute p-value
pVal = mean(HDL0 > HDL)
pVal

## [1] 0.125

So the p-value is 0.125
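
The proportion above counts only simulated differences strictly greater than the observed one. A commonly used alternative counts ties as well and adds one to the numerator and denominator, so the estimate can never be exactly zero. The sketch below compares the two on made-up numbers; `null_stats` and `observed` are hypothetical values, not results from the diabetes data.

```r
# Hypothetical null statistics and observed statistic (illustrative only)
null_stats <- c(-3.1, -1.2, 0.4, 2.5, 1.9, -0.6, 0.1, 3.0, -2.2, 0.8)
observed <- 2.5

# Proportion of simulated statistics strictly greater than the observed one
p_strict <- mean(null_stats > observed)

# Counting ties and the observed arrangement itself (the "+1" correction)
p_plus_one <- (sum(null_stats >= observed) + 1) / (length(null_stats) + 1)

p_strict
p_plus_one
```

With 1000 replicates and a continuous outcome the two estimates are usually very close, so the choice rarely changes the conclusion.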
I also run a t-test for comparison
t.test(HDL~sex, data = diabetes, alternative = "greater")

##
## Welch Two Sample t-test
##
## data: HDL by sex
## t = 1.1631, df = 68.33, p-value = 0.1244
## alternative hypothesis: true difference in means between group Female and group Male is greater than 0
## 95 percent confidence interval:
## -1.04406 Inf
## sample estimates:
## mean in group Female mean in group Male
## 46.76591 44.35814

I extract the t-statistic
t.test(HDL~sex, data= diabetes, alternative="greater")$statistic

## t
## 1.163112

So the t-statistic is 1.16
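
To see where the t-statistic and the fractional degrees of freedom come from, the Welch formulas can be evaluated by hand and compared with t.test. This is a sketch on simulated data, not the diabetes data; the helper `welch_t`, the seed, and the group parameters are hypothetical.

```r
# Hypothetical helper: Welch t-statistic and Welch-Satterthwaite df
welch_t <- function(x, y) {
  vx <- var(x) / length(x)   # squared standard error, group x
  vy <- var(y) / length(y)   # squared standard error, group y
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  c(t = t_stat, df = df)
}

set.seed(2)  # arbitrary seed for a reproducible sketch
female <- rnorm(44, mean = 47, sd = 9)    # simulated, NOT the real data
male   <- rnorm(43, mean = 44, sd = 12)

by_hand  <- welch_t(female, male)
built_in <- t.test(female, male, alternative = "greater")

by_hand
c(t = unname(built_in$statistic), df = unname(built_in$parameter))
```

The two printed vectors should agree, since t.test uses the Welch version by default (var.equal = FALSE).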

Step 7: Conclusion
The randomization p-value is 0.125 (the Welch t-test gives a similar value, 0.1244), which
is greater than the significance level of 0.05. Therefore, we do not reject the null hypothesis.
There is insufficient evidence at the 0.05 significance level to conclude that the mean HDL
levels are greater in females than in males.
