Assignment_2--3-

This document outlines the requirements for R Assignment 2, due on May 26, 2024, covering coding questions related to statistical concepts. Students must submit original work, including code and explanations, while using R functions for calculations. The assignment includes tasks such as producing histograms, calculating proportions, and finding confidence intervals based on given distributions.

Uploaded by

raghav.k271205
R Assignment 2

Type your Name, and student number here

Due on 26/05/2024

This assignment covers the R coding questions related to Units 2, 3, and 4. Solutions must be submitted no
later than 11:59 PM CDT on Sunday, May 26th.

Each student must submit their own assignment. You are allowed to discuss the problems among
yourselves, but your submission must reflect your original work. To complete this assignment, add code
as needed into the R code chunks and replace the commented parts with your text explanations to answer
the questions as given below. Do not delete the question text. If you have an issue that you can’t resolve
without someone looking at your work, please ask your professor or a lab TA to look at it. To compile this
document as a pdf to submit or to view your intermediate work/output (I suggest you compile it often as
you are working), click on the knit button above this window. You should read all question text from
the knitted PDF or else the question may not appear sensibly. All numerical calculations must be
done using R functions unless told otherwise. Please make sure you read the question instructions in the
knitted PDF version of this document and not in the R Markdown text.

Setup
Replace 1111111 with your student ID in the code below to generate the dataset you will use for this
assignment. This part is not worth marks, but you will receive a 0 on your assignment if it is not completed
correctly. The code creates a vector x containing the data that you will apply your functions to in this
assignment.
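# The seed (your student ID in place of 1111111) makes the generated population reproducible
set.seed(8004450)
mu <- sample(4:8,1)                    # population mean, drawn from 4, 5, ..., 8
sd <- sample(seq(2.1,2.4,by=0.1),1)    # population standard deviation, drawn from 2.1, ..., 2.4

set.seed(8004450)
x <- rnorm(100000,mu,sd)               # 100,000 draws from the Normal(mu, sd) population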

paste('Your population distribution follows a Normal distribution with')

## [1] "Your population distribution follows a Normal distribution with"


paste('Mean: ',mu,', standard deviation: ', sd)

## [1] "Mean: 5 , standard deviation: 2.2"
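
As a quick sanity check on the setup, the following sketch draws a sample of the same size and confirms that its sample mean and standard deviation land close to the population values printed above. The names `pop_mean`, `pop_sd`, and `draws`, and the seed 2024, are illustrative assumptions and not part of the assignment template.

```r
# Illustrative check (not part of the template): simulated data should
# roughly reproduce the population parameters reported by the setup chunk.
pop_mean <- 5      # mean printed by the setup chunk
pop_sd   <- 2.2    # standard deviation printed by the setup chunk

set.seed(2024)     # arbitrary seed, chosen only for reproducibility
draws <- rnorm(100000, mean = pop_mean, sd = pop_sd)

sample_mean_hat <- mean(draws)
sample_sd_hat   <- sd(draws)
c(mean = sample_mean_hat, sd = sample_sd_hat)
```

With 100,000 draws, the standard error of the sample mean is about 2.2 / sqrt(100000), or roughly 0.007, so the estimates should agree with the population values to about two decimal places.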

Questions
1. Produce a histogram for the x vector with the title “Normal Distribution”. Plot the density curve on
top of the histogram and change the color and size of the curve line as you want. (4 marks)
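#type your answer
# probability = TRUE puts the histogram on the density scale so the curve overlays correctly
hist(x, probability = TRUE, main = "Normal Distribution")
density_X <- density(x)
# col sets the line color and lwd the line width; the question asks to change both
lines(density_X, col = "blue", lwd = 2)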

[Figure: histogram of x titled "Normal Distribution" with the density curve overlaid; x-axis: x (about -5 to 15), y-axis: Density (0.00 to 0.15)]
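
One possible variation, sketched here under assumptions, adds the theoretical normal density next to the kernel density estimate so the two curves can be compared. The data are regenerated with an arbitrary seed, and the names `pop_mean`, `pop_sd`, `draws`, and `h` are hypothetical.

```r
# Sketch: histogram with both the estimated and the theoretical density.
pop_mean <- 5
pop_sd   <- 2.2
set.seed(2024)     # arbitrary seed, for reproducibility only
draws <- rnorm(100000, mean = pop_mean, sd = pop_sd)

h <- hist(draws, probability = TRUE, main = "Normal Distribution", xlab = "x")
lines(density(draws), col = "blue", lwd = 2)            # kernel density estimate
curve(dnorm(x, mean = pop_mean, sd = pop_sd),           # theoretical density
      add = TRUE, col = "red", lwd = 2, lty = 2)

# On the density scale the bar areas add up to 1
total_area <- sum(h$density * diff(h$breaks))
total_area
```

Because `probability = TRUE` rescales the bars to unit total area, both curves sit on the same vertical scale as the histogram.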
2. Check the output of the first R chunk under the Setup section and you will find the population mean
and standard deviation of the above distribution. The population distribution is normally distributed.
Answer the following questions using suitable R functions.

• 2.1 Find the proportion of data that falls below 3.8. (2 marks)
#type your answer
proportion_below_3_8 <- pnorm(3.8,mu,sd)
proportion_below_3_8

## [1] 0.2927205
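
The theoretical proportion from `pnorm` can be cross-checked against the empirical proportion in simulated data. This is a minimal sketch: it regenerates data with an arbitrary seed instead of reusing the assignment's `x`, and `pop_mean`, `pop_sd`, and `draws` are hypothetical names.

```r
# Compare the theoretical CDF value with the share of simulated values below 3.8.
pop_mean <- 5
pop_sd   <- 2.2

theoretical <- pnorm(3.8, mean = pop_mean, sd = pop_sd)

set.seed(2024)     # arbitrary seed, for reproducibility only
draws <- rnorm(100000, mean = pop_mean, sd = pop_sd)
empirical <- mean(draws < 3.8)   # mean of TRUE/FALSE gives a proportion

c(theoretical = theoretical, empirical = empirical)
```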

• 2.2 Find the proportion of data that falls between the 25th percentile and the value 9 (x = 9). (3 marks)
#type your answer
proportion_within_givenRange <- pnorm(9,mu,sd)-pnorm(qnorm(0.25,mu,sd),mu,sd)
proportion_within_givenRange

## [1] 0.7154818
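
Since the CDF evaluated at the 25th percentile is 0.25 by definition, the subtraction above can be shortened. The sketch below shows that both forms agree; `pop_mean`, `pop_sd`, `long_form`, and `short_form` are illustrative names.

```r
# pnorm(qnorm(p)) returns p, so the lower bound contributes exactly 0.25.
pop_mean <- 5
pop_sd   <- 2.2

q25 <- qnorm(0.25, pop_mean, pop_sd)                              # 25th percentile
long_form  <- pnorm(9, pop_mean, pop_sd) - pnorm(q25, pop_mean, pop_sd)
short_form <- pnorm(9, pop_mean, pop_sd) - 0.25

all.equal(long_form, short_form)
```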

• 2.3 Assume a sample of 5 observations is selected from the above population distribution. What is the
probability that the sample mean is more than 8.3? (3 marks)
#type your answer
s_mean <- 8.3
s_size <- 5
# Standardize with the standard error sd/sqrt(n), then take the upper tail
probability <- 1 - pnorm((s_mean - mu)/(sd / sqrt(s_size)))
probability

## [1] 0.0003981151
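
An equivalent way to write this calculation, sketched here, passes the mean and standard error straight to `pnorm` and requests the upper tail with `lower.tail = FALSE`, which avoids standardizing by hand. The names `pop_mean`, `pop_sd`, `n`, and `std_error` are hypothetical.

```r
# The sample mean of n normal observations is Normal(mu, sd / sqrt(n)).
pop_mean <- 5
pop_sd   <- 2.2
n        <- 5

std_error <- pop_sd / sqrt(n)    # standard deviation of the sample mean
p_upper <- pnorm(8.3, mean = pop_mean, sd = std_error, lower.tail = FALSE)
p_upper
```

Because the population itself is normal, this result holds for any sample size, including n = 5.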

3. Consider a sample of 40 observations that we selected from a non-normal distribution with mean 20 and
standard deviation 3.5. Do we have enough information about the distribution of the sample mean?
If so, find the value such that 30% of the sample means are greater than this value. Otherwise,
comment on the reason for not calculating the answer. (2 marks)

type the comment within these asterisk marks


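#type your answer if you have enough information to calculate the answer
# n = 40 is at least 30, so by the central limit theorem the sample mean is approximately normal
mean <- 20
size <- 40
deviation <- 3.5
# 30% of sample means lie above the target value, so it is the 70th percentile (not a confidence level)
z <- qnorm(1-0.30)
sample_mean <- (z * (deviation/sqrt(size))) + mean
sample_mean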

## [1] 20.2902
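
The same cutoff can be obtained in one step by handing `qnorm` the mean and standard error of the sample mean. The sketch below also feeds the cutoff back into `pnorm` to confirm that 30% of sample means lie above it; `pop_mean`, `pop_sd`, `std_error`, `cutoff`, and `tail_prob` are illustrative names.

```r
# Approximate sampling distribution of the mean: Normal(20, 3.5 / sqrt(40)).
pop_mean <- 20
pop_sd   <- 3.5
n        <- 40

std_error <- pop_sd / sqrt(n)
cutoff <- qnorm(0.30, mean = pop_mean, sd = std_error, lower.tail = FALSE)

# Round trip: the upper-tail probability at the cutoff should be 0.30
tail_prob <- pnorm(cutoff, mean = pop_mean, sd = std_error, lower.tail = FALSE)
c(cutoff = cutoff, tail_prob = tail_prob)
```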

4. Consider the above non-normal distribution in Q3 and a sample of 10 observations. Do we have
enough information about the distribution of the sample mean? If so, find the probability that the sample
mean falls between 15 and 21. Otherwise, comment on the reason for not calculating the answer. (2 marks)

type the comment within these asterisk marks


#type your answer if you have enough information to calculate the answer
#No, we cannot find the probability that the sample mean lies between 15 and 21.
#The population is not normal and the sample size is only 10, so the central limit theorem,
#which by the usual rule of thumb needs a sample size of at least 30, does not apply.
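
To illustrate why sample size matters here, the following sketch simulates sample means from one possible non-normal population. The assignment does not say what the population looks like, so the shifted exponential used here (mean 20, standard deviation 3.5) is purely an assumption, as are the helper names `draw_means` and `skewness`.

```r
# Hypothetical population: 16.5 + Exponential(rate = 1/3.5), giving mean 20, sd 3.5.
# The assignment only says "non-normal"; this shape is an assumption.
shift <- 20 - 3.5

# Simulate `reps` sample means, each from a sample of size n
draw_means <- function(n, reps = 20000) {
  samples <- matrix(shift + rexp(reps * n, rate = 1 / 3.5), nrow = reps)
  rowMeans(samples)
}

# Sample skewness: 0 for a symmetric distribution such as the normal
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

set.seed(2024)     # arbitrary seed, for reproducibility only
skew_n10 <- skewness(draw_means(10))
skew_n40 <- skewness(draw_means(40))
c(n10 = skew_n10, n40 = skew_n40)
```

For this assumed population, the sample means at n = 10 stay noticeably more skewed than at n = 40, which is the practical reason the normal approximation is not trusted for small samples. The threshold of 30 is a rule of thumb and not a sharp boundary.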

5. A company manufactures light bulbs that have a lifespan uniformly distributed between 800 and 1200
hours. Provide the R code and print the results for the following questions.

• 5.1 Find the probability that a randomly chosen light bulb lasts at most 900 hours. Provide the R code
and the result. (2 marks)

#type your answer
probability<- punif(900,800,1200)
paste('Probability of getting a bulb which lasts at most 900 hours:' , probability)

## [1] "Probability of getting a bulb which lasts at most 900 hours: 0.25"

• 5.2 Find the lifespan such that 95% of the light bulbs last longer than this time. (2 marks)
#type your answer
qunif(0.05,800,1200)

## [1] 820
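
Both uniform results can be checked by hand, because the uniform CDF on [a, b] is (x - a) / (b - a). The sketch below reproduces the answers to 5.1 and 5.2 from that formula; the variable names are illustrative.

```r
# Uniform(800, 1200): the CDF is linear between the endpoints.
lower <- 800
upper <- 1200

# 5.1: P(X <= 900) by formula and by punif
p_formula <- (900 - lower) / (upper - lower)
p_punif   <- punif(900, min = lower, max = upper)

# 5.2: the 5th percentile leaves 95% of lifespans above it
cutoff_formula <- lower + 0.05 * (upper - lower)
cutoff_qunif   <- qunif(0.05, min = lower, max = upper)
share_above    <- punif(cutoff_qunif, min = lower, max = upper, lower.tail = FALSE)

c(p = p_punif, cutoff = cutoff_qunif, share_above = share_above)
```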

• 5.3 The company claims that 60% of their light bulbs last more than 1000 hours. If you randomly select
100 light bulbs, what is the probability that at least 65 of them will last more than 1000 hours? (3
marks)
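#type your answer
# Normal approximation to the sample proportion, using p = 0.60 from the company's claim:
# "at least 65 of 100" is treated as p-hat >= 0.65 (no continuity correction)
1 - pnorm(0.65 , 0.60 , sqrt((0.6 * 0.4)/100))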

## [1] 0.1537171
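
The answer above relies on a normal approximation to the sample proportion. As a point of comparison, and not as the intended solution, the sketch below computes the exact binomial probability and a continuity-corrected normal approximation, assuming the count of qualifying bulbs follows Binomial(100, 0.60). All variable names are hypothetical.

```r
# X = number of bulbs (out of 100) lasting more than 1000 hours,
# assumed Binomial(n = 100, p = 0.60) under the company's claim.
n <- 100
p <- 0.60

# Exact: P(X >= 65) = P(X > 64)
exact <- pbinom(64, size = n, prob = p, lower.tail = FALSE)

# Normal approximation on the proportion scale, without continuity correction
approx_plain <- pnorm(0.65, mean = p, sd = sqrt(p * (1 - p) / n), lower.tail = FALSE)

# Normal approximation on the count scale, with continuity correction (64.5)
approx_cc <- pnorm(64.5, mean = n * p, sd = sqrt(n * p * (1 - p)), lower.tail = FALSE)

c(exact = exact, approx_plain = approx_plain, approx_cc = approx_cc)
```

The continuity correction accounts for the count being discrete, so it usually lands closer to the exact binomial value than the uncorrected approximation does.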

6. You are researching the average time (in minutes) it takes to commute to work in a city. The commute
time distribution is known to follow a normal distribution with a standard deviation of 8 minutes. You
collected a sample of 35 commuters and found that the sample mean is 42 minutes.

Calculate the 95% confidence interval for the true mean commute time for all commuters in the city. Print
the lower limit and upper limit. (2 marks)
#type your answer
sample_mean <- 42
size <- 35
deviation <- 8
confidence_level <- 0.95
z <- qnorm(1-((1-confidence_level)/2))

margin_ofError <- (z * deviation)/sqrt(size)

lower_lim <- sample_mean - margin_ofError
lower_lim

## [1] 39.34964

upper_lim <- sample_mean + margin_ofError
upper_lim

## [1] 44.65036
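
The interval calculation can be wrapped in a small reusable function. The helper `z_interval` below is a hypothetical name introduced for illustration; it assumes the population standard deviation is known, as stated in the question.

```r
# z-based confidence interval for a mean with known population sd.
# xbar: sample mean, sigma: known population sd, n: sample size.
z_interval <- function(xbar, sigma, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)     # two-sided critical value
  margin <- z * sigma / sqrt(n)       # margin of error
  c(lower = xbar - margin, upper = xbar + margin)
}

ci <- z_interval(xbar = 42, sigma = 8, n = 35)
ci
```

Changing `level`, for example to 0.99, widens the interval because the critical value grows.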
