
Week-7_GA_Solution_1

The document contains solutions to various statistical problems related to data science, including the application of Chebyshev's inequality, empirical distributions, sample means, variances, and the weak law of large numbers. It covers topics such as calculating sample sizes, variances of linear combinations of random variables, and bounds on probabilities. Each problem is presented with a detailed solution and relevant calculations.

Uploaded by

bjoshita05


Statistics for Data Science - 2

Graded Assignment Solution - Sept 2024

Week 7

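1. Suppose $X_1, X_2, \ldots, X_n$ are $n$ i.i.d. random variables with mean $\mu$ and variance $\sigma^2 = 16$.
Using Chebyshev’s inequality, find the minimum value of $n$ such that

\[ P(|\bar{X} - \mu| < 1) > 0.90. \]

Answer: 160
Solution:
Note that $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$, so that
\[ E[\bar{X}] = \mu, \qquad \mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n}. \]
By Chebyshev’s inequality, we have
\[ P(|\bar{X} - \mu| < \delta) \ge 1 - \frac{\sigma^2}{n\delta^2}. \]
The requirement $P(|\bar{X} - \mu| < 1) > 0.90$ is guaranteed when this lower bound, with $\delta = 1$, exceeds 0.90:
\[
\begin{aligned}
1 - \frac{\sigma^2}{n\delta^2} > 0.90
&\Rightarrow 1 - \frac{16}{n} > 0.90 \\
&\Rightarrow 1 - 0.90 > \frac{16}{n} \\
&\Rightarrow 0.10 > \frac{16}{n} \\
&\Rightarrow n > \frac{16}{0.10} \\
&\Rightarrow n > 160.
\end{aligned}
\]

Hence the threshold is $n = 160$, where the Chebyshev bound equals exactly 0.90; this is the value taken as the minimum $n$.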
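The threshold can be checked with a small Python sketch. The function name `chebyshev_threshold_n` is an illustrative choice, not part of the assignment; it solves $1 - \sigma^2/(n\delta^2) = c$ for $n$ using exact fractions.

```python
from fractions import Fraction

def chebyshev_threshold_n(variance, delta, confidence):
    """Value of n at which the bound 1 - variance / (n * delta**2) equals `confidence`."""
    return Fraction(variance) / (Fraction(delta) ** 2 * (1 - Fraction(confidence)))

# Problem 1: sigma^2 = 16, delta = 1, required probability 0.90
n_star = chebyshev_threshold_n(16, 1, Fraction(9, 10))
print(n_star)  # 160

# The Chebyshev lower bound evaluated at the threshold
bound_at_threshold = 1 - Fraction(16) / (n_star * 1 ** 2)
print(bound_at_threshold)  # 9/10
```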

2. Consider a sample 1, 0, 1, 0, 1, 1, 0, 1, 1, 1 from the Bernoulli(0.6) distribution.

(a) Compute the empirical distribution of the sample.
A. $p(0) = 0.4$, $p(1) = 0.6$
B. $p(0) = 0.3$, $p(1) = 0.7$
C. $p(0) = 0.5$, $p(1) = 0.5$
D. $p(0) = 0.7$, $p(1) = 0.3$

Answer: B

Solution:
The empirical distribution is the discrete distribution with PMF
\[ p(t) = \frac{\#(X_i = t)}{n}, \]
where $\#(X_i = t)$ denotes the number of times $t$ occurs in the sample.
Therefore,
\[ p(0) = \frac{3}{10} = 0.3 \]
and
\[ p(1) = \frac{7}{10} = 0.7. \]
Hence, option (B) is correct.

(b) Compute the sample mean. Enter the answer correct to one decimal place.
Answer: 0.7
Solution:
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_{10}}{n} = \frac{1+0+1+0+1+1+0+1+1+1}{10} = \frac{7}{10} = 0.7 \]
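Both parts of Problem 2 can be reproduced by counting. The following is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

sample = [1, 0, 1, 0, 1, 1, 0, 1, 1, 1]
n = len(sample)

# Empirical PMF: relative frequency of each observed value
counts = Counter(sample)
empirical_pmf = {t: Fraction(c, n) for t, c in sorted(counts.items())}

# Sample mean: sum of the observations divided by n
sample_mean = Fraction(sum(sample), n)

print({t: float(p) for t, p in empirical_pmf.items()})  # {0: 0.3, 1: 0.7}
print(float(sample_mean))  # 0.7
```

For a 0/1 sample the sample mean coincides with the empirical probability $p(1)$.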

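3. Let $X_1, X_2, X_3$ be three independent and identically distributed random variables with
mean $\mu$ and variance $\sigma^2$. Given below are 3 different formulations of the sample mean.
(Observe that $E[A] = E[B] = E[C]$.)

\[ A = \frac{X_1 + X_2 + X_3}{3} \]

\[ B = 0.1X_1 + 0.3X_2 + 0.6X_3 \]

\[ C = 0.2X_1 + 0.3X_2 + 0.5X_3 \]

Choose the correct option from the following:

(a) $\mathrm{Var}(A) = \mathrm{Var}(B) = \mathrm{Var}(C)$
(b) $\mathrm{Var}(A) \ge \mathrm{Var}(B) \ge \mathrm{Var}(C)$
(c) $\mathrm{Var}(A) \le \mathrm{Var}(B) \le \mathrm{Var}(C)$
(d) $\mathrm{Var}(A) \le \mathrm{Var}(C) \le \mathrm{Var}(B)$

Solution:
Let $X_1, X_2, X_3 \sim$ i.i.d. $X$, where $E[X] = \mu$, $\mathrm{Var}(X) = \sigma^2$. Since the $X_i$ are independent, the variance of a weighted sum is the sum of the squared weights times the individual variances.

\[
\begin{aligned}
\mathrm{Var}(A) &= \mathrm{Var}\left(\frac{X_1 + X_2 + X_3}{3}\right) \\
&= \frac{1}{9}\left(\mathrm{Var}[X_1] + \mathrm{Var}[X_2] + \mathrm{Var}[X_3]\right) \\
&= \frac{1}{9}(3\sigma^2) = \frac{\sigma^2}{3}
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{Var}(B) &= \mathrm{Var}(0.1X_1 + 0.3X_2 + 0.6X_3) \\
&= 0.01\,\mathrm{Var}[X_1] + 0.09\,\mathrm{Var}[X_2] + 0.36\,\mathrm{Var}[X_3] \\
&= 0.46\sigma^2
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{Var}(C) &= \mathrm{Var}(0.2X_1 + 0.3X_2 + 0.5X_3) \\
&= 0.04\,\mathrm{Var}[X_1] + 0.09\,\mathrm{Var}[X_2] + 0.25\,\mathrm{Var}[X_3] \\
&= 0.38\sigma^2
\end{aligned}
\]

Since $\sigma^2/3 \approx 0.33\sigma^2$, we get $\mathrm{Var}(B) \ge \mathrm{Var}(C) \ge \mathrm{Var}(A)$. Hence, option (d) is correct.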
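The ordering can be verified by computing the coefficient of $\sigma^2$ for each estimator, which is the sum of the squared weights. This is a sketch with illustrative names; weights are passed as strings or fractions so that the arithmetic is exact.

```python
from fractions import Fraction

def variance_coefficient(weights):
    """Coefficient of sigma^2 in Var(sum w_i X_i) for i.i.d. X_i: the sum of squared weights."""
    return sum(Fraction(w) ** 2 for w in weights)

estimators = {
    "A": [Fraction(1, 3)] * 3,
    "B": ["0.1", "0.3", "0.6"],
    "C": ["0.2", "0.3", "0.5"],
}
coefficients = {name: variance_coefficient(w) for name, w in estimators.items()}

# Sort estimator names from smallest to largest variance
ordering = sorted(coefficients, key=coefficients.get)
print(coefficients["A"], coefficients["B"], coefficients["C"])  # 1/3 23/50 19/50
print(ordering)  # ['A', 'C', 'B']
```

All three weight vectors sum to 1, which is why the three estimators share the same expectation while differing in variance.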

4. A random sample of size 25 is collected from a normal population with a mean of 50
and a standard deviation of 5. Find the variance of the sample mean.
Solution:
We know that the variance of the sample mean $\bar{X}$ is given by

\[ \mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{5^2}{25} = 1. \]

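5. A fair die is rolled 100 times. Let $X$ denote the number of times six is obtained. Find
a bound for the probability that $\frac{X}{100}$ differs from $\frac{1}{6}$ by less than 0.1 using the weak law
of large numbers.

(a) at least $\frac{5}{36}$
(b) at least $\frac{31}{36}$
(c) at most $\frac{5}{36}$
(d) at most $\frac{31}{36}$

Solution:
$X$ denotes the number of times six is obtained on rolling a fair die 100 times. Let
$X_1, X_2, \ldots, X_{100}$ be 100 i.i.d. samples such that
\[
X_i =
\begin{cases}
1 & \text{if six appears on rolling a fair die} \\
0 & \text{otherwise}
\end{cases}
\]

\[ E[X_i] = \mu = \frac{1}{6} \quad \text{and} \quad \mathrm{Var}(X_i) = \sigma^2 = \frac{5}{36} \]
Notice that $X = X_1 + X_2 + X_3 + \cdots + X_{100}$, so $\bar{X} = \frac{X}{100}$.

To find: a bound on $P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right)$. By the weak law of large numbers, we have
\[ P(|\bar{X} - \mu| < \delta) \ge 1 - \frac{\sigma^2}{n\delta^2} \]

\[ \Rightarrow P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right) \ge 1 - \frac{5/36}{100 \times 0.01} \]

\[ \Rightarrow P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right) \ge 1 - \frac{5}{36} = \frac{31}{36} \]

Hence, option (b) is correct.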
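The bound $1 - \sigma^2/(n\delta^2)$ can be evaluated exactly for Problem 5. The helper name `wlln_lower_bound` below is a hypothetical choice for illustration.

```python
from fractions import Fraction

def wlln_lower_bound(variance, n, delta):
    """Lower bound 1 - variance / (n * delta**2) on P(|sample mean - mu| < delta)."""
    return 1 - Fraction(variance) / (n * Fraction(delta) ** 2)

p = Fraction(1, 6)               # probability of a six on a fair die
bernoulli_variance = p * (1 - p) # 5/36
bound = wlln_lower_bound(bernoulli_variance, 100, Fraction(1, 10))
print(bernoulli_variance, bound)  # 5/36 31/36
```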

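6. Let $X_1, X_2, \ldots, X_5$ be i.i.d. samples whose distribution has a mean of 20 and variance
of 4. Suppose the sample variance is defined as

\[ S^2 = \frac{(X_1 - \bar{X})^2 + \cdots + (X_5 - \bar{X})^2}{5} \]

where $\bar{X} = \frac{X_1 + X_2 + \cdots + X_5}{5}$. Find the expected value of $S^2$.

Solution:

\[ E[\bar{X}] = \mu = 20 \quad \text{and} \quad \mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{4}{5} = 0.8. \]

\[
\begin{aligned}
E[S^2] &= E\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\right] \\
&= \frac{1}{n} E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] \\
&= \frac{1}{n} E\left[\sum_{i=1}^{n}\left(X_i^2 + \bar{X}^2 - 2X_i\bar{X}\right)\right] \\
&= \frac{1}{n} E\left[\sum_{i=1}^{n}X_i^2 + n\bar{X}^2 - 2n\bar{X}\bar{X}\right] \\
&= \frac{1}{n} E\left[\sum_{i=1}^{n}X_i^2 + n\bar{X}^2 - 2n\bar{X}^2\right] \\
&= \frac{1}{n} E\left[\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right] \\
&= \frac{1}{n}\left[\sum_{i=1}^{n}E[X_i^2] - nE[\bar{X}^2]\right] \\
&= \frac{1}{n}\left[\sum_{i=1}^{n}(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right] \\
&= \frac{1}{n}\left[(n\sigma^2 + n\mu^2) - (\sigma^2 + n\mu^2)\right] \\
&= \frac{(n-1)\sigma^2}{n}
\end{aligned}
\]

Here, $n = 5$, therefore, $E[S^2] = \frac{4}{5} \times 4 = 3.2$.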
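The result $E[S^2] = (n-1)\sigma^2/n$ does not depend on the shape of the distribution, so it can be checked by exact enumeration on any distribution with mean 20 and variance 4. The sketch below assumes, purely for illustration, a two-point distribution taking the values 18 and 22 with probability 1/2 each; this choice is not part of the problem.

```python
from fractions import Fraction
from itertools import product

# Assumed illustrative distribution: 18 or 22 with equal probability (mean 20, variance 4)
values = [18, 22]
n = 5

def biased_sample_variance(xs):
    """S^2 as defined in the problem: squared deviations from the sample mean, divided by n."""
    m = Fraction(sum(xs), len(xs))
    return sum((x - m) ** 2 for x in xs) / len(xs)

# All 2**5 samples are equally likely, so E[S^2] is the plain average over them
samples = list(product(values, repeat=n))
expected_s2 = sum(biased_sample_variance(s) for s in samples) / len(samples)
print(expected_s2, float(expected_s2))  # 16/5 3.2
```

Dividing by $n - 1$ instead of $n$ in `biased_sample_variance` would give an expectation equal to $\sigma^2 = 4$.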
7. Suppose $X_i \sim \text{Normal}\left(0, \frac{1}{i^2}\right)$, where $i = 1, 2, \ldots, 9$ and $X_1, X_2, \ldots, X_9$ are independent
of each other. Let $Y$ be a random variable defined as $Y = \sum_{i=1}^{9} iX_i$. Find the variance
of $Y$.
Solution:

\[
\begin{aligned}
\mathrm{Var}(Y) &= \mathrm{Var}\left(\sum_{i=1}^{9} iX_i\right) \\
&= \mathrm{Var}(X_1 + 2X_2 + 3X_3 + \cdots + 9X_9) \\
&= \mathrm{Var}(X_1) + \mathrm{Var}(2X_2) + \cdots + \mathrm{Var}(9X_9) \\
&= \mathrm{Var}(X_1) + 4\,\mathrm{Var}(X_2) + \cdots + 81\,\mathrm{Var}(X_9) \\
&= \frac{1}{1^2} + 4 \cdot \frac{1}{2^2} + \cdots + 81 \cdot \frac{1}{9^2} \\
&= 9
\end{aligned}
\]
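Each term $i^2 \cdot \frac{1}{i^2}$ equals 1, so the sum has nine unit terms. A short sketch with exact fractions confirms this:

```python
from fractions import Fraction

# Var(i * X_i) = i^2 * Var(X_i) = i^2 * (1 / i^2) for i = 1, ..., 9
terms = [Fraction(i) ** 2 * Fraction(1, i ** 2) for i in range(1, 10)]
var_y = sum(terms)
print(var_y)  # 9
```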

8. A random sample of size 50 is collected from a population $P$, where $P \sim \text{Uniform}[0, 12]$.
Find a lower bound on the probability that the sample mean will be at most 3 units
away from the actual mean using the weak law of large numbers.

Solution:

\[ P \sim \text{Uniform}[0, 12] \]
\[ E[P] = \mu = \frac{0 + 12}{2} = 6, \qquad \mathrm{Var}(P) = \sigma^2 = \frac{(12 - 0)^2}{12} = \frac{144}{12} = 12 \]

By the weak law of large numbers, we have

\[ P(|\bar{X} - \mu| < \delta) \ge 1 - \frac{\sigma^2}{n\delta^2} \]
\[ P(|\bar{X} - \mu| < 3) \ge 1 - \frac{12}{50 \times 9} = \frac{73}{75} \approx 0.9733 \]
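The bound is a guarantee, not an estimate, so the true probability is typically larger. The following seeded simulation is only an illustrative sketch (the seed and the number of trials are arbitrary choices); it compares the observed frequency with the bound $73/75$.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 50, 2000
mu, delta = 6.0, 3.0
bound = 1 - 12 / (n * delta ** 2)   # 73/75

hits = 0
for _ in range(trials):
    # Draw one sample of size n from Uniform[0, 12] and take its mean
    sample_mean = sum(rng.uniform(0, 12) for _ in range(n)) / n
    if abs(sample_mean - mu) < delta:
        hits += 1

empirical = hits / trials
print(round(bound, 4))  # 0.9733
print(empirical)
```

Since the standard deviation of the sample mean is $\sqrt{12/50} \approx 0.49$, a deviation of 3 is about six standard deviations, and the observed frequency should sit well above the bound.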

9. Suppose a random sample is used to estimate the proportion of voters in a city. If the
sample proportion is roughly 0.45, what sample size is necessary so that the standard
deviation of the sample proportion is 0.02?
Solution:
Let the random variable $X$ indicate whether a selected person is a voter.
Let $X_i$ be defined as
\[
X_i =
\begin{cases}
1, & \text{if the selected person is a voter} \\
0, & \text{otherwise}
\end{cases}
\]

Define an event $A$ as $A : X = 1$.
It is given that $P(A) = 0.45$.
We know that $\mathrm{Var}(S(A)) = \dfrac{P(A)(1 - P(A))}{n}$, where $S(A)$ is the sample proportion of $A$. Setting the standard deviation equal to 0.02 with $p = P(A)$:

\[ \sqrt{\frac{p(1 - p)}{n}} = 0.02 \]
\[ \sqrt{\frac{(0.45)(0.55)}{n}} = 0.02 \implies n = \frac{0.2475}{0.0004} = 618.75 \approx 619 \]
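Solving $\sqrt{p(1-p)/n} = s$ for $n$ gives $n = p(1-p)/s^2$, rounded up to a whole number of respondents. A minimal sketch with exact arithmetic (variable names are illustrative):

```python
import math
from fractions import Fraction

p = Fraction(45, 100)         # sample proportion
target_sd = Fraction(2, 100)  # required standard deviation of the sample proportion

# n = p(1 - p) / sd^2, then round up to the next integer
n_exact = p * (1 - p) / target_sd ** 2
n_required = math.ceil(n_exact)
print(float(n_exact), n_required)  # 618.75 619
```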

10. The average life (in years) of an electronic watch follows an exponential distribution with
parameter $\frac{1}{2}$. Find the lower bound on the probability that the mean life of a random
sample of 50 such watches falls between 1 and 3 years. Enter your answer correct to two
decimals.
Hint: Use the weak law of large numbers.
Solution:
Let the random variable $X$ represent the life of an electronic watch.
It is given that $X \sim \text{Exp}(1/2)$ and 50 such samples are taken.
\[ E[X] = \mu = 2, \qquad \mathrm{Var}(X) = \sigma^2 = 4 \]
To find: a lower bound on $P(1 < \bar{X} < 3)$.
By the weak law of large numbers, we have

\[ P(|\bar{X} - \mu| < \delta) \ge 1 - \frac{\sigma^2}{n\delta^2} \]
\[ P(|\bar{X} - 2| < 1) \ge 1 - \frac{4}{50 \times 1} = \frac{23}{25} = 0.92 \]
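For an exponential distribution with rate $\lambda$, the mean is $1/\lambda$ and the variance is $1/\lambda^2$, which is where $\mu = 2$ and $\sigma^2 = 4$ come from. The sketch below recomputes these and the resulting bound; the variable names are illustrative.

```python
from fractions import Fraction

rate = Fraction(1, 2)      # parameter of the exponential distribution
mu = 1 / rate              # mean = 1 / rate
sigma2 = 1 / rate ** 2     # variance = 1 / rate^2

n, delta = 50, 1           # sample size; half-width of the interval (1, 3) around mu
bound = 1 - sigma2 / (n * delta ** 2)
print(mu, sigma2, bound, float(bound))  # 2 4 23/25 0.92
```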

11. A university evaluates the final scores of students based on coursework and a final project.
The variance of the coursework scores is 15, and the variance of the final project scores
is 30. Coursework contributes 70% to the final evaluation, while the project contributes
30%. Assuming the scores of coursework and project are uncorrelated, what is the
variance of the final evaluation scores? Enter the answer correct to two decimal places.
Answer : 10.05 ; Range : 10.02 to 10.08
Solution
To compute the variance of the final evaluation scores, we use the formula for the variance
of a linear combination of uncorrelated random variables:

\[ \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y), \]

where $X$ and $Y$ are uncorrelated random variables, and $a$ and $b$ are the weights of $X$
and $Y$, respectively.

In this case:

$a = 0.7$ (weight of coursework),

$b = 0.3$ (weight of the project),
$\mathrm{Var}(X) = 15$ (variance of coursework scores),
$\mathrm{Var}(Y) = 30$ (variance of project scores).

Substitute the values into the formula:

\[ \mathrm{Var}(\text{Final Evaluation}) = (0.7)^2 \cdot 15 + (0.3)^2 \cdot 30 \]

\[ \mathrm{Var}(\text{Final Evaluation}) = 0.49 \cdot 15 + 0.09 \cdot 30 \]

\[ \mathrm{Var}(\text{Final Evaluation}) = 7.35 + 2.7 = 10.05 \]

Therefore, the variance of the final evaluation scores is 10.05.
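The same computation extends to any number of uncorrelated components. The function name `variance_of_weighted_sum` below is a hypothetical helper for illustration.

```python
def variance_of_weighted_sum(weights, variances):
    """Variance of sum(w_i * X_i) for uncorrelated X_i: sum of w_i^2 * Var(X_i)."""
    return sum(w ** 2 * v for w, v in zip(weights, variances))

# Coursework: weight 0.7, variance 15; project: weight 0.3, variance 30
final_variance = variance_of_weighted_sum([0.7, 0.3], [15, 30])
print(round(final_variance, 2))  # 10.05
```

If the two scores were correlated, an extra term $2ab\,\mathrm{Cov}(X, Y)$ would have to be added, which this sketch deliberately omits.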

12. In a large population of students, an unknown proportion p prefers Chips over Cookies.
A survey was conducted among 200 students and 120 preferred Chips. Assume the 200
samples are i.i.d. Bernoulli (p).
Based on the given information, answer the following questions:
(i) What is the sample mean? Enter the answer correct to one decimal place.
Answer: 0.6
Solution:
To determine the sample mean, we use the formula
\[ \bar{X} = \frac{\text{Number of successes}}{\text{Total number of samples}}. \]
Given:

Number of successes = 120 (students who preferred Chips),

Total number of samples = 200 (total students surveyed).

Substituting the values into the formula:

\[ \bar{X} = \frac{120}{200} = 0.6. \]
Thus, the sample mean is 0.6.

(ii) What is the variance of the sample mean? Assume $p$ is equal to the sample mean.
(a) 0.24
(b) 0.36
(c) 0.0018
(d) 0.0012
Answer: (d)
Solution:
The variance of the sample mean is given by the formula

\[ \mathrm{Var}(\bar{X}) = \frac{p(1 - p)}{n}, \]
where:

$p$ = proportion of successes (equal to the sample mean),

$n$ = total number of samples.

From the given data:

$p = 0.6$ (calculated sample mean),

$n = 200$ (total number of students surveyed).

Substituting the values into the formula:

\[ \mathrm{Var}(\bar{X}) = \frac{0.6(1 - 0.6)}{200} = \frac{0.6 \cdot 0.4}{200} = \frac{0.24}{200} = 0.0012. \]
Thus, the variance of the sample mean is 0.0012.

(iii) If 30 more students join the survey and they all prefer Cookies, what will be the
new sample mean and variance of the sample mean?
(a) new sample mean = 0.24, new variance = 0.0012
(b) new sample mean = 0.6, new variance = 0.0012
(c) new sample mean = 0.6522, new variance = 0.0009
(d) new sample mean = 0.5217, new variance = 0.00108
Answer: (d)

New Sample Mean

The formula for the sample mean is:
\[ \bar{X} = \frac{\text{Number of successes}}{\text{Total number of samples}}. \]

From the given data:

Original number of successes = 120,

Original total number of samples = 200.

If 30 more students join and they all prefer Cookies, the updated values are:

New number of successes = 120 (unchanged, as no additional students prefer Chips),

New total number of samples = 200 + 30 = 230.

The new sample mean is:

\[ \bar{X} = \frac{120}{230} \approx 0.5217 \text{ (rounded to 4 decimal places)}. \]

Thus, the new sample mean is 0.5217.

New Variance of the Sample Mean

The formula for the variance of the sample mean is:
\[ \mathrm{Var}(\bar{X}) = \frac{p(1 - p)}{n}, \]
where:

$p$ = new sample mean,

$n$ = new total number of samples.

Substituting $p = 0.52174$ and $n = 230$ into the formula:

\[ \mathrm{Var}(\bar{X}) = \frac{0.52174(1 - 0.52174)}{230} = \frac{0.52174 \cdot 0.47826}{230} = \frac{0.249527}{230} \approx 0.00108 \text{ (rounded to 5 decimal places)}. \]

Thus, the new variance of the sample mean is 0.00108.
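All three parts of Problem 12 follow from the counts of successes and samples. The sketch below uses a hypothetical helper, `sample_mean_and_variance`, that plugs the sample mean in for $p$ as the problem instructs.

```python
from fractions import Fraction

def sample_mean_and_variance(successes, n):
    """Sample mean of Bernoulli data and the plug-in variance p(1 - p) / n of that mean."""
    p_hat = Fraction(successes, n)
    return p_hat, p_hat * (1 - p_hat) / n

# Original survey: 120 of 200 prefer Chips
mean_before, var_before = sample_mean_and_variance(120, 200)
# After 30 more students join, all preferring Cookies: still 120 successes, now 230 samples
mean_after, var_after = sample_mean_and_variance(120, 200 + 30)

print(float(mean_before), float(var_before))                    # 0.6 0.0012
print(round(float(mean_after), 4), round(float(var_after), 5))  # 0.5217 0.00108
```

Working with exact fractions avoids the rounding of $p$ to 0.52174 before it is squared, so the intermediate product $p(1-p) = 13200/52900 \approx 0.249527$ is not affected by rounding.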
