확통1 Lecture Note 06: Limit Theorems

Limit Theorems

Limit Theorems: Motivation


• X_1, ⋯, X_n are i.i.d. random variables. Let

      M_n = (X_1 + ⋯ + X_n) / n.

  What happens to M_n as n → ∞?

• A tool: Several inequalities in probability

• Convergence “in probability”

• Convergence “with probability 1”


Markov Inequality
• For a nonnegative random variable X,

      P(X ≥ a) ≤ E[X] / a   for all a > 0.

• Why?: Let Y_a = 0 if X < a, and Y_a = a if X ≥ a.
  (Figure: density f_X(x), and the two-point PMF of Y_a with mass P(Y_a = 0) at 0 and P(Y_a = a) at a.)
  Then Y_a ≤ X, so E[Y_a] ≤ E[X].
  On the other hand, E[Y_a] = a·P(Y_a = a) = a·P(X ≥ a),
  from which we get the result.
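As a quick sanity check (not part of the original slides), here is a minimal Python sketch that estimates P(X ≥ a) by Monte Carlo for an Exp(1) variable, an illustrative choice of distribution, and compares it with the Markov bound E[X]/a.

import random

# Minimal numerical check of the Markov inequality P(X >= a) <= E[X]/a,
# using X ~ Exp(1) (an illustrative choice, not from the slides).
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]  # E[X] = 1 for Exp(1)
mean_x = sum(samples) / n

for a in [1.0, 2.0, 4.0]:
    tail = sum(x >= a for x in samples) / n   # Monte Carlo estimate of P(X >= a)
    print(f"a={a}: P(X>=a) ~ {tail:.4f}  <=  Markov bound {mean_x / a:.4f}")
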
Generalized Markov Inequality
• We now have, for a nonnegative random variable X,

      P(X ≥ a) ≤ E[X] / a   for all a > 0.

• Next, we can generalize the Markov inequality: we can substitute any positive,
  non-decreasing function f: ℝ → ℝ+ to get

      P(X ≥ a) ≤ P(f(X) ≥ f(a)) ≤ E[f(X)] / f(a).

⚫ If we pick f judiciously, we can obtain better bounds.
Chebyshev Inequality
• For a random variable X with mean E[X] and variance σ_X²,

      P(|X − E[X]| ≥ c) ≤ σ_X² / c²   for all c > 0.

• Why?: As a first application of the generalized Markov bound, apply it to the
  nonnegative random variable |X − E[X]| with f(x) = x². Then

      P(|X − E[X]| ≥ c) = P((X − E[X])² ≥ c²) ≤ E[(X − E[X])²] / c² = σ_X² / c².

⚫ For c = kσ_X,

      P(|X − E[X]| ≥ kσ_X) ≤ 1/k².
Example: Chebyshev bound is conservative
• The Chebyshev bound is more powerful than the Markov bound because it also uses
  the variance. But since the mean and variance are only a rough summary of a
  distribution, we cannot expect the bound to be a close approximation of the exact value.
• If X ~ U[0, 4], then E[X] = 2 and σ_X² = (4 − 0)²/12 = 4/3, and for c = 1,

      P(|X − 2| ≥ 1) ≤ 4/3,

  which is uninformative compared to the exact value 1/2.
• Let X ~ Exp(λ = 1), so that E[X] = 1 and σ_X² = 1. For c > 1,

      P(X ≥ c) = P(X − 1 ≥ c − 1) ≤ P(|X − 1| ≥ c − 1) ≤ 1/(c − 1)²,

  which is again conservative compared to the exact value P(X > c) = e^{−c}.
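To see how loose these bounds are numerically, the following sketch (my own tabulation) evaluates the Markov and Chebyshev bounds against the exact tail P(X > c) = e^{−c} for X ~ Exp(1).

import math

# E[X] = 1 and Var(X) = 1 for X ~ Exp(1), as on the slide.
for c in [2, 3, 4, 5]:
    exact = math.exp(-c)                # P(X > c)
    markov = 1.0 / c                    # E[X]/c
    chebyshev = 1.0 / (c - 1) ** 2      # P(|X - 1| >= c - 1) <= sigma^2/(c-1)^2
    print(f"c={c}: exact={exact:.4f}  Markov={markov:.4f}  Chebyshev={chebyshev:.4f}")
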
Example: Upper bound of Chebyshev Ineq.
• If X takes values in [a, b], we claim the conservative bound σ_X² ≤ (b − a)²/4.
  If σ_X² is unknown, we may use σ_X² = (b − a)²/4 and claim

      P(|X − E[X]| ≥ c) ≤ (b − a)² / (4c²).

• Why?: For any constant γ, we have

      E[(X − γ)²] = E[X²] − 2γE[X] + γ²,

  and this is minimized when γ = E[X]. Thus

      E[(X − γ)²] ≥ E[(X − E[X])²] = σ_X²   for all γ.

  By setting γ = (a + b)/2, we have

      σ_X² ≤ E[(X − (a + b)/2)²] = E[(X − a)(X − b)] + (b − a)²/4 ≤ (b − a)²/4,

  where the last inequality follows from (x − a)(x − b) ≤ 0 for all x in the range [a, b].
Chernoff Bound (1)
• Chernoff bounds are typically (but not always) tighter than the Markov and
  Chebyshev bounds, but they require stronger assumptions. Let X be a sum of n
  independent Bernoulli random variables {X_i}, X = Σ_i X_i, with E[X_i] = p_i.
  Let μ = E[X]. Then we have

      μ = E[X] = E[Σ_i X_i] = Σ_i E[X_i] = Σ_i p_i.

• We pick f(X) = e^{tX} with t > 0. Then

      P(X ≥ (1 + δ)μ) = P(e^{tX} ≥ e^{(1+δ)μt}) ≤ E[e^{tX}] / e^{(1+δ)μt}.   (1)

⚫ We will establish a bound on E[e^{tX}]:

      E[e^{tX}] = E[e^{t Σ X_i}] = E[Π_i e^{tX_i}] = Π_i E[e^{tX_i}]   (by independence)
                = Π_i (p_i e^t + (1 − p_i)·1) = Π_i (1 + p_i(e^t − 1)).
Chernoff Bound (2)
• We now use the inequality 1 + y ≤ e^y, which holds for all y ∈ ℝ.
  Taking y = p_i(e^t − 1),

      E[e^{tX}] = Π_i (1 + p_i(e^t − 1)) ≤ Π_i e^{p_i(e^t − 1)}
                = e^{Σ_i p_i (e^t − 1)} = e^{(e^t − 1)μ}.

  Substituting this into eq.(1), we get that for all t ≥ 0,

      P(X ≥ (1 + δ)μ) ≤ e^{(e^t − 1)μ} / e^{(1+δ)μt}.   (2)

⚫ To make the bound as tight as possible, we find the value of t that minimizes the
  upper bound of eq.(2): t = ln(1 + δ). Substituting this into eq.(2), we obtain, for
  all δ ≥ 0,

      P(X ≥ (1 + δ)μ) ≤ e^{(e^{ln(1+δ)} − 1)μ − (1+δ)ln(1+δ)μ} = e^{[δ − (1+δ)ln(1+δ)]μ}.   (3)
Chernoff Bound (3)
⚫ We will now try to obtain a simpler form of the above bound. In particular, we use
  the Taylor series expansion of ln(1 + δ),

      ln(1 + δ) = Σ_{i≥1} (−1)^{i+1} · δ^i / i.

  Therefore

      (1 + δ)ln(1 + δ) = δ + Σ_{i≥2} (−1)^i δ^i (1/(i−1) − 1/i).

  Assuming that 0 < δ < 1, and ignoring the higher-order terms,

      (1 + δ)ln(1 + δ) > δ + δ²/2 − δ³/6 > δ + δ²/3.

  Plugging this into eq.(3), we obtain

      P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/3}   (0 < δ < 1).

⚫ A very similar calculation shows that

      P(X < (1 − δ)μ) ≤ e^{−δ²μ/2}   (0 < δ < 1).
A More General Chernoff Bound
⚫ We observe that ln(1 + δ) > 2δ/(2 + δ) for δ > 0. This implies that

      δ − (1 + δ)ln(1 + δ) ≤ −δ² / (2 + δ).

  Hence, using eq.(3) we obtain the following bound, which works for all positive δ:

      P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/(2+δ)}   (δ > 0).

  Similarly, it can be shown that

      P(X < (1 − δ)μ) ≤ e^{−δ²μ/(2+δ)}   (δ > 0).

⚫ We can combine both inequalities into one, called the two-sided Chernoff bound:

      P(|X − μ| ≥ δμ) ≤ 2e^{−δ²μ/(2+δ)}   (δ > 0).
Example: Fair Coin Tossing
• Suppose you toss a fair coin 200 times. How likely is it that you see at least
  120 heads?
  First note that μ = n/2 = 100, and from 120 = (1 + δ)μ we see δ = 0.2. Then the
  Chernoff bound says

      P(X ≥ 120) ≤ e^{−0.2²×100/(2+0.2)} = e^{−20/11} ≈ 0.162.

⚫ Let us compare this with the Chebyshev bound. Note that σ² = n/4 = 50, and from
  120 = (1 + δ)μ we see μδ = 20. Then the Chebyshev bound is

      P(X ≥ 120) ≤ σ² / (μδ)² = 50/20² = 0.125.

  This result shows that the Chernoff bound is not always tighter than the
  Chebyshev bound.
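The numbers on this slide can be reproduced in a few lines; the sketch below also computes the exact binomial tail, an extra comparison that is not on the slide.

import math

# Fair coin, n = 200 tosses: exact tail P(X >= 120) vs the Chernoff and
# Chebyshev bounds computed on the slide.
n, p, k = 200, 0.5, 120
mu = n * p                       # 100
delta = k / mu - 1               # 0.2
sigma2 = n * p * (1 - p)         # 50

chernoff = math.exp(-delta**2 * mu / (2 + delta))         # e^{-20/11} ~ 0.162
chebyshev = sigma2 / (mu * delta) ** 2                     # 50/400 = 0.125
exact = sum(math.comb(n, i) for i in range(k, n + 1)) * 0.5**n

print(f"Chernoff  bound: {chernoff:.4f}")
print(f"Chebyshev bound: {chebyshev:.4f}")
print(f"Exact tail     : {exact:.6f}")   # both bounds are far from the true value
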
Convergence of a deterministic sequence

• We have a sequence of real numbers a_n and a number a.
• We say that a_n converges to a, and write lim_{n→∞} a_n = a,
  − intuitively: a_n eventually gets and stays (arbitrarily) close to a;
  − rigorously: for every ε > 0, there exists some n_0 such that for all n ≥ n_0,

      |a_n − a| < ε.
Convergence “in probability”

• We have a sequence of random variables Y_n.
• We say that Y_n converges to a number a in probability,
  − intuitively: "almost all" of the PMF/PDF of Y_n eventually gets concentrated
    (arbitrarily) close to a;
  − rigorously: for every ε > 0, we have

      lim_{n→∞} P(|Y_n − a| < ε) = 1.
Example: Convergence
• One might be tempted to believe that if a sequence Y_n converges to a number a,
  then E[Y_n] must also converge to a. The following example shows this need not
  be the case.
• Consider a sequence of random variables with the following sequence of PMFs:

      P(Y_n = y) = 1 − 1/n   if y = 0,
                   1/n       if y = n².

  (Figure: PMF of Y_n, with mass 1 − 1/n at 0 and mass 1/n at n².)
• For every ε > 0, we have

      lim_{n→∞} P(|Y_n − 0| ≥ ε) = lim_{n→∞} 1/n = 0.

  Thus, Y_n converges to 0 in probability.
• E[Y_n] = n² × (1/n) = n, which goes to ∞ as n increases.
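A tiny tabulation (my own addition) makes the contrast concrete: the tail probability P(|Y_n| ≥ ε) shrinks like 1/n while E[Y_n] = n grows without bound.

# Y_n takes the value 0 with probability 1 - 1/n and n^2 with probability 1/n.
# P(|Y_n - 0| >= eps) = 1/n -> 0, yet E[Y_n] = n -> infinity.
eps = 0.5
for n in [10, 100, 1000, 10000]:
    tail = 1.0 / n            # P(|Y_n| >= eps) for any 0 < eps <= n^2
    mean = n**2 * (1.0 / n)   # E[Y_n] = n
    print(f"n={n:6d}: P(|Y_n| >= eps) = {tail:.4f}   E[Y_n] = {mean:.0f}")
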
Convergence “with probability 1” (1)
• We have a sequence of random variables 𝑌1 , 𝑌2 , 𝑌3 , …
(not necessarily i.i.d.)
• We say that 𝑌𝑛 converges to 𝑎 with probability 1 (wp1)
(or almost surely (a.s.)) if

      P( lim_{n→∞} Y_n = a ) = 1

• Convergence with probability 1 implies convergence in probability, but the
  converse is not necessarily true.

Convergence “with probability 1” (2)
• Consider a sequence Y_1, Y_2, Y_3, … . If for all ε > 0 we have

      Σ_{n=1}^{∞} P(|Y_n − a| > ε) < ∞,

  then Y_n → a almost surely (a.s.). This provides only a sufficient condition for
  almost sure convergence.
• In the case Σ_{n=1}^{∞} P(|Y_n − a| > ε) = ∞, we can instead use the following
  necessary and sufficient condition for almost sure convergence. Define the events

      S_m = {|Y_n − a| < ε, for all n ≥ m}.

  Then Y_n → a a.s. if and only if, for every ε > 0, we have

      lim_{m→∞} P(S_m) = lim_{m→∞} P(|Y_n − a| < ε, for all n ≥ m) = 1.
Convergence “with probability 1” (3)
• Example: Let X_1, X_2, … be i.i.d. Bernoulli(1/2), and define Y_n = 2^n Π_{i=1}^{n} X_i.
  Then for any 0 < ε < 1,

      P{|Y_n − 0| < ε for all n ≥ m}
        = P{X_n = 0 for some n ≤ m}
        = 1 − P{X_n = 1 for all n ≤ m}
        = 1 − (1/2)^m,

  which converges to 1 as m → ∞. Hence, the sequence Y_n converges to 0 almost surely.
• Exercise: Let X_n be independent Bernoulli(1/n) rvs for n = 2, 3, … . The goal is to
  check whether X_n → 0 a.s.
  (a) Check that Σ_{n=2}^{∞} P(|X_n − 0| > ε) = ∞.
  (b) Show that X_n does not converge to 0 almost surely.
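The following simulation sketch (variable names and the 30-step horizon are my own choices) draws a few sample paths of Y_n = 2^n Π X_i; in essentially every path the running product hits zero at some finite time and stays there, which is the almost-sure convergence argued above.

import random

# Sample paths of Y_n = 2^n * X_1 * ... * X_n with X_i i.i.d. Bernoulli(1/2).
# Once some X_i = 0, the path is 0 forever, so Y_n -> 0 with probability 1.
random.seed(1)
for path in range(5):
    prod = 1
    for n in range(1, 31):
        prod *= random.randint(0, 1)     # X_n ~ Bernoulli(1/2)
        if prod == 0:
            print(f"path {path}: Y_n = 0 from n = {n} onward")
            break
    else:
        print(f"path {path}: still positive after 30 steps (probability 2**-30)")
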
Convergence of Sample Mean
• Let X_1, ⋯, X_n be i.i.d. rvs with mean μ and variance σ², and let the sample mean be

      M_n = (X_1 + X_2 + ⋯ + X_n) / n.

• Mean: E[M_n] = μ
• Variance: V(M_n) = σ²/n
• Chebyshev: P(|M_n − E[M_n]| ≥ ε) ≤ V(M_n)/ε², i.e.,

      P(|M_n − μ| ≥ ε) ≤ σ² / (nε²).
WLLN and SLLN
• Let X_1, ⋯, X_n be i.i.d. with finite mean μ and variance σ².
• Weak Law of Large Numbers (WLLN)
  M_n converges to μ in probability: for every ε > 0,

      P(|M_n − μ| ≥ ε) → 0   as n → ∞.

• Strong Law of Large Numbers (SLLN)
  M_n converges to μ with probability 1, in the sense that

      P( lim_{n→∞} M_n = μ ) = 1.
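A small simulation sketch, assuming Exp(1) draws purely as an example distribution, illustrates the WLLN: the fraction of runs in which |M_n − μ| ≥ ε shrinks as n grows.

import random

# WLLN illustration: estimate P(|M_n - mu| >= eps) by simulation for X_i ~ Exp(1),
# where mu = 1. The frequency should decrease toward 0 as n grows.
random.seed(2)
mu, eps, runs = 1.0, 0.1, 2000

for n in [10, 100, 1000]:
    bad = sum(
        abs(sum(random.expovariate(1.0) for _ in range(n)) / n - mu) >= eps
        for _ in range(runs)
    )
    print(f"n={n:5d}: estimated P(|M_n - mu| >= {eps}) = {bad / runs:.3f}")
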
The Pollster’s Problem (1)
• p: proportion of the population that does something
• ith person polled ~ Bernoulli(p): X_i = 1 if "Yes", 0 if "No"
• M_n = (Σ_{i=1}^{n} X_i) / n = sample proportion of "Yes", used as our estimate of p
• How many persons should be polled to satisfy

      P(|M_n − p| ≥ 0.01) ≤ 0.05 ?

• The Chebyshev bound is P(|M_n − E[M_n]| ≥ ε) ≤ V(M_n)/ε².
  We have ε = 0.01, E[M_n] = p, and V(M_n) = p(1 − p)/n ≤ 1/(4n).
  (∵ When X takes values in [a, b], σ_X² ≤ (b − a)²/4, so σ_X² = p(1 − p) ≤ 1/4.)
  Thus,

      P(|M_n − p| ≥ 0.01) ≤ 1/(4n × 0.01²) ≤ 0.05.

• If we choose n large enough to satisfy the above bound, we get the conservative
  requirement n ≥ 50,000.
Central Limit Theorem (1)
• Let X_1, ⋯, X_n be a sequence of i.i.d. rvs with finite mean μ and variance σ².
• Look at three variants of their sum:
  − S_n = X_1 + ⋯ + X_n: its variance nσ² increases to ∞
  − M_n = S_n/n: its variance σ²/n shrinks, and M_n converges "in probability" to μ by the WLLN
  − S_n/√n: its variance stays at the constant level σ²
• We define a "standardized" sum

      Z_n = (M_n − E[M_n]) / σ_{M_n} = (M_n − μ)/(σ/√n) = (nM_n − nμ)/(σ√n) = (S_n − nμ)/(σ√n),

  from which E[Z_n] = 0 and V(Z_n) = 1.
Central Limit Theorem (2)
• Then, the CDF of Z_n converges to the standard normal CDF, in the sense that

      lim_{n→∞} P(Z_n ≤ z) = Φ(z)   for every z,

  where Φ(z) is the standard normal CDF

      Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx.

• This is called the Central Limit Theorem (CLT).
What exactly does the CLT say?
• CDF of 𝑍𝑛 converges to Φ(𝑧)
− Not a statement about convergence of PDFs or PMFs.

• Normal Approximation:
− Treat 𝑍𝑛 as if normal (CLT)
− Also treat 𝑆𝑛 as if normal (NA)

• Can we use it when 𝑛 is “moderate” ?


− Yes, but no nice theorems about the value of 𝑛

Normal Approximation based on CLT
• If n is large, the probability P(S_n ≤ s) can be approximated by treating S_n as if
  it were normal, according to the following procedure:
  1. Calculate the mean nμ and the variance nσ² of S_n.
  2. Calculate the normalized value z = (s − nμ)/(σ√n).
  3. Use the approximation

         P(S_n ≤ s) = P(Z_n ≤ z) ≈ Φ(z),

     where Φ(z) is available from the standard normal CDF table.
Example: CLT (1)
• We load onto a plane 100 packages whose weights are i.i.d. rvs, uniformly
  distributed between 5 and 50 kg. What is P(S_100 > 3000 kg)?
• μ = (5 + 50)/2 = 27.5,  σ² = (50 − 5)²/12 = 168.75,

      z = (3000 − 100 × 27.5) / √(100 × 168.75) = 1.92.

  Use the standard normal table to get the approximation

      P(S_100 ≤ 3000) ≈ Φ(1.92) = 0.9726.

  Thus, the desired probability is

      P(S_100 > 3000) = 1 − P(S_100 ≤ 3000) ≈ 1 − 0.9726 = 0.0274.
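The arithmetic on this slide can be reproduced directly; in the sketch below, phi is a helper built on math.erf rather than a table lookup, and the Monte Carlo check at the end is my own addition.

import math
import random

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Normal approximation for S_100, the total weight of 100 i.i.d. U[5, 50] packages.
n, lo, hi, s = 100, 5.0, 50.0, 3000.0
mu = (lo + hi) / 2                      # 27.5
var = (hi - lo) ** 2 / 12               # 168.75
z = (s - n * mu) / math.sqrt(n * var)   # ~1.92
print(f"CLT approximation : P(S_100 > 3000) ~ {1 - phi(z):.4f}")

random.seed(3)
runs = 20_000
count = sum(sum(random.uniform(lo, hi) for _ in range(n)) > s for _ in range(runs))
print(f"Monte Carlo check : {count / runs:.4f}")
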
Example: CLT (2)
• The production times of machine parts are i.i.d. rvs, uniformly distributed in
  [1, 5] minutes. What is the probability that the number of parts produced within
  320 minutes, N_320, is at least 100?
• Let X_i be the processing time of the ith part and let S_100 be the total processing
  time of the first 100 parts. Note that the event {N_320 ≥ 100} is the same as the
  event {S_100 ≤ 320}. (In general, {N_t ≥ n} = {S_n ≤ t}.)

      μ = (1 + 5)/2 = 3,  σ² = (5 − 1)²/12 = 4/3,  z = (320 − 100 × 3)/√(100 × 4/3) = 1.73.

  Thus, the desired probability is

      P(N_320 ≥ 100) = P(S_100 ≤ 320) ≈ Φ(1.73) = 0.9582.
Continuity Correction (1)
• Let us assume that Y ~ Bin(n = 20, p = 1/2), and suppose that we are interested in
  P(8 ≤ Y ≤ 10). Then Y = X_1 + ⋯ + X_n with X_i ~ Bernoulli(p = 1/2).
• We can apply the CLT to approximate

      P(8 ≤ Y ≤ 10) = P((8 − nμ)/(σ√n) ≤ (Y − nμ)/(σ√n) ≤ (10 − nμ)/(σ√n))
                    ≈ P((8 − 10)/√5 ≤ Z ≤ (10 − 10)/√5)
                    = Φ(0) − Φ(−2/√5) = 0.3145.

• We can also find the exact value

      P(8 ≤ Y ≤ 10) = Σ_{k=8}^{10} C(20, k) (1/2)^k (1 − 1/2)^{20−k} = 0.4565.
Continuity Correction (2)
• We notice that our approximation is not good. Part of the error comes from the
  fact that Y is a discrete rv while we are using a continuous distribution. Here is a
  trick to get a better result, called the continuity correction.
• Since Y can only take integer values, we can write

      P(8 ≤ Y ≤ 10) = P(7.5 ≤ Y ≤ 10.5)
                    = P((7.5 − 10)/√5 ≤ (Y − nμ)/(σ√n) ≤ (10.5 − 10)/√5)
                    ≈ Φ(0.5/√5) − Φ(−2.5/√5) = 0.4567.

• As we can see, the approximation improves significantly. The continuity
  correction is particularly useful when we use the normal approximation to the
  binomial distribution.
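The three numbers above (0.3145, 0.4567, 0.4565) can be reproduced as follows; phi is again a small helper based on math.erf, not something from the slides.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Bin(20, 1/2): exact P(8 <= Y <= 10) vs the CLT approximation,
# with and without the continuity correction.
n, p = 20, 0.5
mu, sd = n * p, math.sqrt(n * p * (1 - p))       # 10, sqrt(5)

plain     = phi((10 - mu) / sd) - phi((8 - mu) / sd)
corrected = phi((10.5 - mu) / sd) - phi((7.5 - mu) / sd)
exact     = sum(math.comb(n, k) for k in range(8, 11)) * 0.5**n

print(f"CLT, no correction : {plain:.4f}")       # ~0.3145
print(f"CLT, corrected     : {corrected:.4f}")   # ~0.4567
print(f"Exact binomial     : {exact:.4f}")       # ~0.4565
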
Continuity Correction (3)
• "Y is at least 8"   = {Y ≥ 8}   (includes 8 and above)
• "Y is more than 8"  = {Y > 8}   (doesn't include 8)
• "Y is at most 8"    = {Y ≤ 8}   (includes 8 and below)
• "Y is fewer than 8" = {Y < 8}   (doesn't include 8)
• "Y is exactly 8"    = {Y = 8}
The Pollster’s Problem (2)
• Suppose we want P(|M_n − p| ≥ 0.01) ≤ 0.05, with E[S_n] = np, σ²_{S_n} = nσ²,
  and σ² ≤ 1/4. (∵ X_i takes values in [0, 1], so σ² = p(1 − p) ≤ (1 − 0)²/4 = 1/4.)
• Event of interest: |M_n − p| ≥ 0.01, i.e.,

      |X_1 + ⋯ + X_n − np| / n ≥ 0.01
      ⟺ |X_1 + ⋯ + X_n − np| / (σ√n) ≥ 0.01√n/σ
      ⟺ |Z_n| ≥ 0.01√n/σ,

  and by the CLT we treat Z_n as standard normal:

  ⇒ P(|M_n − p| ≥ 0.01) ≈ P(|Z| ≥ 0.01√n/σ)
• Obtain an upper bound on this probability by assuming that p has the largest
  possible variance, σ² = 1/4, which corresponds to p = 1/2:

  ⇒ P(|M_n − p| ≥ 0.01) ≈ P(|Z| ≥ 0.02√n)
The Pollster’s Problem (3)
• How large a sample size n is needed if we want P(|M_n − p| ≥ 0.01) ≤ 0.05?

  ⇒ P(|M_n − p| ≥ 0.01) ≈ P(|Z| ≥ 0.02√n)
                        = 2 − 2P(Z ≤ 0.02√n) = 2 − 2Φ(0.02√n) ≤ 0.05,

  or

      Φ(0.02√n) ≥ 0.975.

• From the standard normal table, Φ(1.96) = 0.975, so

      0.02√n ≥ 1.96,  or  n ≥ 1.96²/0.02² = 9604.

• Compare this to the n ≥ 50,000 that we derived using Chebyshev's inequality.
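The required sample size can be recomputed directly; the bisection search below is my own shortcut for the 0.975 quantile, replacing the table lookup of 1.96.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Find z with Phi(z) = 0.975 by bisection (should come out near 1.96).
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if phi(mid) < 0.975 else (lo, mid)
z975 = (lo + hi) / 2

# Sample size for P(|M_n - p| >= 0.01) <= 0.05: CLT value vs Chebyshev value.
n_clt = math.ceil((z975 / 0.02) ** 2)            # 0.02*sqrt(n) >= z_{0.975}
n_cheb = math.ceil(1 / (4 * 0.01**2 * 0.05))     # from the earlier Chebyshev slide
print(f"z_0.975 ~ {z975:.3f},  n (CLT) = {n_clt},  n (Chebyshev) = {n_cheb}")
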
Usefulness of the CLT
• Only means and variances matter
• Much more accurate than Chebyshev’s inequality
• Useful computational shortcut, even if we have a
formula for the distribution of 𝑆𝑛
• Justification of models involving normal rvs
− Noise in electrical components
− Motion of a particle suspended in a fluid (Brownian
motion)

CLT Summary

• X_1, ⋯, X_n are i.i.d. with finite μ and σ².
• S_n = X_1 + ⋯ + X_n, with mean nμ and variance nσ².
• Z_n = (S_n − nμ)/(σ√n) → Z, where Z is standard normal (zero mean, unit variance).
• CLT: for every c, P(Z_n ≤ c) → P(Z ≤ c) = Φ(c).
• Normal approximation: treat S_n as if normal.
Proof of the CLT
• Assume for simplicity that μ = E[X] = 0 and E[X²] = σ² = 1.
• We want to show that Z_n = (X_1 + X_2 + ⋯ + X_n)/√n converges (in distribution) to
  the standard normal; for this it suffices to show that the MGF of Z_n tends to that of
  the standard normal distribution:

      M_{Z_n}(s) = E[e^{sZ_n}] = E[e^{(s/√n)(X_1 + ⋯ + X_n)}] = (E[e^{sX/√n}])^n,

      E[e^{sX/√n}] ≈ 1 + (s/√n)·E[X] + (s²/2n)·E[X²] ≈ 1 + s²/(2n).

  Thus,

      M_{Z_n}(s) ≈ (1 + s²/(2n))^n → e^{s²/2},

  which is the MGF of the standard normal distribution.

  Note) The MGF of N(μ, σ²) is exp(μs + σ²s²/2).
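As an empirical companion to this proof sketch (entirely an addition, not part of the lecture), the code below compares the simulated CDF of Z_n at a few points with Φ, using ±1-valued variables so that μ = 0 and σ = 1 as assumed above.

import math
import random

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Empirical check of the CLT: simulate Z_n = (X_1 + ... + X_n)/sqrt(n) for
# X_i uniform on {-1, +1} (mean 0, variance 1) and compare its CDF with Phi.
random.seed(4)
n, runs = 100, 20_000
zs = [sum(random.choice((-1, 1)) for _ in range(n)) / math.sqrt(n) for _ in range(runs)]

for c in [-1.0, 0.0, 1.0, 1.96]:
    empirical = sum(z <= c for z in zs) / runs
    print(f"c={c:5.2f}:  P(Z_n <= c) ~ {empirical:.3f}   Phi(c) = {phi(c):.3f}")
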


Homework #6
Textbook “Introduction to Probability”, 2nd Edition, D. Bertsekas and J. Tsitsiklis
Chapter 5, pp. 284-294, Problems 1, 4, 5, 8, 9, 10, 11
Due date: check the assignment posting on 아주BB.
