Summary MAS291

I. The document outlines basic probability formulas for unions, intersections, conditional probability, and independent events. II. It defines concepts related to discrete random variables including expected value, variance, probability mass function, cumulative distribution function, and several special distributions like binomial, Poisson, and geometric. III. Continuous random variables are discussed in terms of probability density function, cumulative distribution function, expected value, variance, and some common distributions like normal, uniform, and exponential. IV. Descriptive statistics like sample mean, median, mode, range, and quartiles are defined for samples from a population. V. The sampling distributions of sample means and proportions are described. VI. Statistical intervals and hypothesis

Uploaded by Hiếu Phạm

I. Basic probability formulas

● $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
● $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
● $P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$
● If A and B are independent: $P(A \cap B) = P(A) \cdot P(B)$
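
A quick numerical check of the conditional-probability and Bayes formulas above can be sketched in Python. The probabilities and variable names are made-up assumptions for illustration only, and the law of total probability (not listed on this sheet) is used to obtain P(B).

```python
# Hypothetical numbers chosen only for illustration.
p_a = 0.01               # P(A)
p_b_given_a = 0.95       # P(B | A)
p_b_given_not_a = 0.10   # P(B | not A)

# Law of total probability (assumption: not listed on the sheet):
# P(B) = P(B | A) P(A) + P(B | not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes form: P(A | B) = P(B | A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

# Intersection from the conditional probability, then the union formula
p_a_and_b = p_a_given_b * p_b
p_a_or_b = p_a + p_b - p_a_and_b

print(f"P(B) = {p_b:.4f}")
print(f"P(A | B) = {p_a_given_b:.4f}")
print(f"P(A or B) = {p_a_or_b:.4f}")
```

Both conditional-probability formulas give the same intersection here: `p_a_and_b` equals `p_b_given_a * p_a`.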

II. Discrete random variables

● $\mu = E(X) = \sum x_i \cdot P(X = x_i)$

● $\sigma^2 = V(X) = \sum (x_i - \mu)^2 \cdot P(X = x_i) = \sum x_i^2 \cdot P(X = x_i) - \mu^2$
● $E(aX + bY) = a \cdot E(X) + b \cdot E(Y)$
● $V(aX + bY) = a^2 \cdot V(X) + b^2 \cdot V(Y)$ (when X and Y are independent)
● Probability mass function: $f(x_i) = P(X = x_i)$
● Cumulative distribution function: $F(x_i) = P(X \le x_i)$
● Some special distributions:
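1. Discrete uniform distribution
○ $P(X = x_i) = \dfrac{1}{n}$
○ $\mu = \dfrac{a + b}{2}$
○ $\sigma^2 = \dfrac{(b - a + 1)^2 - 1}{12}$
2. Binomial distribution
○ $P(X = k) = \binom{n}{k} \cdot p^k \cdot (1 - p)^{n - k}$
○ $\mu = n \cdot p$
○ $\sigma^2 = n \cdot p \cdot (1 - p)$
3. Poisson distribution
○ $P(X = k) = \dfrac{e^{-\lambda T} \cdot (\lambda T)^k}{k!}$
○ $\mu = \lambda T$
○ $\sigma^2 = \lambda T$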
4. Hypergeometric distribution
𝐾𝐶𝑘 . (𝑁−𝐾)𝐶(𝑛−𝑘)
○ P(x=k) =
𝑁𝐶𝑛

○ ℳ = n.p
𝑁−𝑛
○ σ2 = n.p.(1-p). 𝑁 − 1

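5. Geometric distribution
○ $P(X = k) = (1 - p)^{k - 1} \cdot p$
○ $\mu = \dfrac{1}{p}$
○ $\sigma^2 = \dfrac{1 - p}{p^2}$
6. Negative binomial distribution
○ $P(X = k) = \binom{k - 1}{r - 1} \cdot p^r \cdot (1 - p)^{k - r}$
○ $\mu = \dfrac{r}{p}$
○ $\sigma^2 = \dfrac{r \cdot (1 - p)}{p^2}$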
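
The probability mass functions above can be checked against their stated means and variances with a short Python sketch. The function names and parameter values are illustrative assumptions; only the standard-library `math` module is used.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam_t):
    """P(X = k) for a Poisson distribution with mean lam_t (lambda times T)."""
    return exp(-lam_t) * lam_t**k / factorial(k)

def geometric_pmf(k, p):
    """P(X = k): the first success happens on trial k."""
    return (1 - p) ** (k - 1) * p

def hypergeometric_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement, K successes among N items."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def mean_and_variance(values, pmf):
    """Apply mu = sum x f(x) and sigma^2 = sum x^2 f(x) - mu^2 over a finite support."""
    mean = sum(x * pmf(x) for x in values)
    variance = sum(x**2 * pmf(x) for x in values) - mean**2
    return mean, variance

# Hypothetical parameters for illustration.
n, p = 10, 0.3
binom_mean, binom_var = mean_and_variance(range(n + 1), lambda k: binomial_pmf(k, n, p))
print(binom_mean, n * p)            # both close to 3.0
print(binom_var, n * p * (1 - p))   # both close to 2.1

N, K, draws = 20, 8, 5
hyper_mean, hyper_var = mean_and_variance(
    range(draws + 1), lambda k: hypergeometric_pmf(k, N, K, draws)
)
```

For the hypergeometric case the computed values should match n·p and n·p·(1-p)·(N-n)/(N-1) with p = K/N.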

III. Continuous random variable


● Probability density function $f(x)$: $P(a < X < b) = \int_a^b f(x)\,dx$

● Cumulative distribution function $F(x)$:
○ $F(x) = P(X \le x)$
○ $F'(x) = f(x)$
● $\mu = E(X) = \int_{-\infty}^{+\infty} x \cdot f(x)\,dx$

● $E(X^n) = \int_{-\infty}^{+\infty} x^n \cdot f(x)\,dx$

● $\sigma^2 = V(X) = \int_{-\infty}^{+\infty} x^2 \cdot f(x)\,dx - \mu^2$

● Some special distributions:


1. Continuous uniform distribution
○ $f(x) = \dfrac{1}{b - a}$ for $a \le x \le b$, and $f(x) = 0$ elsewhere
○ $\mu = \dfrac{a + b}{2}$
○ $\sigma^2 = \dfrac{(b - a)^2}{12}$

2. Normal distribution $N(\mu, \sigma^2)$

○ $z = \dfrac{x - \mu}{\sigma}$
○ $f(z) = \dfrac{1}{\sqrt{2\pi}} \cdot e^{-z^2/2}$

○ $\Phi(x) = P(Z < x)$
○ $\Phi(-x) = 1 - \Phi(x)$
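3. Normal approximation to the binomial and Poisson distributions
a. Binomial ($np > 5$ and $n(1 - p) > 5$)
■ $z = \dfrac{x - n \cdot p}{\sqrt{n \cdot p \cdot (1 - p)}}$
■ $P(X_{\text{binomial}} \le a) \approx P(X_{\text{normal}} \le a + 0.5)$
■ $P(X_{\text{binomial}} \ge a) \approx P(X_{\text{normal}} \ge a - 0.5)$
b. Poisson
■ $z = \dfrac{x - \lambda}{\sqrt{\lambda}}$
■ $P(X_{\text{Poisson}} \le a) \approx P(X_{\text{normal}} \le a + 0.5)$
■ $P(X_{\text{Poisson}} \ge a) \approx P(X_{\text{normal}} \ge a - 0.5)$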

4. Exponential distribution
○ $f(x) = \lambda \cdot e^{-\lambda x}$ for $x > 0$, and $f(x) = 0$ elsewhere
○ $P(X \ge a) = e^{-\lambda a}$, $(a > 0)$
○ $\mu = \dfrac{1}{\lambda}$
○ $\sigma^2 = \dfrac{1}{\lambda^2}$
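
The continuity correction and the exponential tail formula can be tried numerically with the sketch below. It assumes hypothetical values for n, p, a, and λ, and builds the standard normal CDF from `math.erf` rather than a statistical table.

```python
from math import comb, erf, exp, sqrt

def phi(z):
    """Standard normal CDF, Phi(z) = P(Z <= z), written with the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical binomial setting: np = 20 > 5 and n(1 - p) = 30 > 5.
n, p, a = 50, 0.4, 18

# Exact binomial probability P(X <= a).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a + 1))

# Normal approximation with continuity correction: P(X_normal <= a + 0.5).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = phi((a + 0.5 - mu) / sigma)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")

# Exponential tail: P(X >= t) = exp(-lambda * t), with assumed lambda and t.
lam, t = 0.5, 3.0
exp_tail = exp(-lam * t)
print(f"P(X >= {t}) = {exp_tail:.4f}")
```

The two binomial values should agree to about two decimal places; the agreement generally improves as np and n(1-p) grow.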

IV. Descriptive statistics (take a sample of size n from a population of size N)


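● Sample mean: $\bar{x} = \dfrac{\sum x_i}{n}$

● Sample median: $L = \dfrac{n + 1}{2}$, so Median $= \dfrac{x_{\lceil L \rceil} + x_{\lfloor L \rfloor}}{2}$ (data sorted in ascending order)
● Mode: the value that occurs most often
● Range: max - min
● Sample variance: $s^2 = \dfrac{\sum (\bar{x} - x_i)^2}{n - 1}$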
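● Quartiles:
○ $L_1 = \dfrac{n + 1}{4}$, so $Q_1 = \dfrac{x_{\lceil L_1 \rceil} + x_{\lfloor L_1 \rfloor}}{2}$

○ $L_2 = \dfrac{n + 1}{2}$, so $Q_2 = \dfrac{x_{\lceil L_2 \rceil} + x_{\lfloor L_2 \rfloor}}{2}$

○ $L_3 = \dfrac{3 \cdot (n + 1)}{4}$, so $Q_3 = \dfrac{x_{\lceil L_3 \rceil} + x_{\lfloor L_3 \rfloor}}{2}$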
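
The quartile rule above (average the order statistics at the ceiling and floor of the position L) can be sketched as follows. The helper names and the sample data are hypothetical, and the sketch assumes the data are sorted in ascending order with 1-based positions and that n is at least 3 so every position is valid.

```python
from math import ceil, floor

def value_at(sorted_data, position):
    """Average of the order statistics at ceil(position) and floor(position), 1-based."""
    upper = sorted_data[ceil(position) - 1]
    lower = sorted_data[floor(position) - 1]
    return (upper + lower) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using L1 = (n+1)/4, L2 = (n+1)/2, L3 = 3(n+1)/4."""
    xs = sorted(data)
    n = len(xs)
    if n < 3:
        raise ValueError("this sketch assumes at least 3 observations")
    q1 = value_at(xs, (n + 1) / 4)
    q2 = value_at(xs, (n + 1) / 2)
    q3 = value_at(xs, 3 * (n + 1) / 4)
    return q1, q2, q3

def sample_variance(data):
    """s^2 = sum((x_i - mean)^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

sample = [7, 1, 3, 9, 5, 11, 13, 15]   # hypothetical data, n = 8
print(quartiles(sample))               # (4.0, 8.0, 12.0)
print(sample_variance(sample))         # 24.0
```

Other textbooks and libraries interpolate quartiles differently, so results from this rule may not match, for example, a spreadsheet's quartile function.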

V. Sampling distribution

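● Population mean $\mu$, variance $\sigma^2$, sample size $n$ (normal population or $n > 30$):
○ The distribution of $\bar{X}$ is $N\left(\mu, \dfrac{\sigma^2}{n}\right)$
○ The distribution of $\bar{X}_1 - \bar{X}_2$ is $N\left(\mu_1 - \mu_2, \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$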

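● For a population proportion $p$ and sample size $n$ ($np \ge 5$ and $n(1 - p) \ge 5$):

○ The distribution of $\hat{P}$ is $N\left(p, \dfrac{p \cdot (1 - p)}{n}\right)$

○ The distribution of $\hat{P}_1 - \hat{P}_2$ is $N\left(p_1 - p_2, \dfrac{p_1 \cdot (1 - p_1)}{n_1} + \dfrac{p_2 \cdot (1 - p_2)}{n_2}\right)$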
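
As a small illustration of these sampling distributions, the sketch below uses Python's `statistics.NormalDist`, which is parameterized by the standard deviation rather than the variance. The population values, sample sizes, and cutoffs are assumptions made up for the example.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15, sample size 36.
mu, sigma, n = 100.0, 15.0, 36

# Sample mean ~ N(mu, sigma^2 / n); NormalDist expects the standard deviation.
xbar_dist = NormalDist(mu, sigma / sqrt(n))
p_mean_above_104 = 1 - xbar_dist.cdf(104.0)
print(f"P(sample mean > 104) = {p_mean_above_104:.4f}")

# Hypothetical proportion: p = 0.3 with m = 200 observations.
p, m = 0.3, 200
phat_dist = NormalDist(p, sqrt(p * (1 - p) / m))
p_phat_below_025 = phat_dist.cdf(0.25)
print(f"P(sample proportion < 0.25) = {p_phat_below_025:.4f}")
```

Here the standard deviation of the sample mean is 15/6 = 2.5, so 104 sits 1.6 standard deviations above the mean.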

VI. Statistical intervals - Test claims for one sample

● $(l, u) = (\bar{X} - E, \bar{X} + E)$
● width $= 2E$
● P-value $= 2 \cdot P(Z > |z_0|)$
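1. Population variance known
○ $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
○ $z_0 = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
2. Population variance unknown
○ $n > 30$:
■ $E = z_{\alpha/2} \cdot \dfrac{S}{\sqrt{n}}$
■ $z_0 = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$
○ $n \le 30$:
■ $E = t_{\alpha/2,\,n-1} \cdot \dfrac{S}{\sqrt{n}}$
■ $t_0 = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$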
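● For proportion:

○ $(l, u) = (\hat{P} - E, \hat{P} + E)$

○ $E = z_{\alpha/2} \cdot \sqrt{\dfrac{\hat{P} \cdot (1 - \hat{P})}{n}}$

○ $z_0 = \dfrac{\hat{P} - p}{\sqrt{\dfrac{p \cdot (1 - p)}{n}}}$, where $p$ is the hypothesized proportion

○ If the problem does not give the proportion, use 0.5 by default


● For a one-sided case, proceed the same way but replace $\alpha/2$ with $\alpha$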
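
The known-variance case can be worked through in Python as a rough sketch. The function names and the sample figures are hypothetical; `statistics.NormalDist().inv_cdf` stands in for looking up $z_{\alpha/2}$ in a table.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided interval (xbar - E, xbar + E) with E = z_{alpha/2} * sigma / sqrt(n)."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    e = z_crit * sigma / sqrt(n)
    return xbar - e, xbar + e

def z_test_two_sided(xbar, mu0, sigma, n):
    """Return z0 = (xbar - mu0) / (sigma / sqrt(n)) and P-value = 2 * P(Z > |z0|)."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - Z.cdf(abs(z0)))
    return z0, p_value

# Hypothetical sample: mean 52, known sigma 8, n = 64, hypothesized mean 50.
low, high = z_interval(xbar=52.0, sigma=8.0, n=64)
z0, p_value = z_test_two_sided(xbar=52.0, mu0=50.0, sigma=8.0, n=64)
print(f"95% interval: ({low:.2f}, {high:.2f})")
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```

With these numbers the interval is roughly (50.04, 53.96) and z0 is 2.0, so the hypothesized mean of 50 falls just outside the 95% interval, matching a P-value slightly below 0.05.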

VII. Test claims for 2 samples (2 independent populations, normally distributed or both $n_1, n_2 > 30$)

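● $(l, u) = (\bar{X}_1 - \bar{X}_2 - E, \bar{X}_1 - \bar{X}_2 + E)$

1. Population variance known

○ $E = z_{\alpha/2} \cdot \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
○ $z_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$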

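2. Population variance unknown

○ Assume $\sigma_1^2 = \sigma_2^2$

■ Degrees of freedom: $df = n_1 + n_2 - 2$
■ $S_p^2 = \dfrac{(n_1 - 1) \cdot S_1^2 + (n_2 - 1) \cdot S_2^2}{n_1 + n_2 - 2}$

■ $E = t_{\alpha/2,\,df} \cdot \sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}$

■ $t_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$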

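○ Do not assume $\sigma_1^2 = \sigma_2^2$

■ Degrees of freedom: $df = \dfrac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{S_1^4}{n_1^2 \cdot (n_1 - 1)} + \dfrac{S_2^4}{n_2^2 \cdot (n_2 - 1)}}$

■ $E = t_{\alpha/2,\,df} \cdot \sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$

■ $t_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$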

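● For proportion:

○ $(l, u) = (\hat{P}_1 - \hat{P}_2 - E, \hat{P}_1 - \hat{P}_2 + E)$

○ $E = z_{\alpha/2} \cdot \sqrt{\dfrac{\hat{P}_1 \cdot (1 - \hat{P}_1)}{n_1} + \dfrac{\hat{P}_2 \cdot (1 - \hat{P}_2)}{n_2}}$
○ $\hat{P} = \dfrac{x_1 + x_2}{n_1 + n_2}$ (where $x_i = n_i \cdot \hat{P}_i$)

○ $z_0 = \dfrac{\hat{P}_1 - \hat{P}_2 - \Delta_0}{\sqrt{\hat{P} \cdot (1 - \hat{P}) \cdot \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$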
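
The two unknown-variance cases can be compared with the sketch below, which computes only the test statistic and the degrees of freedom. The function names and summary statistics are hypothetical, and the critical value $t_{\alpha/2,\,df}$ would still come from a t table, since the Python standard library has no t quantile function.

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2, delta0=0.0):
    """Equal-variance case: pooled variance Sp^2 with df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t0 = (x1 - x2 - delta0) / sqrt(sp2 / n1 + sp2 / n2)
    return t0, df

def unequal_variance_t(x1, s1, n1, x2, s2, n2, delta0=0.0):
    """Unequal-variance case: df from the formula above (generally not an integer)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    # S1^4 / (n1^2 (n1 - 1)) is the same as (S1^2 / n1)^2 / (n1 - 1)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    t0 = (x1 - x2 - delta0) / sqrt(v1 + v2)
    return t0, df

# Hypothetical summary statistics: (mean, sample std dev, size) for each sample.
t_pooled, df_pooled = pooled_t(20.0, 4.0, 10, 17.0, 4.0, 10)
t_unequal, df_unequal = unequal_variance_t(20.0, 4.0, 10, 17.0, 4.0, 10)
print(f"pooled:  t0 = {t_pooled:.4f}, df = {df_pooled}")
print(f"unequal: t0 = {t_unequal:.4f}, df = {df_unequal:.2f}")
```

With equal sample sizes and equal sample standard deviations, as in this example, both approaches give the same t0 and the same 18 degrees of freedom; they diverge when the sample variances or sizes differ.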
VIII. Linear Regression

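● $S_{XY} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n \cdot \bar{x} \cdot \bar{y}$

● $S_{XX} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n \cdot \bar{x}^2$

● $S_{YY} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - n \cdot \bar{y}^2$

● Slope: $\hat{\beta}_1 = \dfrac{S_{XY}}{S_{XX}} = \dfrac{\sum x_i y_i - n \cdot \bar{x} \cdot \bar{y}}{\sum x_i^2 - n \cdot \bar{x}^2}$

● Intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \cdot \bar{x}$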
● Error sum of squares: $SSE = \sum (y_i - \hat{y}_i)^2$
● Regression sum of squares: $SSR = \sum (\hat{y}_i - \bar{y})^2$
● Total sum of squares: $SST = \sum (y_i - \bar{y})^2$
● $SSE + SSR = SST$

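● Standard error: $\hat{\sigma} = \sqrt{\dfrac{SSE}{n - 2}}$

● Coefficient of correlation: $R = \pm\sqrt{\dfrac{SSR}{SST}} = \dfrac{S_{XY}}{\sqrt{S_{XX} \cdot S_{YY}}}$ ($R$ takes the sign of $S_{XY}$)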

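● Test claims about the slope (df = n - 2):

○ $se(\hat{\beta}_1) = \sqrt{\dfrac{\hat{\sigma}^2}{S_{XX}}}$

○ $t_0 = \dfrac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$
● Test claims about the intercept (df = n - 2):

○ $se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 \cdot \left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{XX}}\right)}$

○ $t_0 = \dfrac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)}$

● Test claims about the coefficient of correlation (df = n - 2): $t_0 = \dfrac{R - 0}{\sqrt{\dfrac{1 - R^2}{n - 2}}}$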
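
The regression formulas can be tied together in one short Python sketch. The function name, dictionary keys, and data points are hypothetical; the slope test shown uses the null value $\beta_{1,0} = 0$ as an assumption.

```python
from math import sqrt

def simple_linear_regression(xs, ys):
    """Compute the quantities of this section from paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar**2
    syy = sum(y * y for y in ys) - n * y_bar**2

    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    fitted = [b0 + b1 * x for x in xs]

    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ssr = sum((f - y_bar) ** 2 for f in fitted)
    sst = syy                      # total sum of squares equals S_YY

    sigma_hat = sqrt(sse / (n - 2))
    r = sxy / sqrt(sxx * syy)
    se_b1 = sqrt(sigma_hat**2 / sxx)
    se_b0 = sqrt(sigma_hat**2 * (1 / n + x_bar**2 / sxx))
    return {
        "b0": b0, "b1": b1, "sse": sse, "ssr": ssr, "sst": sst,
        "sigma_hat": sigma_hat, "r": r, "se_b0": se_b0, "se_b1": se_b1,
        "t0_slope": b1 / se_b1,                       # tests beta_1 = 0 (assumed null value)
        "t0_corr": r / sqrt((1 - r**2) / (n - 2)),    # tests zero correlation
    }

# Hypothetical data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(xs, ys)
print(f"slope = {fit['b1']:.3f}, intercept = {fit['b0']:.3f}, R = {fit['r']:.4f}")
print(f"SSE + SSR = {fit['sse'] + fit['ssr']:.4f}, SST = {fit['sst']:.4f}")
```

In simple linear regression the t statistic for a zero slope and the t statistic for a zero correlation coincide, which the two `t0` entries illustrate.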
