Summary MAS291
● $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$
● If A and B are independent: $P(A \cap B) = P(A)\,P(B)$
● $\mu = E(x) = \sum_i x_i\,P(x = x_i)$
● $\sigma^2 = V(x) = \sum_i x_i^2\,P(x = x_i) - \mu^2$
● $E(ax + by) = a\,E(x) + b\,E(y)$
● $V(ax + by) = a^2\,V(x) + b^2\,V(y)$ (for independent x and y)
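The mean and variance formulas above can be checked numerically. The following is a minimal sketch, assuming the distribution is given as a dictionary mapping each value to its probability; the function name `mean_and_variance` and the fair-die example are illustrative and not part of the notes.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (mu, sigma^2) for a discrete pmf given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    second_moment = sum(x**2 * p for x, p in pmf.items())
    # sigma^2 = sum(x_i^2 * P(x = x_i)) - mu^2
    return mu, second_moment - mu**2


# Hypothetical example: a fair six-sided die (exact arithmetic with fractions).
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu, var = mean_and_variance(die)
print(mu, var)
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare the result with a hand calculation.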
● Probability mass function: $f(x_i) = P(x = x_i)$
● Cumulative distribution function: $F(x_i) = P(x \le x_i)$
● Some special distributions:
1. Discrete uniform distribution
○ $P(x = x_i) = \dfrac{1}{n}$
○ $\mu = \dfrac{a + b}{2}$
○ $\sigma^2 = \dfrac{(b - a + 1)^2 - 1}{12}$
2. Binomial distribution
○ $P(x = k) = \binom{n}{k}\,p^k\,(1 - p)^{n - k}$
○ $\mu = np$
○ $\sigma^2 = np(1 - p)$
3. Poisson distribution
○ $P(x = k) = \dfrac{e^{-\lambda T}\,(\lambda T)^k}{k!}$
○ $\mu = \lambda T$
○ $\sigma^2 = \lambda T$
4. Hypergeometric distribution
○ $P(x = k) = \dfrac{\binom{K}{k}\binom{N - K}{n - k}}{\binom{N}{n}}$
○ $\mu = np$, where $p = K/N$
○ $\sigma^2 = np(1 - p)\,\dfrac{N - n}{N - 1}$
5. Geometric distribution
○ $P(x = k) = (1 - p)^{k - 1}\,p$
○ $\mu = \dfrac{1}{p}$
○ $\sigma^2 = \dfrac{1 - p}{p^2}$
6. Negative binomial distribution
○ $P(x = k) = \binom{k - 1}{r - 1}\,p^r\,(1 - p)^{k - r}$
○ $\mu = \dfrac{r}{p}$
○ $\sigma^2 = \dfrac{r(1 - p)}{p^2}$
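The probability mass functions of the discrete distributions listed above translate directly into code. The sketch below uses only the Python standard library; the function names and the parameter values in the checks are illustrative assumptions, and the sums for the Poisson and geometric means are truncated at a point where the remaining terms are negligible.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam_t):
    # lam_t stands for the product lambda * T
    return exp(-lam_t) * lam_t**k / factorial(k)


def hypergeometric_pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def geometric_pmf(k, p):
    return (1 - p) ** (k - 1) * p


def negative_binomial_pmf(k, r, p):
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)


# Numerical checks of the mean formulas (hypothetical parameter values).
binom_mean = sum(k * binomial_pmf(k, 10, 0.3) for k in range(11))              # n*p
poisson_mean = sum(k * poisson_pmf(k, 4.0) for k in range(60))                 # lambda*T
hyper_mean = sum(k * hypergeometric_pmf(k, 20, 8, 5) for k in range(6))        # n*K/N
geom_mean = sum(k * geometric_pmf(k, 0.25) for k in range(1, 2000))            # 1/p
print(binom_mean, poisson_mean, hyper_mean, geom_mean)
```

With r = 1 the negative binomial pmf reduces to the geometric pmf, which is a quick way to sanity-check both functions.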
● $E(x^n) = \int_{-\infty}^{+\infty} x^n f(x)\,dx$
● $\sigma^2 = V(x) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx - \mu^2$
○ $z = \dfrac{x - \mu}{\sigma}$
○ $f(z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$
○ $\Phi(x) = P(z < x)$
○ $\Phi(-x) = 1 - \Phi(x)$
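3. Normal approximation to the binomial and Poisson distributions
a. Binomial (np > 5 and n(1−p) > 5)
■ $z = \dfrac{x - np}{\sqrt{np(1 - p)}}$
■ $P(X_{\text{binomial}} \le a) \approx P(X_{\text{normal}} \le a + 0.5)$
■ $P(X_{\text{binomial}} \ge a) \approx P(X_{\text{normal}} \ge a - 0.5)$
b. Poisson
■ $z = \dfrac{x - \lambda}{\sqrt{\lambda}}$
■ $P(X_{\text{Poisson}} \le a) \approx P(X_{\text{normal}} \le a + 0.5)$
■ $P(X_{\text{Poisson}} \ge a) \approx P(X_{\text{normal}} \ge a - 0.5)$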
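To see how close the continuity-corrected normal approximation gets to the exact binomial probability, the two can be compared directly. This is a rough sketch, assuming hypothetical values n = 50, p = 0.4 and a = 18 (so np = 20 and n(1−p) = 30 both exceed 5); the helper names are not from the notes.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(a, n, p):
    """Exact P(X <= a) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a + 1))


def normal_approx_cdf(a, n, p):
    """Approximate P(X <= a) with the +0.5 continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(a + 0.5)


n, p, a = 50, 0.4, 18  # hypothetical example values
exact = binomial_cdf(a, n, p)
approx = normal_approx_cdf(a, n, p)
print(f"exact={exact:.4f} approx={approx:.4f}")
```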
4. Exponential distribution
○ $f(x) = \lambda e^{-\lambda x}$ for $x > 0$, and $f(x) = 0$ elsewhere
○ $P(x \ge a) = e^{-\lambda a}$ for $a > 0$
○ $\mu = \dfrac{1}{\lambda}$
○ $\sigma^2 = \dfrac{1}{\lambda^2}$
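The tail formula for the exponential distribution can be verified by integrating the density numerically. The sketch below uses a simple midpoint rule; the rate λ = 0.5 and threshold a = 3 are assumed values chosen only for illustration.

```python
from math import exp


def exponential_pdf(x, lam):
    return lam * exp(-lam * x) if x > 0 else 0.0


def exponential_tail(a, lam):
    """P(x >= a) = e^(-lambda * a) for a > 0."""
    return exp(-lam * a)


def integrate_midpoint(f, lower, upper, steps=10_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))


lam, a = 0.5, 3.0  # hypothetical rate and threshold
area_below_a = integrate_midpoint(lambda x: exponential_pdf(x, lam), 0.0, a)
tail = exponential_tail(a, lam)
print(1 - area_below_a, tail)
```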
○ $L_2 = \dfrac{n + 1}{2}$, so $Q_2 = \dfrac{x_{\lceil L_2 \rceil} + x_{\lfloor L_2 \rfloor}}{2}$
○ $L_3 = \dfrac{3(n + 1)}{4}$, so $Q_3 = \dfrac{x_{\lceil L_3 \rceil} + x_{\lfloor L_3 \rfloor}}{2}$
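The location rule for Q2 and Q3 can be sketched as follows, assuming the observations are sorted in increasing order and indexed from 1. The function name `quartile` and the sample data are hypothetical; only the two locations given above are implemented, and samples with fewer than three values are rejected because L3 would point past the last observation.

```python
from math import ceil, floor


def quartile(data, k):
    """Return Q_k for k in (2, 3) using the location L_k = k * (n + 1) / 4."""
    if k not in (2, 3):
        raise ValueError("only Q2 and Q3 are covered by this sketch")
    xs = sorted(data)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 observations")
    loc = k * (n + 1) / 4
    # Positions are 1-based in the formula, so subtract 1 for Python indexing.
    upper = xs[ceil(loc) - 1]
    lower = xs[floor(loc) - 1]
    return (upper + lower) / 2


odd_sample = [7, 1, 3, 9, 5, 11, 13]        # n = 7
even_sample = [2, 4, 6, 8, 10, 12, 14, 16]  # n = 8
print(quartile(odd_sample, 2), quartile(odd_sample, 3))
print(quartile(even_sample, 2), quartile(even_sample, 3))
```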
V. Sampling distribution
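● Population mean $\mu$, variance $\sigma^2$, sample size $n$ (normal population or n > 30):
○ The distribution of $\bar{X}$ has the form $N\!\left(\mu,\ \dfrac{\sigma^2}{n}\right)$
○ The distribution of $\bar{X}_1 - \bar{X}_2$ has the form $N\!\left(\mu_1 - \mu_2,\ \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$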
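As an illustration of the sampling distributions above, probabilities about a sample mean can be computed with `statistics.NormalDist`. This is a small sketch under assumed population values (μ = 100, σ = 15, n = 36, and a second hypothetical population with μ = 95, σ = 20, n = 64); the helper names are not from the notes.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_distribution(mu, sigma, n):
    """Distribution of the sample mean: N(mu, sigma^2 / n)."""
    return NormalDist(mu, sigma / sqrt(n))


def mean_difference_distribution(mu1, sigma1, n1, mu2, sigma2, n2):
    """Distribution of the difference of two independent sample means."""
    return NormalDist(mu1 - mu2, sqrt(sigma1**2 / n1 + sigma2**2 / n2))


xbar_dist = sample_mean_distribution(100, 15, 36)
p_above_105 = 1 - xbar_dist.cdf(105)  # P(sample mean > 105)

diff_dist = mean_difference_distribution(100, 15, 36, 95, 20, 64)
print(p_above_105, diff_dist.mean, diff_dist.variance)
```

Note that `NormalDist` takes a standard deviation, not a variance, so the square root of σ²/n is passed in.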
● $(l, u) = (\bar{X} - E,\ \bar{X} + E)$
● width = 2E
● P-value = $2\,P(Z > |z_0|)$ (two-sided test)
1. Population variance known
○ $E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
○ $z_0 = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
2. Population variance unknown
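○ n > 30:
■ $E = z_{\alpha/2}\,\dfrac{S}{\sqrt{n}}$
■ $z_0 = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$
○ n ≤ 30:
■ $E = t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}}$
■ $t_0 = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$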
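● For a proportion:
○ $(l, u) = (\hat{P} - E,\ \hat{P} + E)$
○ $E = z_{\alpha/2}\,\sqrt{\dfrac{\hat{P}(1 - \hat{P})}{n}}$
○ $z_0 = \dfrac{\hat{P} - P}{\sqrt{P(1 - P)/n}}$, where $P$ is the hypothesized proportion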
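The z-based interval and test statistics above can be sketched with the standard library alone. The function names and the numbers in the example are assumptions made for illustration; the t-based case is left out because the standard library does not provide t quantiles.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided confidence interval (l, u) for the mean, variance known."""
    margin = STD_NORMAL.inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def z_test_mean(xbar, mu0, sigma, n):
    """Return (z0, two-sided P-value) for a claim about the mean."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    return z0, 2 * (1 - STD_NORMAL.cdf(abs(z0)))


def proportion_interval(p_hat, n, alpha=0.05):
    """Two-sided confidence interval (l, u) for a proportion."""
    margin = STD_NORMAL.inv_cdf(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def z_test_proportion(p_hat, p0, n):
    """Return (z0, two-sided P-value) for a claim about a proportion."""
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z0, 2 * (1 - STD_NORMAL.cdf(abs(z0)))


# Hypothetical data: sample mean 52, sigma 8, n = 64, claim mu = 50.
low, high = z_interval(52, 8, 64)
z_mean, p_mean = z_test_mean(52, 50, 8, 64)
# Hypothetical data: sample proportion 0.6 from n = 100, claim P = 0.5.
p_low, p_high = proportion_interval(0.6, 100)
z_prop, p_prop = z_test_proportion(0.6, 0.5, 100)
print((low, high), (z_mean, p_mean), (p_low, p_high), (z_prop, p_prop))
```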
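VII. Test claims for 2 samples (2 independent populations, normally distributed or both n1, n2 > 30)
● $(l, u) = (\bar{X}_1 - \bar{X}_2 - E,\ \bar{X}_1 - \bar{X}_2 + E)$
○ $E = z_{\alpha/2}\,\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
○ $z_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$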
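○ Assume $\sigma_1^2 = \sigma_2^2$:
■ Degrees of freedom: $df = n_1 + n_2 - 2$
■ $S_p^2 = \dfrac{(n_1 - 1)\,S_1^2 + (n_2 - 1)\,S_2^2}{n_1 + n_2 - 2}$
■ $E = t_{\alpha/2,\,df}\,\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}$
■ $t_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$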
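○ Do not assume $\sigma_1^2 = \sigma_2^2$:
■ Degrees of freedom: $df = \dfrac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{S_1^4}{n_1^2\,(n_1 - 1)} + \dfrac{S_2^4}{n_2^2\,(n_2 - 1)}}$
■ $E = t_{\alpha/2,\,df}\,\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$
■ $t_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$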
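● For proportions:
○ $(l, u) = (\hat{P}_1 - \hat{P}_2 - E,\ \hat{P}_1 - \hat{P}_2 + E)$
○ $E = z_{\alpha/2}\,\sqrt{\dfrac{\hat{P}_1(1 - \hat{P}_1)}{n_1} + \dfrac{\hat{P}_2(1 - \hat{P}_2)}{n_2}}$
○ $\hat{P} = \dfrac{x_1 + x_2}{n_1 + n_2}$ (where $x_i = n_i\,\hat{P}_i$)
○ $z_0 = \dfrac{\hat{P}_1 - \hat{P}_2 - \Delta_0}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$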
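The two-sample statistics in this section can be sketched as plain functions. The names below and the sample summaries in the example are hypothetical; critical values for the t distribution are not computed here, since the standard library has no t quantile function, so only the test statistics and degrees of freedom are returned.

```python
from math import sqrt


def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled variance S_p^2, used when the two variances are assumed equal."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_t(xbar1, xbar2, n1, s1_sq, n2, s2_sq, delta0=0.0):
    """Return (t0, df) assuming equal population variances."""
    sp_sq = pooled_variance(n1, s1_sq, n2, s2_sq)
    t0 = (xbar1 - xbar2 - delta0) / sqrt(sp_sq / n1 + sp_sq / n2)
    return t0, n1 + n2 - 2


def unequal_variance_t(xbar1, xbar2, n1, s1_sq, n2, s2_sq, delta0=0.0):
    """Return (t0, df) without assuming equal population variances."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    t0 = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, df


def two_proportion_z(x1, n1, x2, n2, delta0=0.0):
    """Return z0 for two proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    return (p1 - p2 - delta0) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))


# Hypothetical summaries: n1 = 10, S1^2 = 4, n2 = 12, S2^2 = 9, means 20 and 18.
sp_sq = pooled_variance(10, 4.0, 12, 9.0)
t_pooled, df_pooled = pooled_t(20.0, 18.0, 10, 4.0, 12, 9.0)
t_unequal, df_unequal = unequal_variance_t(20.0, 18.0, 10, 4.0, 12, 9.0)
z_props = two_proportion_z(45, 100, 30, 100)
print(sp_sq, t_pooled, df_pooled, t_unequal, df_unequal, z_props)
```

The degrees of freedom in the unequal-variance case are generally not an integer; a common convention is to round down before looking up the critical value.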
VIII. Linear Regression
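● $S_{XX} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\,\bar{x}^2$
● $S_{YY} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - n\,\bar{y}^2$
● Slope: $\hat{\beta}_1 = \dfrac{S_{XY}}{S_{XX}} = \dfrac{\sum x_i y_i - n\,\bar{x}\,\bar{y}}{\sum x_i^2 - n\,\bar{x}^2}$
● Intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\,\bar{x}$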
● Error sum of squares: $SSE = \sum (y_i - \hat{y}_i)^2$
● Regression sum of squares: $SSR = \sum (\hat{y}_i - \bar{y})^2$
● Total sum of squares: $SST = \sum (y_i - \bar{y})^2$
● SSE + SSR = SST
● Standard error: $\hat{\sigma} = \sqrt{\dfrac{SSE}{n - 2}}$
● Coefficient of correlation: $R = \dfrac{S_{XY}}{\sqrt{S_{XX}\,S_{YY}}}$, with $R^2 = \dfrac{SSR}{SST}$
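● Test claims about the slope (df = n-2):
○ $se(\hat{\beta}_1) = \sqrt{\dfrac{\hat{\sigma}^2}{S_{XX}}}$
○ $t_0 = \dfrac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$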
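● Test claims about the intercept (df = n-2):
○ $se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{XX}}\right)}$
○ $t_0 = \dfrac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)}$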
● Test claims about the coefficient of correlation (df = n-2): $t_0 = \dfrac{R - 0}{\sqrt{\dfrac{1 - R^2}{n - 2}}}$
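The regression quantities in this section can be computed step by step from raw data. The following is a minimal sketch using only the standard library; the function name, the dictionary keys, and the five-point data set are made up for illustration.

```python
from math import sqrt


def simple_linear_regression(xs, ys):
    """Compute the regression summaries from paired observations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum(x * x for x in xs) - n * x_bar**2
    syy = sum(y * y for y in ys) - n * y_bar**2
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * x for x in xs]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ssr = sum((f - y_bar) ** 2 for f in fitted)
    sigma_hat = sqrt(sse / (n - 2))
    r = sxy / sqrt(sxx * syy)
    return {
        "slope": slope,
        "intercept": intercept,
        "sse": sse,
        "ssr": ssr,
        "sst": syy,  # SST equals S_YY
        "sigma_hat": sigma_hat,
        "r": r,
        "se_slope": sqrt(sigma_hat**2 / sxx),
        "se_intercept": sqrt(sigma_hat**2 * (1 / n + x_bar**2 / sxx)),
        # t statistic for the claim that the correlation is 0 (df = n - 2)
        "t_r": r / sqrt((1 - r**2) / (n - 2)),
    }


# Hypothetical data set with five observations.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
t_slope = (fit["slope"] - 0) / fit["se_slope"]  # test of slope = 0
print(fit["slope"], fit["intercept"], fit["r"], t_slope, fit["t_r"])
```

In simple linear regression the t statistic for a zero slope and the t statistic for a zero correlation coincide, which gives a convenient consistency check.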