
Stats Cheat Sheet

This document provides a cheat sheet on key concepts in applied statistics, including:
- Definitions of probability, joint probability, conditional probability, and other probability concepts.
- Descriptions of discrete and continuous random variables, their probability mass/density functions, expected values, and variance.
- Explanations of statistical distributions like binomial, normal, uniform and t-distributions.
- Formulas and concepts for statistical inference like confidence intervals, hypothesis testing, and regression analysis.
- Notes on sampling, populations, descriptive statistics like means, medians, variance and how to calculate them.
It serves as a one-page reference guide summarizing essential statistical terminology, formulas, and concepts for

Uploaded by

kaungwaiphyo89
Applied Statistics Cheat Sheet_2017

Probability

Both A and B: $A \cap B$ (intersection)
Either A or B: $A \cup B$ (union)
If $A \cap B = \emptyset$, A and B are mutually exclusive.
If $A \cup B$ is the whole sample space, so that $P(A \cup B) = 1$, A and B are collectively exhaustive.
$P(A)$ = number of outcomes in event A / total number of outcomes in the sample space

Joint Probability, Marginal Probability

            $B_1$                $B_2$                Marginal
$A_1$       $P(A_1 \cap B_1)$    $P(A_1 \cap B_2)$    $P(A_1)$
$A_2$       $P(A_2 \cap B_1)$    $P(A_2 \cap B_2)$    $P(A_2)$
Marginal    $P(B_1)$             $P(B_2)$             1

Conditional Probability

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, so $P(A \cap B) = P(A \mid B)\,P(B)$
If $P(A \cap B) = P(A)\,P(B)$, A and B are statistically independent; then $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$.
Complement rule: $P(\bar{A}) = 1 - P(A)$
Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
If A and B are statistically independent: $P(A \cup B) = P(A) + P(B) - P(A)\,P(B)$
If $P(A \cap B) = 0$: $P(A \cup B) = P(A) + P(B)$
De Morgan's laws: $\overline{A \cup B} = \bar{A} \cap \bar{B}$ and $\overline{A \cap B} = \bar{A} \cup \bar{B}$

Population size = $N$, population mean = $\mu$
Sample size = $n$, sample mean = $\bar{X}$
"Mean > Median" = positive or right skewed (the distribution tails off to the right)
"Mean = Median" = symmetric
"Mean < Median" = negative or left skewed
Mean: population $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, sample $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Variance:
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}$
Standard deviation (SD) = $\sqrt{\text{Variance}}$
Coefficient of variation (CV) = SD / mean
Range = x (largest) - x (smallest)

Covariance between X and Y (not CV)
Population covariance: $\sigma_{xy} = \mathrm{Cov}(x,y) = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
Sample covariance: $s_{xy} = \mathrm{Cov}(x,y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$

Correlation coefficient (CC): $r_{xy} = \frac{\text{Covariance between X and Y}}{\text{SD of X} \times \text{SD of Y}}$
The result is between -1 and 1, where -1 means perfectly negatively correlated, 1 means perfectly positively correlated, and 0 means no linear correlation at all.

Regression Analysis

Important: Y = dependent variable (e.g. exam score), X = independent variable (e.g. study hours)
$Y = \beta_0 + \beta_1 X$
There can be more than one independent variable (with coefficients $\beta_2, \beta_3, \beta_4, \ldots$), and those variables can be compared by their t-stats.
If the absolute value of the t-stat is less than 2, the variable is "statistically insignificant"; if it is higher than 2, the variable is "statistically significant" (a rule of thumb, roughly the 5% level in large samples).

Discrete Random Variables

Probability mass function: $P(x) = P(X = x)$, $\sum_x P(x) = 1$
Weighted average: $\bar{X} = \frac{\sum_i W_i X_i}{\sum_i W_i} = \frac{W_1 X_1 + W_2 X_2 + \cdots + W_n X_n}{W_1 + W_2 + \cdots + W_n}$ (the denominator equals $n$ when the weights are frequencies that add up to $n$)
Cumulative probability function: $F(x_0) = P(X \le x_0) = \sum_{x \le x_0} P(x)$
Expected value: $E(X) = \mu_x = \sum_x x\,P(x)$; expected value of a function: $E[g(X)] = \sum_x g(x)\,P(x)$ (similar concept to an average)
Variance: $\sigma_x^2 = E[(X - \mu_x)^2] = \sum_x (x - \mu_x)^2 P(x)$
Linear function of a random variable, $Y = a + bX$: mean $\mu_y = a + b\mu_x$, variance $\sigma_y^2 = b^2 \sigma_x^2$, standard deviation $\sigma_y = |b|\,\sigma_x$
Standardization of a random variable: $Z = \frac{X - \mu_x}{\sigma_x}$, $E(Z) = 0$, $\mathrm{Var}(Z) = 1$

Bernoulli trial: P(success) = $\pi$, P(failure) = $1 - \pi$, mean = $\pi$, variance = $\pi(1 - \pi)$
Combination: $C_x^n = \frac{n!}{x!\,(n-x)!}$ (on a calculator: nCr)
Binomial distribution (BD): $P(x) = C_x^n\, \pi^x (1 - \pi)^{n-x}$
Mean of BD: $\mu_x = E(X) = n\pi$; variance of BD: $\sigma_x^2 = E[(X - \mu_x)^2] = n\pi(1 - \pi)$

Joint probability function: $P(x,y) = P(X = x \cap Y = y)$

                 $Y = y_1$        $Y = y_2$        Marginal
$X = x_1$        $P(x_1,y_1)$     $P(x_1,y_2)$     $P(x_1)$
$X = x_2$        $P(x_2,y_1)$     $P(x_2,y_2)$     $P(x_2)$
Marginal         $P(y_1)$         $P(y_2)$         1

Marginal probability function: $P(x) = \sum_y P(x,y)$, $P(y) = \sum_x P(x,y)$
Covariance: $\mathrm{Cov}(X,Y) = E[(X - \mu_x)(Y - \mu_y)] = \sum_x \sum_y (x - \mu_x)(y - \mu_y)\,P(x,y)$
or $\mathrm{Cov}(X,Y) = E(XY) - \mu_x \mu_y = \sum_x \sum_y x y\,P(x,y) - \mu_x \mu_y$
Correlation: $\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma_x \sigma_y}$
>>> If two random variables are statistically independent, $P(x,y) = P(x)\,P(y)$ and $\mathrm{Cov}(X,Y) = 0$.
Conditional probability function of X, given Y = y: $P(x \mid y) = \frac{P(x,y)}{P(y)}$
Portfolio Analysis: $W = aX + bY$, mean $\mu_w = E[aX + bY] = a\mu_x + b\mu_y$
Variance: $\sigma_w^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\mathrm{Cov}(X,Y)$
(or) $\sigma_w^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\mathrm{Corr}(X,Y)\,\sigma_x \sigma_y$

Continuous Random Variables

Probability density function: $P(X = x) = 0$, $f(x) \ge 0$, total area under $f(x)$ = 1; for a uniform distribution on $[a, b]$, $f(x) = \frac{1}{b-a}$
Cumulative distribution function: $F(x) = P(X \le x)$, $P(a < X < b) = F(b) - F(a) = \int_a^b f(u)\,du$
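As a rough numerical check of the discrete random variable formulas above (expected value, variance, and the binomial distribution), the short Python sketch below builds a binomial probability function and confirms that the probabilities sum to 1, the mean is nπ, and the variance is nπ(1-π). The function names and the values n = 10, π = 0.3 are illustrative assumptions, not part of the cheat sheet.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(x) = C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def mean_and_variance(pmf):
    """E(X) and Var(X) for a discrete distribution given as {x: P(x)}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


# Hypothetical example values (not from the cheat sheet).
n, pi = 10, 0.3
pmf = {x: binomial_pmf(x, n, pi) for x in range(n + 1)}

total = sum(pmf.values())           # should be 1
mean, var = mean_and_variance(pmf)  # should match n*pi and n*pi*(1 - pi)

print(f"sum of P(x) = {total:.6f}")
print(f"mean = {mean:.4f} (n*pi = {n * pi:.4f})")
print(f"variance = {var:.4f} (n*pi*(1-pi) = {n * pi * (1 - pi):.4f})")
```

The same `mean_and_variance` helper works for any discrete distribution written as a dictionary of value-probability pairs, so it can be reused to check a Bernoulli trial or a hand-made probability table.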

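Expectation for continuous random variables: $E(X) = \mu_x = \int_{x_{\text{lowest}}}^{x_{\text{highest}}} x f(x)\,dx$, $E[g(X)] = \int_{x_{\text{lowest}}}^{x_{\text{highest}}} g(x) f(x)\,dx$
$\mathrm{Var}(X) = \sigma_x^2 = \int_{x_{\text{lowest}}}^{x_{\text{highest}}} (x - \mu_x)^2 f(x)\,dx$, $\mathrm{SD}(X) = \sigma_x = \sqrt{\mathrm{Var}(X)}$
Uniform Distribution: mean $\mu = \frac{a+b}{2}$, variance $\sigma^2 = \frac{(b-a)^2}{12}$
Linear Functions of Variables: $W = a + bX$, mean $\mu_w = a + b\mu_x$, $\mathrm{Var}(W) = \sigma_w^2 = b^2\sigma_x^2$, $\mathrm{SD}(W) = \sigma_w = |b|\,\sigma_x$
Standardized random variable: $Z = \frac{X - \mu_x}{\sigma_x}$, mean = 0, variance = 1

Normal Distribution: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$ ($e = 2.71828$, $\pi = 3.14159$), $X \sim N(\mu, \sigma^2)$
Excel function: NORM.DIST(x, μ, σ, FALSE)
>>> The greater the standard deviation, the flatter the distribution.
Cumulative Distribution Function for Normal Distribution: $F(x_0) = P(X \le x_0)$, $P(a < X < b) = F(b) - F(a)$
Standard Normal Distribution: $Z \sim N(0,1)$, $f(Z)$ for the PDF, $F(Z)$ for the CDF
$F(-2) = P(Z < -2) = 1 - P(Z < 2)$, $P(Z > -2.25) = P(Z < 2.25)$
Transform any normal distribution to $N(0,1)$: $Z = \frac{X - \mu}{\sigma}$, $X = \mu + Z\sigma$
$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$

Covariance: $\mathrm{Cov}(X,Y) = E[(X - \mu_x)(Y - \mu_y)]$ or $E(XY) - \mu_x \mu_y$
>>> If X and Y are jointly normally distributed, then we can say: if $\mathrm{Cov}(X,Y) = 0$, then X and Y are independent.
Correlation: $\rho = \mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma_x \sigma_y}$
Portfolio Analysis II: $E(aX + bY) = a\mu_x + b\mu_y$
$\mathrm{Var}(aX + bY) = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\mathrm{Cov}(X,Y) = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\mathrm{Corr}(X,Y)\,\sigma_x \sigma_y$

Sampling Distribution

$E(\bar{X}) = \mu$, $\sigma_{\bar{X}}^2 = \sigma^2 / n$ ($\sigma^2$ is the population variance), so $\sigma^2 > \sigma_{\bar{X}}^2$ for $n > 1$
>>> $\sigma_{\bar{X}}^2$ decreases as the sample size increases. The random variable X can follow any distribution.
Bernoulli distribution: the success probability $\pi$ is estimated by the sample proportion $\bar{p}$
The distribution of the sample mean from a normal population: $\bar{X} \sim N(\mu, \sigma^2 / n)$

($\sigma^2$ known) Confidence level = $100(1 - \alpha)\%$, where $\alpha$ is the significance level
$P(Z > Z_{\alpha/2}) = \alpha/2$ in each tail, so both tails together hold $\alpha$
The confidence interval estimator for the population mean:
$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ = Margin of Error (ME)
$\bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ = Upper Confidence Limit (UCL)
$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ = Lower Confidence Limit (LCL)

($\sigma^2$ unknown) t-distribution: $P(t_v > t_{v,\alpha/2}) = \alpha/2$ in each tail
The confidence interval estimator for the population mean:
$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$, where s = sample standard deviation

Hypothesis (population distribution is normal with unknown mean and variance)

The null hypothesis, e.g. $H_0: \mu = 8\%$
The alternative hypothesis, e.g. $H_1: \mu \ne 8\%$
($\sigma^2$ known) Z-test: $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
one-sided test: critical value $Z_{\alpha}$ ($H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$)
two-sided test: critical value $Z_{\alpha/2}$ ($H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$)
($\sigma^2$ unknown) t-statistic: $t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
one-sided test: critical value $t_{n-1,\alpha}$; two-sided test: critical value $t_{n-1,\alpha/2}$
For the two-sided test, reject $H_0$ if the absolute value of the t-statistic is greater than $t_{n-1,\alpha/2}$; otherwise do not reject $H_0$.

Testing the difference in the population means between two different samples
(distributions for both groups are normal, population means may differ but the population $\sigma^2$ is the same)
one-sided test: $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y > 0$
two-sided test: $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y \ne 0$
Pooled sample variance: $s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$
t-stat: $t = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}}$
Cutoff point: $t_{n_x + n_y - 2,\alpha}$
For the one-sided test, reject the null hypothesis $H_0$ if the t-statistic is greater than $t_{n_x + n_y - 2,\alpha}$; for the two-sided test, compare the absolute value of the t-statistic with $t_{n_x + n_y - 2,\alpha/2}$.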
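The confidence interval with known σ, the Z-test statistic, and the pooled two-sample t-statistic given above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names and all sample numbers are hypothetical, and the t critical value is left to a table or a statistics package because the standard library does not provide it.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def z_interval(xbar, sigma, n, alpha=0.05):
    """Return (LCL, UCL, ME) for the population mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    me = z_crit * sigma / sqrt(n)                 # margin of error
    return xbar - me, xbar + me, me


def z_statistic(xbar, mu0, sigma, n):
    """Z = (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))


def pooled_t_statistic(x, y):
    """Pooled sample variance and t-statistic for H0: mu_x - mu_y = 0."""
    nx, ny = len(x), len(y)
    # statistics.variance is the sample variance (divides by n - 1).
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 / nx + sp2 / ny)
    return sp2, t


# Hypothetical numbers chosen only to exercise the formulas.
lcl, ucl, me = z_interval(xbar=50.0, sigma=10.0, n=25)
z = z_statistic(xbar=50.0, mu0=46.0, sigma=10.0, n=25)
sp2, t = pooled_t_statistic([5.0, 7.0, 9.0], [1.0, 2.0, 3.0, 4.0])

print(f"95% CI: ({lcl:.3f}, {ucl:.3f}), ME = {me:.3f}")
print(f"Z = {z:.3f}")
print(f"pooled variance = {sp2:.3f}, t = {t:.3f}")
```

To finish a test, compare `z` with the critical value from `NormalDist().inv_cdf`, and compare `t` with the tabulated cutoff for `nx + ny - 2` degrees of freedom.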
