ST Formula Sheet Midterm

The document is a midterm formula sheet covering statistical methods, sampling techniques, types of data, key measures, data display, relationships between variables, probability, expected value, and distributions. It outlines various statistical concepts such as descriptive and inferential statistics, measures of central tendency, variability, and probability rules. Additionally, it includes formulas for binomial and Poisson distributions, as well as methods for analyzing data relationships and distributions.


summaryking.escp.b1 Midterm Formula Sheet

Statistical approach

Statistical methods
§ Descriptive: Description of the properties of the sample data
§ Inferential: Using data from a sample to make forecasts about a larger group

Ways of obtaining data
§ Published source
§ Designed experiment
§ Survey
§ Observational study

Steps in a statistical study
1. Identify goals.
2. Draw a sample from a population.
3. Collect raw data and summarise.
4. Make inferences about the population.
5. Draw conclusions.

Sampling techniques
§ Random sampling: Selection from the population in such a way that every different sample of the same size has an equal chance of selection
§ Systematic sampling: Selection of every kth experimental unit from a list of all experimental units
§ Stratified sampling: Identification of subgroups, selection of a random sample within each subgroup, and putting them together
§ Cluster sampling: Division of a population into clusters and random selection of some of these clusters
§ Convenience sampling: Selection of experimental units that are convenient to reach

Types of data
§ Qualitative (Categorical): Description of attributes
  - Nominal: No order (e.g. hair colour)
  - Ordinal: Order on a scale (e.g. ranking)
§ Quantitative (Numerical): Measures or counts
  - Discrete: Integers (e.g. number of people)
  - Continuous: Decimals (e.g. speed)
§ Identifier variable = Categorical variable with the special property that there is only one case in each category (e.g. ID number)

Key measures

Example data set (grades on a test, n = 18):
3, 5, 7, 7, 8, 8, 9, 10, 11, 11, 11, 12, 12, 14, 15, 16, 18, 18

Mean = Average of a data set (affected by extreme values)
x̄ = (Σ xᵢ) / n = (x₁ + x₂ + … + xₙ) / n
Example: x̄ = (3 + 5 + 7 + … + 18 + 18) / 18 = 10.83

Median = Middle of a data set (not affected by extreme values)
Position of the median in an ordered sequence = (n + 1) / 2
Example: Position of Md = (18 + 1) / 2 = 9.5 (between the 9th and 10th value) → Md = 11

Mode = Most common value in a data set (not affected by extreme values)
Example: Mo = 11
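The three centre measures above can be checked with Python's standard-library `statistics` module, using the sheet's grades data set:

```python
# Mean, median, and mode of the grades example from the sheet,
# computed with the standard-library statistics module.
import statistics

grades = [3, 5, 7, 7, 8, 8, 9, 10, 11, 11, 11, 12, 12, 14, 15, 16, 18, 18]

mean = statistics.mean(grades)      # sum / n
median = statistics.median(grades)  # average of the 9th and 10th values here
mode = statistics.mode(grades)      # most common value

print(round(mean, 2), median, mode)  # 10.83 11.0 11
```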
Range = Difference between the largest and the smallest value
Range = x_largest − x_smallest
Example: R = 18 − 3 = 15

Variance = Degree of dispersion
s² = Σ (xᵢ − x̄)² / (n − 1) = ((x₁ − x̄)² + (x₂ − x̄)² + … + (xₙ − x̄)²) / (n − 1)
(Denominator is n − 1 when using a sample and n when using a population)
Example: s² = ((3 − 10.83)² + (5 − 10.83)² + … + (18 − 10.83)²) / (18 − 1) = 17.91

Standard deviation = Degree of dispersion
s = √s² = √(Σ (xᵢ − x̄)² / (n − 1))
Example: s = √17.91 = 4.23
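The spread measures can be verified the same way; note how `variance`/`stdev` use the n − 1 denominator (sample) while `pvariance` uses n (population):

```python
# Range, sample variance, and sample standard deviation of the
# grades example, via the standard-library statistics module.
import statistics

grades = [3, 5, 7, 7, 8, 8, 9, 10, 11, 11, 11, 12, 12, 14, 15, 16, 18, 18]

r = max(grades) - min(grades)      # range
s2 = statistics.variance(grades)   # sample variance, denominator n - 1
s = statistics.stdev(grades)       # sample standard deviation
p2 = statistics.pvariance(grades)  # population variance, denominator n

print(r, round(s2, 2), round(s, 2))  # 15 17.91 4.23
```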

Data display

Qualitative data
§ Table of counts
§ Pie chart
§ Bar graph
§ Pareto diagram (bars arranged in descending order)

Quantitative data
§ Stem-and-leaf plot
§ Dot plot
§ Absolute frequency histogram (absolute number of a specific event)
§ Relative frequency histogram (proportion of a specific event within the total number)

Stem-and-leaf example (Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41):
2 | 144677
3 | 028
4 | 1

Examining a distribution

Aspects to examine: Mode, Skewness, Unusual features, Variation

Unusual features
§ Outlier
§ Cluster
§ Gap
Whenever one of those is present, it is better to use the median instead of the mean.

Variation
Coefficient of variation: CV = s / x̄
Interpretation:
CV < 1 → Low variability
CV > 1 → High variability
The higher the CV, the greater the level of dispersion around the mean.

Relative standing
Z-score: z = (x − x̄) / s
Interpretation:
z > 0 → Data value is above the mean
z < 0 → Data value is below the mean
z close to 0 → Data value is not unusual
|z| > 2 → Data value is unusual
|z| > 3 → Data value is very unusual
pth percentile = Number such that p% of the data fall below it

Box plot: Each of the four sections (lower whisker to Q1, Q1 to Md, Md to Q3, Q3 to upper whisker) covers 25% of the data; the box spans the interquartile range.

Empirical Rule (only normal (bell-shaped) distributions) vs. Chebyshev's Rule (any distribution)
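The coefficient of variation and z-score can be sketched in a few lines, reusing the sheet's grades example (x̄ = 10.83, s = 4.23):

```python
# Relative-standing sketch: coefficient of variation and z-score,
# with the mean and standard deviation from the grades example.
def z_score(x, mean, sd):
    """z = (x - mean) / sd: how many standard deviations x is from the mean."""
    return (x - mean) / sd

mean, sd = 10.83, 4.23
cv = sd / mean  # coefficient of variation

print(round(cv, 2))                     # 0.39 -> CV < 1: low variability
print(round(z_score(3, mean, sd), 2))   # -1.85 -> below the mean, |z| < 2: not unusual
```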

Relationship between quantitative variables

Scatterplot
Each case in the data set is assigned to a dot of the form (Xᵢ, Yᵢ).
Y-axis: Response / Predicted / Dependent variable
X-axis: Explanatory / Predictor / Independent variable

Elements to describe a scatterplot:

Direction
§ Positive slope: As X increases, Y increases.
§ Negative slope: As X increases, Y decreases.

Shape
§ Linear
§ Curve (→ straighten with transformation)

Strength
§ Strong relationship
§ Moderate relationship
§ No relationship

Unusual features
§ Outliers
§ Clusters / Subgroups

Regression line (Line of best fit)
Equation: ŷ = b₀ + b₁x with b₁ = r · (s_y / s_x) and b₀ = ȳ − b₁ · x̄
Interpolation: Using values within the domain
Extrapolation: Using values outside the domain

Correlation conditions
1. Both variables must be quantitative.
2. The shape of the scatterplot must be linear (a straight line).
3. Outliers do not distort the results.

Correlation coefficient r
Interpretation: r lies between −1 and +1; the sign gives the direction, and the closer |r| is to 1, the stronger the linear relationship.

Residual
Residual = Observed − Predicted = y − ŷ
Interpretation:
Negative residual: Overestimate
Positive residual: Underestimate

Residual scatterplot
If a regression model was well done, the obtained residual scatterplot (plotting residuals against x-values or predicted values) stretches horizontally, has no bends, no or very few outliers, and an approximate bell shape.

Coefficient of determination
r² = Percentage of the variation in Y which has been accounted for by the model

Standard deviation of residuals s_e
= Level of variation of the y-values around the fitted line

Relationship between categorical variables

Contingency table
§ (Absolute) Count
§ Relative frequency

Independence
Two events A and B are independent if: P(A∩B) = P(A) · P(B)

Simpson's Paradox
Statistical situation in which a trend or relationship that is observed between two variables within multiple groups disappears when the groups are combined according to a third variable (lurking variable).

Chi-squared statistic
Difference between the observed counts and the counts that would be expected if there were no relationship between the variables at all:
X² = Σ (Observed − Expected)² / Expected
X² = 0 → Total independence (never in real life)

Cramer's V
V = √(X² / (n · (k − 1))) with k = smaller of the number of rows and columns
The higher V, the stronger the association between the variables.
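A small sketch of the chi-squared statistic and Cramer's V for a 2×2 contingency table; the counts are made up for illustration:

```python
# Chi-squared statistic and Cramer's V for an illustrative 2x2 table.
import math

observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        chi2 += (o - e) ** 2 / e

k = min(len(observed), len(observed[0]))       # smaller of rows, columns
v = math.sqrt(chi2 / (n * (k - 1)))            # Cramer's V

print(round(chi2, 2), round(v, 2))  # 16.67 0.41
```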

Probability

Fundamental conditions
1. For any event A: 0 ≤ P(A) ≤ 1
2. P(S) = 1 (with S representing the set of all possible outcomes)

Complement
The set of outcomes that are not in the event A is called the complement of A, denoted Aᶜ.

Disjoint
Events that have no outcomes in common are called disjoint or mutually exclusive.

Independent
If the outcome of one event does not influence the outcome of another event, those events are independent.
Independent if: P(A∩B) = P(A) · P(B)

Calculation rules
Complement Rule: P(Aᶜ) = 1 − P(A)
Addition Rule (for disjoint events): P(A∪B) = P(A or B) = P(A) + P(B)
Multiplication Rule (for independent events): P(A∩B) = P(A and B) = P(A) · P(B)
General Addition Rule: P(A∪B) = P(A or B) = P(A) + P(B) − P(A∩B)
General Multiplication Rule: P(A∩B) = P(A and B) = P(A) · P(B|A) = P(B) · P(A|B)

Law of large numbers
As a random trial is repeated over and over again, the proportion of times that an event occurs gets closer and closer to a single value (empirical probability).
Empirical probability (in the long run): P(A) = (Number of times A occurs) / (Number of trials)
Requirements:
1. The probability for each event remains the same for each trial.
2. The outcome of a trial is not influenced by the outcome of previous trials.
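The law of large numbers can be sketched with a quick simulation; a fair coin (P(heads) = 0.5) is an assumed example, not from the sheet:

```python
# Law of large numbers sketch: the empirical probability of heads
# approaches the true probability 0.5 as the number of trials grows.
import random

random.seed(42)  # fixed seed so the run is reproducible

for trials in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(trials))
    print(trials, heads / trials)  # proportion drifts toward 0.5
```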
Conditional probability
P(B|A) is the probability of event B occurring, given that event A occurs.
P(B|A) = P(B given A) = P(A∩B) / P(A)

Sampling without replacement (the drawn individual does not return to the pool) is an instance of working with conditional probability. When dealing with a large population, sampling without replacement does not really matter. However, in a small population, probabilities need to be adjusted accordingly.

Circle notation (Venn diagrams)
§ Whole circles: P(A) = whole left circle, P(B) = whole right circle, so P(A∪B) = P(A) + P(B) − P(A∩B)
§ Non-overlapping regions: P(A only) = left circle without the intersection, P(B only) = right circle without the intersection, so P(A∪B) = P(A only) + P(B only) + P(A∩B)

Tree diagram
All final outcomes are disjoint, and their probabilities must add up to 1. To calculate the probability of a final outcome, all probabilities of the branches leading towards that outcome are multiplied together.

Bayes' Rule
P(A|B) = P(B|A) · P(A) / P(B)

Example:
Given: P(Cancer) = 0.05, P(Smoker) = 0.10, P(Smoker|Cancer) = 0.20
→ P(Cancer|Smoker) = (0.20 · 0.05) / 0.10 = 0.1
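The sheet's Bayes example can be checked numerically:

```python
# Bayes' Rule: P(A|B) = P(B|A) * P(A) / P(B), applied to the
# sheet's cancer/smoker example.
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A), and P(B)."""
    return p_b_given_a * p_a / p_b

# P(Cancer) = 0.05, P(Smoker) = 0.10, P(Smoker|Cancer) = 0.20
p = bayes(p_b_given_a=0.20, p_a=0.05, p_b=0.10)
print(round(p, 2))  # 0.1
```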

Expected value
= Long-run average value of repeated trials of a statistical experiment
General formula: μ = E(X) = P(X = x₁) · x₁ + … + P(X = xₙ) · xₙ = Σ P(X = xᵢ) · xᵢ

Measures of variability
Variance: σ² = Var(X) = Σ (xᵢ − μ)² · P(X = xᵢ)
Standard deviation: σ = √Var(X)
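The formulas above can be sketched for a discrete random variable; a fair die roll is an assumed example, not from the sheet:

```python
# Expected value, variance, and standard deviation of a discrete
# random variable (fair six-sided die as an example distribution).
import math

outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = sum(x * p for x, p in zip(outcomes, probs))               # E(X)
var = sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))  # Var(X)
sd = math.sqrt(var)

print(round(mu, 2), round(var, 2), round(sd, 2))  # 3.5 2.92 1.71
```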

Binomial model

Bernoulli trials
Trials with only two possible outcomes (success and failure), where the probabilities are p for success and q = 1 − p for failure, and for which successive trials are independent.
→ The binomial model examines the number of successful trials out of a total of n Bernoulli trials.

Probability
If there are n Bernoulli trials with a probability of success p, the probability of having k successful trials can be calculated like this:
P(X = k) = B(n, p, k) = (n choose k) · pᵏ · qⁿ⁻ᵏ with (n choose k) = n! / (k! · (n − k)!)

Expected value: E(X) = n · p
Variance: Var(X) = n · p · q
Standard deviation: σ = √(n · p · q)
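The binomial probability formula maps directly onto `math.comb`; n = 10 coin flips with p = 0.5 is an assumed example:

```python
# Binomial pmf: P(X = k) = C(n, k) * p**k * q**(n - k), with q = 1 - p.
import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5
print(round(binom_pmf(n, p, 5), 4))  # 0.2461 = P(exactly 5 heads)
print(n * p)                         # E(X) = 5.0
print(n * p * (1 - p))               # Var(X) = 2.5
```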

Poisson distribution
Limiting case of the binomial distribution with a large number of trials (n → ∞) and a small probability of success (p → 0).
If a random variable X follows a Poisson distribution, the probability of having x events per unit of measurement is given by:
P(X = x) = (λˣ · e⁻λ) / x! with λ = Mean number of events per unit of measurement

E(X) = Var(X) = λ
σ = √λ
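The Poisson pmf is likewise a one-liner; λ = 2 events per unit is an assumed example:

```python
# Poisson pmf: P(X = x) = lam**x * exp(-lam) / x!.
import math

def poisson_pmf(lam, x):
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 2.0  # mean number of events per unit of measurement
print(round(poisson_pmf(lam, 0), 4))  # 0.1353 = P(no events)
print(round(poisson_pmf(lam, 2), 4))  # 0.2707 = P(exactly 2 events)
print(round(math.sqrt(lam), 2))       # 1.41 = standard deviation sqrt(lam)
```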
