Statistics 05.05
$$\sum_{i=1}^{k} n_i = n; \qquad \sum_{i=1}^{k} f_i = 1$$
$$n_1 = N_1 \le N_2 \le \dots \le N_k = n;$$
$$f_1 = F_1 \le F_2 \le \dots \le F_k = 1.$$
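The frequency identities above can be checked numerically. The following is a minimal sketch, assuming the ten-value data set used later in the mean example; the variable names (`abs_freq`, `cum_abs`, and so on) are illustrative and not part of the notes.

```python
from collections import Counter
from itertools import accumulate

# Data set borrowed from the mean example later in these notes (assumption).
data = [2, 3, 4, 5, 8, 2, 4, 5, 7, 2]
n = len(data)

counts = Counter(data)
values = sorted(counts)                   # distinct values x_i
abs_freq = [counts[v] for v in values]    # absolute frequencies n_i
rel_freq = [c / n for c in abs_freq]      # relative frequencies f_i = n_i / n
cum_abs = list(accumulate(abs_freq))      # cumulative absolute frequencies N_i
cum_rel = [c / n for c in cum_abs]        # cumulative relative frequencies F_i

print(values)    # distinct values in ascending order
print(abs_freq)  # sums to n
print(cum_abs)   # non-decreasing, ends at n
print(cum_rel)   # non-decreasing, ends at 1
```

The cumulative lists start at the first frequency and end at n and 1 respectively, which is what the two chains of inequalities state.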
$$\sum_{i=1}^{k} n_{ix} = n; \qquad \sum_{j=1}^{l} n_{xj} = n;$$
$$\sum_{i=1}^{k} f_{ix} = 1; \qquad \sum_{j=1}^{l} f_{xj} = 1;$$
The mean of the sum of two variables is the sum of their means.
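$$\overline{x + y} = \bar{x} + \bar{y}$$
e.g., given the variables
$$x = [1, 2, 3]; \quad \bar{x} = 2$$
$$y = [4, 5, 6]; \quad \bar{y} = 5$$
$$x + y = [5, 7, 9]; \quad \overline{x + y} = \bar{x} + \bar{y} = 2 + 5 = 7.$$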
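The additivity of the mean can be verified with the two variables from the example. This is a small sketch using `statistics.fmean` from the Python standard library; the names `mean_of_sum` and `sum_of_means` are illustrative.

```python
from statistics import fmean

x = [1, 2, 3]
y = [4, 5, 6]

# Element-wise sum of the two variables.
sums = [a + b for a, b in zip(x, y)]

mean_of_sum = fmean(sums)              # mean of (x + y)
sum_of_means = fmean(x) + fmean(y)     # mean of x plus mean of y

print(mean_of_sum, sum_of_means)
```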
The sum of deviations from the mean equals zero, reflecting its
balance-point property. Relatedly, the mean is the value that minimizes
the sum of squared deviations (the least-squares property).
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0;$$
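Both properties can be illustrated numerically. The sketch below reuses x = [1, 2, 3] from the previous example and compares the sum of squared deviations at the mean against a few nearby candidate values; the candidate list is an arbitrary choice made for illustration.

```python
x = [1, 2, 3]
mean = sum(x) / len(x)

# Balance-point property: deviations from the mean cancel out.
deviation_sum = sum(xi - mean for xi in x)

def sum_squared_deviations(v):
    """Sum of squared deviations of x around an arbitrary value v."""
    return sum((xi - v) ** 2 for xi in x)

# Least-squares property: among these candidates, the mean gives the smallest sum.
candidates = [mean - 1, mean - 0.1, mean, mean + 0.1, mean + 1]
best = min(candidates, key=sum_squared_deviations)

print(deviation_sum)
print(best == mean)
```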
Calculation Methods,
2.1.1.1 Raw Data, Exact Calculation of the Mean
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i; \qquad \bar{x} = \frac{\sum_{i=1}^{k} n_i \cdot x_i}{n}$$
e.g., for a data set x with $n = 10$,
$$x = \{2, 3, 4, 5, 8, 2, 4, 5, 7, 2\};$$
$$\bar{x} = \frac{1}{10}(2 + 3 + 4 + 5 + 8 + 2 + 4 + 5 + 7 + 2) = \frac{42}{10} = 4{,}2;$$
Grouping the distinct values $\{2, 3, 4, 5, 7, 8\}$ with their absolute frequencies $n_i = \{3, 1, 2, 2, 1, 1\}$,
$$\bar{x} = \frac{1}{10}(3 \cdot 2 + 1 \cdot 3 + 2 \cdot 4 + 2 \cdot 5 + 1 \cdot 7 + 1 \cdot 8) = \frac{42}{10} = 4{,}2;$$
Statistical Science, Autonomous University of Barcelona
Paper nº 102386.1 Descriptive Statistics
$$\bar{x} = \sum_{i=1}^{k} f_i \cdot x_i$$
With the distinct values $\{2, 3, 4, 5, 7, 8\}$ and their relative frequencies $f_i = \{0{,}3;\ 0{,}1;\ 0{,}2;\ 0{,}2;\ 0{,}1;\ 0{,}1\}$,
$$\bar{x} = 0{,}3 \cdot 2 + 0{,}1 \cdot 3 + 0{,}2 \cdot 4 + 0{,}2 \cdot 5 + 0{,}1 \cdot 7 + 0{,}1 \cdot 8 = 4{,}2;$$
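The three ways of computing the mean (raw data, absolute frequencies, relative frequencies) should agree. A minimal sketch with the ten-value data set of the example; the names `mean_raw`, `mean_abs` and `mean_rel` are illustrative.

```python
from collections import Counter
import math

data = [2, 3, 4, 5, 8, 2, 4, 5, 7, 2]
n = len(data)
counts = Counter(data)  # maps each distinct value x_i to its frequency n_i

# Raw data: (1/n) * sum of x_i
mean_raw = sum(data) / n

# Absolute frequencies: sum of n_i * x_i, divided by n
mean_abs = sum(c * v for v, c in counts.items()) / n

# Relative frequencies: sum of f_i * x_i, with f_i = n_i / n
mean_rel = sum((c / n) * v for v, c in counts.items())

print(mean_raw, mean_abs, mean_rel)
print(math.isclose(mean_raw, mean_rel))
```

The relative-frequency version can differ from the others in the last floating-point digit, which is why the comparison uses `math.isclose` rather than `==`.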
Calculation Method,
2.1.2.1 Odd Sample Size
The data must first be sorted in ascending order,
e.g., {3, 1, 4, 2, 5, 1, 3} ordered is {1, 1, 2, 3, 3, 4, 5}; with $n = 7$ the median is the 4th value, $M = 3$.
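For an odd sample size the median is the value at position (n + 1) / 2 of the ordered data. A short sketch using the seven values of the ordered set in the example, cross-checked against `statistics.median`:

```python
from statistics import median

data = [1, 1, 2, 3, 3, 4, 5]    # the seven values of the ordered example
ordered = sorted(data)          # sorting is a no-op here, but required in general
n = len(ordered)

position = (n + 1) // 2         # 1-based position of the middle value (n is odd)
m = ordered[position - 1]       # convert to 0-based index

print(position, m)
print(m == median(data))
```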
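For data grouped into intervals, the class mark $c_i$ is the midpoint of each interval,
$$[1, 2); \quad c_1 = \frac{1+2}{2} = 1{,}5; \quad n_1 = 4$$
$$[2, 3); \quad c_2 = \frac{2+3}{2} = 2{,}5; \quad n_2 = 3$$
$$[3, 4); \quad c_3 = \frac{3+4}{2} = 3{,}5; \quad n_3 = 2$$
The cumulative frequencies are $N_i = \{4, 7, 9\}$, with $n = 9$ and $n/2 = 4{,}5$. The first interval whose cumulative frequency reaches 4,5 is $[2, 3)$, so its lower limit is $L = 2$, the cumulative frequency before it is $F = 4$, its frequency is $f = 3$ and its width is $w = 1$,
$$M = L + \left(\frac{\frac{n}{2} - F}{f}\right) w = 2 + \left(\frac{4{,}5 - 4}{3}\right) \cdot 1 \approx 2{,}17.$$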
2.1.3 Quartiles
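The quartiles divide an ordered data set into 4 equal parts. Each quartile is the first value whose cumulative relative frequency reaches the given proportion,
$$F(Q_1) \ge 0{,}25; \quad F(Q_2) \ge 0{,}5; \quad F(Q_3) \ge 0{,}75$$
As each part holds a proportion of 0,25 of the data,
the middle quartile $Q_2$ is equal to the median.
$$Q_2 = M;$$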
Calculation Methods,
2.1.3.1 Linear Interpolation
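The formulas give the position of each quartile in the ordered data; a fractional position is resolved by interpolating linearly between the two neighboring values,
$$\text{pos}(Q_1) = 0{,}25(n + 1);$$
$$\text{pos}(Q_3) = 0{,}75(n + 1);$$
e.g., {1, 2, 3, 4, 5, 6, 7, 8, 9}, $n = 9$,
$$\text{pos}(Q_1) = 0{,}25(9 + 1) = 2{,}5; \quad Q_1 = 2 + 0{,}5(3 - 2) = 2{,}5;$$
$$\text{pos}(Q_3) = 0{,}75(9 + 1) = 7{,}5; \quad Q_3 = 7 + 0{,}5(8 - 7) = 7{,}5;$$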
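The (n + 1) position rule with linear interpolation can be sketched as follows. The helper `value_at_position` is a hypothetical name introduced here; the result is cross-checked against `statistics.quantiles`, whose default "exclusive" method uses the same (n + 1) positions.

```python
from statistics import quantiles

def value_at_position(ordered, pos):
    """Value at a 1-based, possibly fractional, position of sorted data.

    Assumes 1 <= pos <= len(ordered); a fractional position is
    interpolated linearly between the two neighboring values.
    """
    lower = int(pos)       # integer part of the position
    frac = pos - lower     # fractional part
    if frac == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9])
n = len(data)

q1 = value_at_position(data, 0.25 * (n + 1))   # position 2.5
q3 = value_at_position(data, 0.75 * (n + 1))   # position 7.5

print(q1, q3)
print(quantiles(data, n=4))   # [Q1, Q2, Q3] from the standard library
```

In this data set the values coincide with their positions, so the quartile positions 2.5 and 7.5 are also the quartile values.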
2.1.3.2 Quantiles
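Quantiles generalize quartiles: they divide the data into any number of equal parts.
Percentiles divide the data into one hundred parts, each representing one percent of the observations. For grouped data,
$$P_k = L + \left(\frac{\frac{k \cdot n}{100} - F}{f}\right) w;$$
where $L$ is the lower limit of the interval containing $P_k$, $F$ the cumulative frequency before that interval, $f$ the frequency of the interval and $w$ its width.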
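The grouped-data percentile formula can be sketched as a small function. The name `grouped_percentile` and its parameters are hypothetical; the table is the one from the grouped median example (intervals [1, 2), [2, 3), [3, 4) with frequencies 4, 3, 2), and the sketch assumes every interval has a non-zero frequency.

```python
import math

def grouped_percentile(k, lower_limits, widths, freqs):
    """P_k = L + ((k*n/100 - F) / f) * w for data grouped in intervals.

    lower_limits, widths, freqs describe the intervals in ascending order.
    Assumes 0 <= k <= 100 and non-zero frequencies.
    """
    n = sum(freqs)
    target = k * n / 100          # k*n/100, the cumulative count to reach
    cumulative = 0                # F: cumulative frequency before the interval
    for lower, width, f in zip(lower_limits, widths, freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("k must be between 0 and 100")

lower_limits = [1, 2, 3]
widths = [1, 1, 1]
freqs = [4, 3, 2]

p25 = grouped_percentile(25, lower_limits, widths, freqs)
p50 = grouped_percentile(50, lower_limits, widths, freqs)   # the median
p75 = grouped_percentile(75, lower_limits, widths, freqs)

print(p25, p50, p75)
```

With k = 50 the function reduces to the grouped median: the target count 4.5 falls in [2, 3), giving 2 + (0.5 / 3), roughly 2.17.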
e.g., {1, 3, 5, 7}; $\bar{x} = v = 4$;
$$\bar{x} = \frac{1 + 3 + 5 + 7}{4} = \frac{16}{4} = 4,$$
$$(1 - 4)^2 = (-3)^2 = 9; \quad (3 - 4)^2 = (-1)^2 = 1;$$
$$(5 - 4)^2 = (+1)^2 = 1; \quad (7 - 4)^2 = (+3)^2 = 9;$$
$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{9 + 1 + 1 + 9}{4} = 5.$$
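The squared deviations of the example can be reproduced step by step. This sketch assumes the variance divides by n (the population form, consistent with the relative-frequency formula used in these notes) and cross-checks the result with `statistics.pvariance`.

```python
from statistics import pvariance

data = [1, 3, 5, 7]
n = len(data)
mean = sum(data) / n

# Squared deviation of each observation from the mean.
squared_dev = [(x - mean) ** 2 for x in data]

# Variance as the mean of the squared deviations (division by n, not n - 1).
variance = sum(squared_dev) / n

print(mean, squared_dev, variance)
print(variance == pvariance(data))
```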
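Properties of the Variance,
The variance is the minimum value of the Mean Quadratic Error, $MQE(v) = \frac{1}{n} \sum_{i=1}^{n} (x_i - v)^2$, which is attained at $v = \bar{x}$,
$$MQE(v) \ge MQE(\bar{x}) = S^2$$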
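The inequality follows from a short decomposition. The derivation below is a sketch that assumes the Mean Quadratic Error around a value v is defined as the average squared deviation from v; the cross term vanishes because the deviations from the mean sum to zero.

```latex
\begin{align*}
MQE(v) &= \frac{1}{n}\sum_{i=1}^{n}(x_i - v)^2
        = \frac{1}{n}\sum_{i=1}^{n}\bigl((x_i - \bar{x}) + (\bar{x} - v)\bigr)^2 \\
       &= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2
        + 2(\bar{x} - v)\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})
        + (\bar{x} - v)^2 \\
       &= MQE(\bar{x}) + (\bar{x} - v)^2 \;\ge\; MQE(\bar{x}).
\end{align*}
```

Equality holds only when v equals the mean, so the mean is the unique minimizer.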
Variance is expressed in squared units, so the standard deviation is often used
to interpret dispersion on the original scale of the data,
$$S = \sqrt{S^2} = \sqrt{MQE(\bar{x})} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$S^2 = \sum_{i=1}^{k} f_i (x_i - \bar{x})^2$$
$$\bar{x} = \sum_{i=1}^{n} (v_i \cdot f_i);$$
$$(70 \cdot 0{,}2) + (80 \cdot 0{,}5) + (90 \cdot 0{,}3) = 14 + 40 + 27 = 81,$$
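The relative-frequency formulas for the mean and the variance can be applied to the three values of this example. A minimal sketch; the variance and standard deviation it computes are not stated in the notes and follow only from applying the formula above to these values.

```python
import math

values = [70, 80, 90]
rel_freq = [0.2, 0.5, 0.3]   # relative frequencies f_i, summing to 1

# Mean as the frequency-weighted sum of the values.
mean = sum(f * v for v, f in zip(values, rel_freq))

# Variance as the frequency-weighted sum of squared deviations.
variance = sum(f * (v - mean) ** 2 for v, f in zip(values, rel_freq))
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)
```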
$$S^2 = \frac{44.444{,}2 + 22.221{,}6}{30} = 2.222{,}193.$$
$$\sqrt{S^2} = \sqrt{2.222{,}193} = 47{,}14$$
2.2.6.1 Covariance
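Covariance quantifies how two variables vary together. Each product of deviations keeps its sign, so the covariance is positive when the variables move in the same direction and negative when they move in opposite directions,
$$S(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y});$$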
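A short sketch of the covariance as the mean of the products of deviations. The function name `covariance` is illustrative, the data reuse x = [1, 2, 3] and y = [4, 5, 6] from the mean example, and the reversed series is an assumption added to show a negative result.

```python
import math

def covariance(x, y):
    """Population covariance: mean of the products of paired deviations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

x = [1, 2, 3]
y = [4, 5, 6]

cov_same = covariance(x, y)              # y rises with x
cov_opposite = covariance(x, [6, 5, 4])  # reversed series: y falls as x rises

print(cov_same, cov_opposite)
```

The two results have the same magnitude and opposite signs, which is the information lost if absolute deviations were used instead of signed ones.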
2.3.1 Skewness
Quantifies how lopsided the data are relative to the mean, measuring
the asymmetry of a data set.
$$a.c = \frac{1}{n} \sum \left(\frac{x_i - \bar{x}}{S}\right)^3;$$
2.3.1.1 Symmetric Skewness
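$$a.c = \frac{1}{n} \sum \left(\frac{x_i - \bar{x}}{S}\right)^3 = 0;$$
$$\bar{x} = M = m;$$
$$\{2, 3, 4, 5, 6\},\ n = 5; \quad \bar{x} = \frac{2 + 3 + 4 + 5 + 6}{5} = 4;$$
$$S = \sqrt{\frac{(2-4)^2 + (3-4)^2 + (4-4)^2 + (5-4)^2 + (6-4)^2}{5}} = \sqrt{\frac{4 + 1 + 0 + 1 + 4}{5}}; \quad S = \sqrt{2} \approx 1{,}41$$
$$a.c = \frac{1}{5} \sum \left(\frac{x_i - 4}{1{,}41}\right)^3 = \frac{1}{5}\left(\frac{2-4}{1{,}41}\right)^3 + \frac{1}{5}\left(\frac{3-4}{1{,}41}\right)^3 + \frac{1}{5}\left(\frac{4-4}{1{,}41}\right)^3 + \frac{1}{5}\left(\frac{5-4}{1{,}41}\right)^3 + \frac{1}{5}\left(\frac{6-4}{1{,}41}\right)^3 = \frac{0}{5} = 0.$$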
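2.3.1.2 Positive Right-Skewed
$$a.c = \frac{1}{n} \sum \left(\frac{x_i - \bar{x}}{S}\right)^3 > 0; \quad \bar{x} > M;$$
$$\{35, 40, 45, 50, 55, 60, 65, 70, 150, 300\},\ n = 10;$$
$$\bar{x} = \frac{35 + 40 + 45 + 50 + 55 + 60 + 65 + 70 + 150 + 300}{10} = \frac{870}{10} = 87;$$
$$M = \frac{55 + 60}{2} = \frac{115}{2} = 57{,}5$$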
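2.3.1.3 Negative Left-Skewed
$$a.c = \frac{1}{n} \sum \left(\frac{x_i - \bar{x}}{S}\right)^3 < 0; \quad \bar{x} < M;$$
$$\{1, 6, 7, 8, 9\},\ n = 5; \quad \bar{x} = \frac{1 + 6 + 7 + 8 + 9}{5} = \frac{31}{5} = 6{,}2; \quad M = 7.$$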
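The three skewness cases can be checked with one function applied to the three example data sets. This is a minimal sketch; the name `skewness` is illustrative, and S is taken as the population standard deviation (division by n), matching the symmetric worked example.

```python
import math
from statistics import median

def skewness(data):
    """Asymmetry coefficient: mean of the cubed standardized deviations."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # population S
    return sum(((x - mean) / s) ** 3 for x in data) / n

symmetric = [2, 3, 4, 5, 6]
right_skewed = [35, 40, 45, 50, 55, 60, 65, 70, 150, 300]
left_skewed = [1, 6, 7, 8, 9]

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    mean = sum(data) / len(data)
    print(name, round(skewness(data), 3), "mean:", mean, "median:", median(data))
```

In each case the sign of the coefficient matches the comparison between the mean and the median given in the notes.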