
Truncated Binomial Distribution Analysis

The document contains data from two data sets with 900 observations each. The data is analyzed to fit a truncated binomial distribution. The maximum value is 9, minimum is 2, and sample size is 900 for each data set. Using the method of moments and fixed point iteration, the value of p is estimated to be 0.764647323348295 for the first data set. A frequency polygon showing the expected frequencies from the fitted truncated binomial distribution is generated along with a column diagram of the observed frequencies from the data set.

Uploaded by

Ahel Kundu

Name- Ahel Kundu MS-EXCEL Gr-8

Roll No. STAT02 1st data set

5 7 6
6 8 6
9 7 6
6 7 6
7 8 5
6 8 6
6 6 7
7 7 8
6 7 9
7 4 8
6 6 6
8 6 8
7 5 7
8 8 8
7 7 6
5 7 7
9 6 7
7 7 7
6 4 7
8 6 8
6 9 6
6 9 9
7 6 7
8 7 8
8 7 9
6 8 7
5 7 5
9 7 8
6 6 8
8 7 7
9 8 5
6 8 8
8 8 9
9 8 6
8 5 8
6 5 7
8 7 7
7 4 5
6 8 4
7 8 6
6 7 9
9 6 9
5 6 7
6 7 8
9 7 5
8 7 4
8 7 8
6 6 7
9 8 8
8 8 7

Max value: 9
Min value: 2
Sample size(y): 900

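Now I will try to fit the truncated binomial distribution to the given data set.

We know that if X follows Bin(n,p), then the pmf of X is

$$P(X=x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, 2, \dots, n.$$

Now, if we truncate the values of X to the interval [2,n], i.e. if X' follows the {0,1}-truncated Bin(n,p) distribution, then the pmf of X' is

$$P(X'=x) = \begin{cases} \dfrac{\binom{n}{x} p^{x} (1-p)^{n-x}}{1 - \bigl(P(X=0)+P(X=1)\bigr)} & \text{for } x = 2, 3, 4, \dots, n \\ 0 & \text{otherwise.} \end{cases}$$

Now, writing q = 1 - p,

$$E(X') = \frac{np\,(1-q^{n-1})}{1 - q^{n} - npq^{n-1}}.$$

Here, from the given data we can see that n = 9.

Now, by the method of moments we get E(X') = 6.8822222222 (the mean of the data set).

To estimate the value of p we will use the fixed point iteration method.

For that we will take

$$\varphi(p) = \frac{6.882222222\,\bigl[1-(1-p)^{9}-9p(1-p)^{8}\bigr]}{9\,\bigl(1-(1-p)^{8}\bigr)}$$

such that p = φ(p).

Now, since p ∈ (0,1), we take p = 0.5 as our initial guess for estimating the value of p.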
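The fixed point function can be traced back to the mean of the truncated distribution. The lines below are a sketch of that step, not taken from the worksheet itself; the symbols $q = 1-p$ and $\bar{x}$ (the sample mean, 6.8822222222) are introduced here for convenience.

```latex
\begin{aligned}
E(X') &= \frac{\sum_{x=2}^{n} x\binom{n}{x}p^{x}q^{n-x}}{1-q^{n}-npq^{n-1}}
       = \frac{np - npq^{n-1}}{1-q^{n}-npq^{n-1}}
       = \frac{np\,(1-q^{n-1})}{1-q^{n}-npq^{n-1}}, \\[4pt]
E(X') = \bar{x}
  &\;\Longrightarrow\;
  p = \frac{\bar{x}\,\bigl(1-q^{n}-npq^{n-1}\bigr)}{n\,(1-q^{n-1})} = \varphi(p).
\end{aligned}
```

The numerator uses the fact that the full binomial mean is $np$ and that the two dropped terms contribute $0 \cdot P(X=0) + 1 \cdot P(X=1) = npq^{n-1}$.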

Table 1.1.2
p φ(p)
0.5 0.752696199443137
0.752696199443137 0.764626933909879
0.764626933909879 0.764647293994791
0.764647293994791 0.764647323306109
0.764647323306109 0.764647323348295
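The iteration in Table 1.1.2 can be reproduced outside Excel. The following is a minimal Python sketch, assuming the same φ(p), the same starting value 0.5 and the same stopping tolerance; the names `phi`, `fixed_point`, `MEAN` and `N_TRIALS` are illustrative and do not come from the worksheet.

```python
MEAN = 6.882222222   # sample mean as used in phi(p)
N_TRIALS = 9         # n of the binomial distribution


def phi(p, mean=MEAN, n=N_TRIALS):
    """Fixed point function for the {0,1}-truncated binomial mean equation."""
    q = 1.0 - p
    return mean * (1.0 - q**n - n * p * q**(n - 1)) / (n * (1.0 - q**(n - 1)))


def fixed_point(f, x0, tol=1e-10, max_iter=100):
    """Iterate x <- f(x) until two consecutive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed point iteration did not converge")


p_hat = fixed_point(phi, 0.5)
print(p_hat)
```

The stopping rule mirrors the worksheet's check that two consecutive values differ by less than 0.0000000001.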

Table 1.1.3
Observed Value (x)   Observed Frequency   Estimated Probability Mass (P(X'=x))
2                    1                    0.000841953178904
3                    8                    0.006382734155715
4                    28                   0.031105704730873
5                    84                   0.10106064717713
6                    211                  0.218893490143278
7                    255                  0.304788158885091
8                    232                  0.247559803381356
9                    81                   0.089367508347653
Total                900                  1
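The estimated probability masses of Table 1.1.3, and the expected frequencies y*P(X'=x) derived from them, follow directly from the truncated pmf. Here is a short Python sketch under the assumption that p is the last value of Table 1.1.2; the function name `truncated_binom_pmf` and the constants are illustrative only.

```python
from math import comb

N_TRIALS = 9
P_HAT = 0.764647323348295   # last value of Table 1.1.2
SAMPLE_SIZE = 900           # y


def truncated_binom_pmf(x, n, p):
    """pmf of a Bin(n, p) variable with the values 0 and 1 truncated away."""
    q = 1.0 - p
    norm = 1.0 - q**n - n * p * q**(n - 1)   # 1 - P(X=0) - P(X=1)
    if 2 <= x <= n:
        return comb(n, x) * p**x * q**(n - x) / norm
    return 0.0


probs = {x: truncated_binom_pmf(x, N_TRIALS, P_HAT) for x in range(2, N_TRIALS + 1)}
expected = {x: SAMPLE_SIZE * pr for x, pr in probs.items()}

for x in probs:
    print(x, probs[x], expected[x])
```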

Fig:1.1.2 Fitted frequency polygon (for expected frequency) for the Truncated Binomial Distribution together with a column diagram (for observed frequency) on the given dataset. Legend: column diagram (observed frequency), frequency polygon (expected frequency). Vertical axis: frequency (0 to 300); horizontal axis: observed value.
Registration Number: 19214110002
UG SEM 3

9 7 6 8 6 7
8 6 8 3 8 7
6 9 8 6 9 8
8 6 9 5 7 8
8 8 5 8 4 9
8 6 4 6 7 8
8 8 5 9 9 7
6 8 7 7 8 6
8 9 6 8 6 7
7 5 6 6 6 8
4 8 8 7 6 5
8 8 6 5 6 8
8 8 6 7 8 6
5 8 5 8 7 2
4 8 5 5 6 6
7 8 7 5 7 5
5 8 7 5 7 5
6 6 8 7 6 8
7 7 5 8 9 6
8 6 6 9 7 7
7 6 7 9 9 5
6 9 6 9 8 8
9 6 9 4 6 6
8 7 8 5 7 6
6 7 4 7 8 9
6 7 8 3 4 7
6 8 6 7 7 8
6 7 7 7 8 9
8 6 7 9 7 7
5 8 8 6 7 8
8 8 6 9 8 7
6 5 5 8 7 7
7 6 7 8 7 6
7 6 6 6 7 7
6 7 6 8 7 8
7 6 8 7 9 5
6 9 6 3 3 8
8 6 6 5 7 8
5 7 7 6 5 5
3 7 4 8 5 4
7 5 4 5 8 7
6 5 7 5 5 7
9 8 6 6 8 7
8 7 8 7 6 7
8 7 6 4 6 5
7 6 8 7 8 6
6 5 6 7 8 8
8 5 5 5 8 8
4 8 8 8 9 8
7 6 8 7 8 7

Table 1.1.1
Frequency distribution of 900 observations
Observation (x_i)   Frequency (f_i)   x_i*f_i   x_i^2*f_i
2                   1                 2         4
3                   8                 24        72
4                   28                112       448
5                   84                420       2100
6                   211               1266      7596
7                   255               1785      12495
8                   232               1856      14848
9                   81                729       6561
Total               900               6194      44124

Here p ∈ [0,1] is the success probability and n ∈ {0,1,2,…} is the number of trials of Bin(n,p), whose pmf is $P(X=x) = \binom{n}{x} p^{x} (1-p)^{n-x}$ for x = 0,1,2,…,n; the sample mean used in the method of moments is 6.8822222222.

In the last step of Table 1.1.2 we observe that |p − φ(p)| < 0.0000000001, i.e. both are almost the same values.

Hence we can approximate the value of p by p' = 0.764647323

Table 1.1.3 (continued)
Observed Value (x)   Expected Frequency (y*P(X'=x))
2                    0.757757861013384
3                    5.74446074014366
4                    27.9951342577853
5                    90.9545824594173
6                    197.00414112895
7                    274.309342996582
8                    222.803823043221
9                    80.430757512888
Total                900

Fig:1.1.1 Scatter diagram showing the estimated probability mass against the observed values of the given dataset for the fitted Truncated Binomial Distribution. Vertical axis: estimated probability mass (0 to 0.35); horizontal axis: observed values.
7 8 8 4 8 7 8 7
7 7 6 6 6 7 6 7
9 7 8 7 8 8 7 5
6 7 9 5 6 6 3 7
8 7 6 8 7 7 4 4
8 8 5 6 6 7 5 6
6 7 6 8 6 8 8 7
7 6 6 7 6 7 7 7
9 8 6 7 7 8 9 7
7 4 8 7 7 7 6 7
8 7 8 8 8 6 6 9
8 7 6 7 6 5 7 8
8 6 7 9 7 6 9 6
6 7 6 7 7 7 7 5
9 7 8 9 8 7 7 7
4 5 6 7 6 8 6 6
6 8 7 8 7 7 6 8
6 7 7 9 9 7 7 8
8 5 7 7 7 6 7 7
8 8 8 8 7 7 6 8
5 6 5 5 7 7 9 7
8 9 7 6 6 8 7 8
8 6 8 7 7 9 8 6
9 8 8 7 6 8 9 7
8 6 8 7 8 8 5 8
9 6 9 7 8 7 7 6
7 5 7 6 7 6 7 8
8 8 8 9 6 7 8 8
7 6 9 7 5 6 9 9
6 9 7 6 6 6 7 8
7 8 8 7 6 7 8 5
4 6 7 8 8 7 7 8
9 8 8 5 8 6 7 5
8 7 6 7 8 7 5 6
7 6 6 9 9 8 8 5
6 8 6 6 5 6 7 7
6 7 6 6 6 8 8 6
7 8 7 7 9 3 7 7
9 7 8 6 9 5 6 9
6 7 9 8 7 6 7 6
4 4 8 8 7 6 5 7
3 8 6 8 6 8 8 8
7 7 6 8 8 8 8 6
8 5 8 6 9 8 6 4
8 6 8 6 6 5 9 6
6 9 7 8 8 7 7 7
4 9 6 6 7 5 7 7
5 6 6 8 7 5 7 8
7 6 4 9 5 9 7 7
7 7 6 6 5 9 7 5

Mean : 6.88222222
Variance: 1.66168395
Standard Deviation : 1.28906321
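The mean, variance and standard deviation above can be recomputed from the frequency distribution of Table 1.1.1. The sketch below assumes the variance uses the divisor 900 (not 899), which is the form that reproduces the reported 1.66168395; the variable names are illustrative.

```python
import math

# Frequency distribution from Table 1.1.1: observed value -> frequency
freq = {2: 1, 3: 8, 4: 28, 5: 84, 6: 211, 7: 255, 8: 232, 9: 81}

n_obs = sum(freq.values())                          # total frequency
sum_xf = sum(x * f for x, f in freq.items())        # sum of x_i * f_i
sum_x2f = sum(x * x * f for x, f in freq.items())   # sum of x_i^2 * f_i

mean = sum_xf / n_obs
variance = sum_x2f / n_obs - mean**2   # divisor n_obs, as assumed above
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)
```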

6
8
6
9
7
6
7
8
5
6
8
7
7
8
7
9
7
8
8
7
8
7
6
7
6
6
6
8
8
5
5
9
6
8
7
8
7
7
7
8
8
7
6
5
5
7
7
8
8
6

Comment: From fig:1.1.2 we can see that the frequency polygon of the truncated binomial distribution fits the column diagram of the given data very well, i.e. we can assume that our given data has been taken from a {0,1}-truncated binomial (n=9, p=0.764647) probability distribution (or population).
Now we will try and fit a Truncated Poisson distribution to the given dataset

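We know that if X follows Poi(λ), then the pmf of X is:

$$P(X=x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{x}}{x!} & \text{for } x = 0, 1, 2, 3, \dots \\ 0 & \text{otherwise.} \end{cases}$$

Now, if we truncate the values of X to the interval [2,9], i.e. if Z follows the Truncated Poi(λ) distribution over the interval [2,9], then the pmf of Z is

$$P(Z=z) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{z}/z!}{\sum_{i=2}^{9} e^{-\lambda}\lambda^{i}/i!} & \text{for } z = 2, 3, \dots, 9 \\ 0 & \text{otherwise.} \end{cases}$$

Now we know that, after cancelling the common factor $e^{-\lambda}$,

$$E(Z) = \lambda + \frac{\lambda^{2}\,(1-\lambda^{8}/9!)}{\sum_{i=2}^{9} \lambda^{i}/i!}.$$

Now by the method of moments we get

E(Z) = 6.8822222222

To estimate the value of λ we will use the fixed point iteration method.

For that we will take

$$\varphi(\lambda) = 6.88222222 - \frac{\lambda^{2}\,(1-\lambda^{8}/9!)}{\sum_{i=2}^{9} \lambda^{i}/i!}$$

such that λ = φ(λ).

Now, since λ > 0, we can take any positive real number as our initial guess for estimating the value of λ.
But we know that if X follows Poi(λ), then E(X) = λ.
Hence, although Z follows a truncated Poisson distribution, we can take λ = 6.882 (= E(Z)) as a good initial guess for estimating the value of λ.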
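The mean of the truncated Poisson variable, and the fixed point function built from it, can be obtained as sketched below. This derivation is an added illustration; the shorthand $S(\lambda)$ and the symbol $\bar{z}$ (the sample mean, 6.8822222222) are introduced here and do not appear in the worksheet.

```latex
\begin{aligned}
S(\lambda) &= \sum_{i=2}^{9} \frac{\lambda^{i}}{i!}, \\[4pt]
E(Z) &= \frac{1}{S(\lambda)} \sum_{z=2}^{9} z\,\frac{\lambda^{z}}{z!}
      = \frac{\lambda}{S(\lambda)} \sum_{j=1}^{8} \frac{\lambda^{j}}{j!}
      = \frac{\lambda}{S(\lambda)} \Bigl( S(\lambda) + \lambda - \frac{\lambda^{9}}{9!} \Bigr)
      = \lambda + \frac{\lambda^{2}\,(1-\lambda^{8}/9!)}{S(\lambda)}, \\[4pt]
E(Z) = \bar{z}
  &\;\Longrightarrow\;
  \lambda = \bar{z} - \frac{\lambda^{2}\,(1-\lambda^{8}/9!)}{S(\lambda)} = \varphi(\lambda).
\end{aligned}
```

With λ = 6.882 this expression gives φ(λ) ≈ 7.63178, which agrees with the first row of Table 1.2.1.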

Table 1.2.1
λ φ(λ)
6.882 7.63178372566
7.6317837257 8.02793303617
8.0279330362 8.25996958845
8.2599695885 8.40264808263
8.4026480826 8.49274024983
8.4927402498 8.55052328989
8.5505232899 8.58794130264
8.5879413026 8.61231864607
8.6123186461 8.62826173685
8.6282617369 8.63871484011
8.6387148401 8.64557959595
8.6455795959 8.65009261411
8.6500926141 8.65306162584
8.6530616258 8.65501576534
8.6550157653 8.65630232488
8.6563023249 8.65714953312
8.6571495331 8.65770749821
8.6577074982 8.65807500141
8.6580750014 8.65831707072
8.6583170707 8.65847652439
8.6584765244 8.65858156081
8.6585815608 8.65865075224
8.6586507522 8.65869633172
8.6586963317 8.65872635716
8.6587263572 8.65874613648
8.6587461365 8.6587591662
8.6587591662 8.65876774959
8.6587677496 8.65877340396
8.658773404 8.65877712881
8.6587771288 8.65877958259
8.6587795826 8.65878119903
8.658781199 8.65878226387
8.6587822639 8.65878296535
8.6587829653 8.65878342745
8.6587834274 8.65878373186
8.6587837319 8.6587839324
8.6587839324 8.6587840645
8.6587840645 8.65878415152
8.6587841515 8.65878420885
8.6587842089 8.65878424662
8.6587842466 8.6587842715
8.6587842715 8.65878428788
8.6587842879 8.65878429868
8.6587842987 8.65878430579
8.6587843058 8.65878431048
8.6587843105 8.65878431356
8.6587843136 8.6587843156
8.6587843156 8.65878431694
8.6587843169 8.65878431782
8.6587843178 8.6587843184

Here, from the last step of Table 1.2.1 we can observe that |φ(λ) − λ| < 0.000000001, i.e. two consecutive guesses are almost the same.

Hence, we can approximate the value of λ as, λ̂ = 8.658783
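The slow convergence seen in Table 1.2.1 can be reproduced with a few lines of Python. This is a sketch that assumes φ(λ) = 6.8822222222 − λ²(1 − λ⁸/9!)/Σ λⁱ/i! with the sum over i = 2,…,9, which is the form that matches the tabulated values; the names `phi_lambda`, `fixed_point` and `MEAN` are illustrative.

```python
from math import factorial

MEAN = 6.8822222222   # sample mean of the data set


def phi_lambda(lam):
    """Fixed point function for the Poisson mean equation, truncated to [2, 9]."""
    s = sum(lam**i / factorial(i) for i in range(2, 10))
    return MEAN - lam**2 * (1.0 - lam**8 / factorial(9)) / s


def fixed_point(f, x0, tol=1e-9, max_iter=500):
    """Iterate x <- f(x); return the final value and the number of steps used."""
    x = x0
    for step in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, step
        x = x_next
    raise RuntimeError("fixed point iteration did not converge")


lam_hat, steps = fixed_point(phi_lambda, 6.882)
print(lam_hat, steps)
```

Compared with the binomial case, far more steps are needed here because each step shrinks the remaining error only moderately.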

Fig:1.2.2 Fitted frequency polygon (for expected frequency) for the Truncated Poisson Distribution together with a column diagram (for observed frequency) on the given dataset. Legend: column diagram (observed frequency), frequency polygon (expected frequency). Vertical axis: frequency (0 to 300); horizontal axis: observed values of the given dataset.

Table 1.2.2
Observed Value (z)   Observed Frequency   (λ^z)/z!       Estimated Probability Mass (P(Z=z))   Expected Frequency (y*P(Z=z))
2                    1                    37.4873        0.010319884618221                     9.28789615639847
3                    8                    108.19802092   0.02978588049807                      26.8072924482628
4                    28                   234.21579605   0.06447736892418                      58.0296320317616
5                    84                   405.60475064   0.111659109185083                     100.493198266575
6                    211                  585.34058659   0.161138666067823                     145.024799461041
7                    255                  724.04816005   0.199323534627249                     179.391181164524
8                    232                  783.67198743   0.215737404141292                     194.163663727163
9                    81                   753.96063137   0.207558151938083                     186.802336744275
Total                900                  3632.5272      1                                     900
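The columns of Table 1.2.2 can be rebuilt from the estimate of λ alone. The following Python sketch assumes λ̂ = 8.658783 and a sample size of 900; since the worksheet may have carried more digits of λ, the last decimals can differ slightly. Variable names are illustrative.

```python
from math import factorial

LAM_HAT = 8.658783    # estimate of lambda from the fixed point iteration
SAMPLE_SIZE = 900

# Unnormalised weights lambda^z / z! for z = 2, ..., 9 (e^-lambda cancels out)
weights = {z: LAM_HAT**z / factorial(z) for z in range(2, 10)}
total = sum(weights.values())

probs = {z: w / total for z, w in weights.items()}          # P(Z = z)
expected = {z: SAMPLE_SIZE * p for z, p in probs.items()}   # expected frequency

for z in weights:
    print(z, weights[z], probs[z], expected[z])
```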



Fig:1.2.1 Scatter diagram showing the estimated probability mass against the observed values of the given dataset for the fitted Truncated Poisson Distribution. Vertical axis: estimated probability mass (0 to 0.25); horizontal axis: observed value of data set. The plotted points are the estimated probability masses P(Z=z), z = 2,…,9, of Table 1.2.2.

Comment: From fig:1.2.2 we can see that the frequency polygon of the truncated Poisson distribution does not fit the column diagram of the given data, i.e. it is clear that our given data has not been taken from a truncated Poisson population.
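The two comments rest on a visual comparison of the figures. One simple way to put a number on that impression is the total absolute deviation between observed and expected frequencies; this measure is an added illustration and is not used in the worksheet. The expected frequencies below are copied from the binomial and Poisson tables, and the function name is hypothetical.

```python
observed = [1, 8, 28, 84, 211, 255, 232, 81]   # x = 2, ..., 9

expected_binomial = [
    0.757757861013384, 5.74446074014366, 27.9951342577853, 90.9545824594173,
    197.00414112895, 274.309342996582, 222.803823043221, 80.430757512888,
]
expected_poisson = [
    9.28789615639847, 26.8072924482628, 58.0296320317616, 100.493198266575,
    145.024799461041, 179.391181164524, 194.163663727163, 186.802336744275,
]


def total_abs_deviation(obs, exp):
    """Sum of |observed - expected| over all classes."""
    return sum(abs(o - e) for o, e in zip(obs, exp))


dev_binomial = total_abs_deviation(observed, expected_binomial)
dev_poisson = total_abs_deviation(observed, expected_poisson)
print(dev_binomial, dev_poisson)
```

A much smaller total for the truncated binomial fit than for the truncated Poisson fit is in line with what the two figures show.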
