Generalized_Unit_Weibull (7)
Mintodê Nicodème Atchadé^{a,1,∗}, Géoffroy Bill Zoffoun^{a}, Mahoulé Jude Bogninou^{a}, Théophile Otodji^{a}, Aliou Moussa Djibril^{a}
a National Higher School of Mathematical Engineering and Modeling, National University of Sciences, Technologies, Engineering and
Benin
Abstract
This study introduces the Generalized Unit Weibull (GUW) distribution, an extension of the Unit Weibull distri-
bution achieved through transformation and the inclusion of additional parameters. We explore key theoretical
properties of this novel distribution, including stochastic functions, quantile functions and measures, moments,
and Rényi entropy. The model’s unknown parameters are estimated using the maximum likelihood method. To
demonstrate its applicability, we compare the proposed model with existing alternatives using two real-world data
sets, particularly in actuarial science and insurance.
Keywords: Generalized Weibull distribution; Moments; Rényi Entropy; Statistical properties; Quantile regression
∗ Corresponding author
modeling bounded data on the unit interval, particularly in fields such as reliability analysis, survival modeling, and proportion data. We demonstrate the high degree of adaptability of the distribution to real-world data using two applications: materials engineering and finance. The Weibull distribution is widely used because of its advantageous attributes, such as the mathematical simplicity and flexibility of its probability functions.

The remaining sections of the article are organized as follows: Section (2) presents a description of the Generalized Unit Weibull (GUW) distribution. Section (3) addresses some noteworthy characteristics. Sections (4) and (5) provide the methodology for the actuarial measures and the estimation of the distribution parameters. Sections (6), (7), and (8) are devoted to the new quantile regression model, the simulations, and the applications, in that order. Finally, the conclusion is given in Section (9).

2. Generalized Unit Weibull distribution

We propose a new generalized distribution with support on the unit interval (0, 1), which arises from a transformation of the two-parameter Weibull distribution [28] with probability density function (PDF):

$$h(t,\lambda,k) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1} e^{-\left(\frac{t}{\lambda}\right)^{k}}, \quad t > 0,$$

and cumulative distribution function (CDF):

$$H(t,\lambda,k) = 1 - e^{-\left(\frac{t}{\lambda}\right)^{k}},$$

where $k > 0$ is the shape parameter and $\lambda > 0$ is the scale parameter.

Using the transformation $y = t^{1/\alpha}$ ($t = y^{\alpha}$) and then the transformation $x = \frac{y}{y+\beta}$ ($y = \frac{\beta x}{1-x}$), we obtain a new generalized distribution on (0, 1), which we call the GUW distribution. Its CDF is given by:

$$F(x,\alpha,\beta,\lambda,k) = 1 - e^{-\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}}, \quad x \in\, ]0;1[ \tag{1}$$

The related PDF and hazard rate function (hrf) are given by:

$$f(x,\alpha,\beta,\lambda,k) = \frac{\alpha k \beta}{\lambda^{k}(1-x)^{2}}\left(\frac{\beta x}{1-x}\right)^{\alpha k-1} e^{-\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}}, \tag{2}$$

and

$$hrf = \frac{\alpha k \beta}{\lambda^{k}(1-x)^{2}}\left(\frac{\beta x}{1-x}\right)^{\alpha k-1},$$

where $\alpha, \beta, \lambda, k > 0$.

Figures (1), (2), and (3) show the CDF, PDF, and HRF of the GUW distribution, respectively. Figure (1) illustrates the flexibility of the cumulative function across different parameter settings. Figure (2) shows that the PDF can take various shapes, including decreasing, reversed-J, or asymmetric. Figure (3) highlights the wide range of possible hazard rate behaviors, such as increasing, decreasing, or bell-shaped. This observation is consistent with prior findings in the literature. These shape characteristics are well understood and are important for developing widely applicable statistical models.
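As an illustration of equations (1) and (2), the following minimal Python sketch evaluates the CDF, PDF, and hazard rate of the GUW distribution. The function names (`guw_cdf`, `guw_pdf`, `guw_hrf`) and the parameter values are assumptions chosen for illustration only; they do not come from the paper.

```python
import math


def guw_cdf(x, alpha, beta, lam, k):
    """CDF of equation (1), for 0 < x < 1."""
    z = ((beta * x / (1.0 - x)) ** alpha / lam) ** k
    return 1.0 - math.exp(-z)


def guw_hrf(x, alpha, beta, lam, k):
    """Hazard rate function of the GUW distribution."""
    odds = beta * x / (1.0 - x)
    return alpha * k * beta / (lam ** k * (1.0 - x) ** 2) * odds ** (alpha * k - 1.0)


def guw_pdf(x, alpha, beta, lam, k):
    """PDF of equation (2), written as hazard rate times survival function."""
    return guw_hrf(x, alpha, beta, lam, k) * (1.0 - guw_cdf(x, alpha, beta, lam, k))


if __name__ == "__main__":
    # Illustrative parameter values (not taken from the paper).
    params = dict(alpha=2.0, beta=1.0, lam=1.0, k=1.5)
    for x in (0.1, 0.5, 0.9):
        print(x, guw_cdf(x, **params), guw_pdf(x, **params), guw_hrf(x, **params))
```

Writing the PDF as the product of the hazard rate and the survival function mirrors the relation hrf = f / (1 − F) and avoids duplicating the exponential term.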
3 SOME MATHEMATICAL FEATURES OF THE GUW DISTRIBUTION
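where

\begin{align}
T_n &= \binom{k+1}{k}\binom{i+j-1}{j}\binom{\alpha mk-1}{i}\left(\frac{1}{\lambda}\right)^{mk} \tag{4}\\
&\quad\times \frac{[-1]^{m+\alpha mk+i}}{m!}\,\alpha mk\,\beta^{\alpha mk}. \tag{5}
\end{align}

$$F(x,\sigma) = 1 - e^{-[G(x,\sigma)]},$$

knowing that $e^{z} = \sum_{m=0}^{+\infty} \frac{z^{m}}{m!}$.

So,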
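$$F(x,\sigma) = 1 - \sum_{m=0}^{+\infty} \frac{[-G(x,\sigma)]^{m}}{m!} = 1 - \sum_{m=0}^{+\infty} \frac{[-1]^{m}}{m!}\,[G(x,\sigma)]^{m}.$$

Let us develop $[G(x,\sigma)]^{m}$:

$$[G(x,\sigma)]^{m} = \left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{mk} = \left(\frac{1}{\lambda}\right)^{mk}\beta^{\alpha mk}\left(\frac{x}{1-x}\right)^{\alpha mk},$$

so that

\begin{align}
F(x,\sigma) &= 1 - \sum_{m=0}^{+\infty} \frac{[-1]^{m}}{m!}\left(\frac{1}{\lambda}\right)^{mk} \tag{7}\\
&\quad\times \beta^{\alpha mk}\left(\frac{x}{1-x}\right)^{\alpha mk}. \tag{8}
\end{align}

By differentiating expression (7) with respect to $x$, we obtain the series expansion of $f(x)$:

$$f(x,\sigma) = -\sum_{m=0}^{+\infty} \frac{[-1]^{m}}{m!}\left(\frac{1}{\lambda}\right)^{mk}\alpha mk\,\beta^{\alpha mk}\,\frac{1}{(1-x)^{2}}\left(\frac{x}{1-x}\right)^{\alpha mk-1}.$$

Using the expansion

$$\frac{1}{(1-x)^{2}} = \sum_{k=0}^{\infty}\binom{k+1}{k}x^{k},$$

and, more explicitly,

$$\left(\frac{x}{1-x}\right)^{\alpha mk-1} = \sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty} (-1)^{\alpha mk-1}\binom{i+j-1}{j}\binom{\alpha mk-1}{i}(-1)^{i}x^{j}.$$

3.2. Rényi Entropy

The Rényi entropy for the distribution is defined as:

$$ER(X) = \frac{1}{1-\gamma}\log\left\{\sum_{n=0}^{+\infty} T'_n \cdot I_\gamma(x,\sigma)\right\},$$

where

$$T'_n = \frac{[-\gamma]^{n}}{n!}\left(\frac{1}{\lambda}\right)^{nk+k\gamma}\beta^{\alpha nk+\gamma\alpha k}(\alpha k)^{\gamma},$$

and

$$I_\gamma(x,\sigma) = \int_{\mathbb{R}} \left(\frac{1}{(1-x)^{2}}\right)^{\gamma}\left(\frac{x}{1-x}\right)^{\gamma\alpha k-\gamma+\alpha nk} dx.$$

Proof:
The Rényi entropy of $X$ in the case of a continuous random variable is defined by:

$$ER(X) = \frac{1}{1-\gamma}\log\int_{\mathbb{R}} f(x,\sigma)^{\gamma}\,dx, \quad \gamma \neq 1,\ \gamma > 0.$$

Considering (2),

$$(f(x,\sigma))^{\gamma} = (\alpha k)^{\gamma}\left(\frac{1}{\lambda}\right)^{k\gamma}\beta^{\gamma}\left(\frac{1}{(1-x)^{2}}\right)^{\gamma}\left(\frac{\beta x}{1-x}\right)^{\gamma\alpha k-\gamma} \times e^{-\gamma\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}}.$$

The terms $(\alpha k)^{\gamma}$, $\left(\frac{1}{\lambda}\right)^{k\gamma}$, and $\beta^{\gamma}$ being constant, their series developments remain unchanged. For the exponential term we have

$$e^{-\gamma\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}} = \sum_{n=0}^{+\infty}\frac{[-\gamma]^{n}}{n!}\left(\frac{1}{\lambda}\right)^{nk}\beta^{\alpha nk}\left(\frac{x}{1-x}\right)^{\alpha nk}.$$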
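Thus the series expansion of $(f(x,\sigma))^{\gamma}$ gives:

$$(f(x,\sigma))^{\gamma} = \sum_{n=0}^{+\infty}\frac{[-\gamma]^{n}}{n!}\left(\frac{1}{\lambda}\right)^{nk+k\gamma}\beta^{\alpha nk+\gamma\alpha k}(\alpha k)^{\gamma}\left(\frac{1}{(1-x)^{2}}\right)^{\gamma}\left(\frac{x}{1-x}\right)^{\gamma\alpha k-\gamma+\alpha nk}. \tag{9}$$

Setting:

$$T'_n = \frac{[-\gamma]^{n}}{n!}\left(\frac{1}{\lambda}\right)^{nk+k\gamma}\beta^{\alpha nk+\gamma\alpha k}(\alpha k)^{\gamma},$$

$$I_\gamma(x,\sigma) = \int_{\mathbb{R}} \left(\frac{1}{(1-x)^{2}}\right)^{\gamma}\left(\frac{x}{1-x}\right)^{\gamma\alpha k-\gamma+\alpha nk} dx,$$

we have

$$ER(X) = \frac{1}{1-\gamma}\log\left\{\sum_{n=0}^{+\infty} T'_n \cdot I_\gamma(x,\sigma)\right\}.$$

3.3. Moments and associated measures

The moment of order $s$ of the GUW distribution is:

$$M_s = \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}\frac{T_n}{s+j+k+1}. \tag{10}$$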
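The series expressions for the Rényi entropy and the moments can be cross-checked numerically. The sketch below is one possible check, not the authors' procedure: it integrates the PDF of equation (2) directly with a midpoint rule instead of summing the series. The helper names (`integrate01`, `raw_moment`, `renyi_entropy`) and the parameter values are assumptions made for illustration.

```python
import math


def guw_pdf(x, alpha, beta, lam, k):
    """PDF of equation (2), for 0 < x < 1."""
    odds = beta * x / (1.0 - x)
    dens = alpha * k * beta / (lam ** k * (1.0 - x) ** 2) * odds ** (alpha * k - 1.0)
    return dens * math.exp(-((odds ** alpha / lam) ** k))


def integrate01(fn, n=20000):
    """Midpoint rule on (0, 1); never evaluates fn at the endpoints."""
    return sum(fn((i + 0.5) / n) for i in range(n)) / n


def raw_moment(s, **p):
    """Numerical approximation of E(X^s)."""
    return integrate01(lambda x: x ** s * guw_pdf(x, **p))


def renyi_entropy(gamma, **p):
    """Numerical Rényi entropy of order gamma (gamma > 0, gamma != 1)."""
    if gamma <= 0.0 or gamma == 1.0:
        raise ValueError("gamma must be positive and different from 1")
    return math.log(integrate01(lambda x: guw_pdf(x, **p) ** gamma)) / (1.0 - gamma)


if __name__ == "__main__":
    # Illustrative parameter values (not taken from the paper).
    p = dict(alpha=2.0, beta=1.0, lam=1.0, k=1.5)
    mean = raw_moment(1, **p)
    variance = raw_moment(2, **p) - mean ** 2
    print("mean:", mean, "variance:", variance)
    print("Renyi entropy (gamma = 2):", renyi_entropy(2.0, **p))
```

The midpoint rule is adequate here because the chosen parameters satisfy αk > 1, so the density is bounded on (0, 1); for αk < 1 the density is unbounded near 0 and a more careful quadrature would be needed.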
Figure 4: Mean and variance of the GUW model with λ = 1 and k = 0.09
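3.4. Moment Generating Function

Proposition:
The moment-generating function (MGF), when it exists, describes the distribution of a random variable in terms of its moments. It should not be confused with the characteristic function $E\left(e^{itX}\right)$, which is a related but distinct transform. The MGF of the GUW distribution is given by:

$$M_X(t) = \sum_{r=0}^{+\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}\frac{t^{r}}{r!}\,\frac{T_n}{r+j+k+1}.$$

Proof:

$$M_X(t) = E\left(e^{Xt}\right).$$

Knowing that the series expansion of the exponential gives:

$$e^{tx} = \sum_{r=0}^{+\infty}\frac{(tx)^{r}}{r!},$$

we can write:

$$M_X(t) = E\left(\sum_{r=0}^{+\infty}\frac{(tX)^{r}}{r!}\right) = \sum_{r=0}^{+\infty}E\left(\frac{(tX)^{r}}{r!}\right),$$

$$M_X(t) = \sum_{r=0}^{+\infty}\frac{t^{r}}{r!}E(X^{r}). \tag{11}$$

The moment of order $r$ of the distribution is represented by $E(X^{r})$. By replacing (10) in (11) we have:

$$M_X(t) = \sum_{r=0}^{+\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}\frac{t^{r}}{r!}\,\frac{T_n}{r+j+k+1}.$$

3.5. Quantile function

Proposition:
We define the quantile function which corresponds to the distribution as follows:

$$Q(p,\sigma) = \frac{\lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}}{\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}}.$$

Proof:
The quantile of order $t$, denoted $\pi_t$, is the solution $x$ of the following nonlinear equation:

$$t = F(x,\sigma).$$

So,

$$1 - t = e^{-\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}}.$$

Applying a log transformation to each side of the equation, we obtain:

$$-\ln(1-t) = \left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}. \tag{12}$$

Raising each side of equation (12) to the power $1/k$ and multiplying by $\lambda$ gives:

$$\lambda[-\ln(1-t)]^{\frac{1}{k}} = \left(\frac{\beta x}{1-x}\right)^{\alpha}. \tag{13}$$

Raising each side of equation (13) to the power $1/\alpha$ gives $\lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}} = \frac{\beta x}{1-x}$, and multiplying by $(1-x)$ yields:

$$\beta x = \lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}} - x\,\lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}}.$$

Gathering the terms containing $x$ on one side:

$$\beta x + x\,\lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}} = \lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}},$$

$$\left(\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}}\right)\cdot x = \lambda^{\frac{1}{\alpha}}[-\ln(1-t)]^{\frac{1}{\alpha k}}.$$

Since $\alpha$, $\lambda$, and $\beta$ are strictly greater than 0, the factor multiplying $x$ is positive, and writing $p$ for $t$ we have:

$$\pi = \frac{\lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}}{\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}}.$$

The 25%, 50%, and 75% quartiles of the GUW distribution may be found by setting $p = 0.25$, $p = 0.5$, and $p = 0.75$, respectively, in the quantile function above.

Assume that $y_i$ is uniformly distributed on (0, 1). In this case, random data sets of size $n$ can be generated from the GUW distribution using the quantile function:

$$\pi_i = \frac{\lambda^{\frac{1}{\alpha}}[-\ln(1-y_i)]^{\frac{1}{\alpha k}}}{\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-y_i)]^{\frac{1}{\alpha k}}}, \quad i = 1, 2, \ldots, n.$$

Graphs of the Bowley skewness and Moors kurtosis are shown in Figure (6).

Figure 6: (a) Plot of Bowley's coefficient of skewness; (b) Plot of Moors' coefficient of kurtosis.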
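The quantile function lends itself to inverse-transform sampling and to quantile-based shape measures. The following sketch is illustrative only: the function names and parameter values are assumptions, and the Bowley and Moors coefficients are computed from their standard quartile and octile definitions, which are not restated in this section.

```python
import math
import random


def guw_quantile(p, alpha, beta, lam, k):
    """Quantile function Q(p, sigma) of the GUW distribution, 0 <= p < 1."""
    g = lam ** (1.0 / alpha) * (-math.log(1.0 - p)) ** (1.0 / (alpha * k))
    return g / (beta + g)


def guw_cdf(x, alpha, beta, lam, k):
    """CDF of equation (1), used here to check the inversion."""
    return 1.0 - math.exp(-(((beta * x / (1.0 - x)) ** alpha / lam) ** k))


def guw_sample(n, alpha, beta, lam, k, seed=0):
    """Draw n values by applying the quantile function to uniform draws."""
    rng = random.Random(seed)
    return [guw_quantile(rng.random(), alpha, beta, lam, k) for _ in range(n)]


def bowley_skewness(**p):
    """Standard Bowley skewness: (Q3 + Q1 - 2 Q2) / (Q3 - Q1)."""
    q1, q2, q3 = (guw_quantile(u, **p) for u in (0.25, 0.5, 0.75))
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(**p):
    """Standard Moors kurtosis from octiles: (E7 - E5 + E3 - E1) / (E6 - E2)."""
    e = [guw_quantile(i / 8.0, **p) for i in range(1, 8)]  # e[0] = E1, ..., e[6] = E7
    return ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])


if __name__ == "__main__":
    # Illustrative parameter values (not taken from the paper).
    p = dict(alpha=2.0, beta=1.0, lam=1.0, k=1.5)
    print("quartiles:", [guw_quantile(u, **p) for u in (0.25, 0.5, 0.75)])
    print("first draws:", guw_sample(5, seed=1, **p))
    print("Bowley:", bowley_skewness(**p), "Moors:", moors_kurtosis(**p))
```

A seeded `random.Random` instance is used so that the simulated sample is reproducible.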
3.6. Survival function

The GUW distribution is characterized by its survival function, which is as described below:

$$suf(x) = e^{-\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}}.$$

3.7. Hazard function

The GUW distribution's hazard function may be described as follows:

$$haf(x) = \frac{f(x)}{suf(x)},$$

$$haf(x) = \frac{\alpha k \beta}{\lambda^{k}(1-x)^{2}}\left(\frac{\beta x}{1-x}\right)^{\alpha k-1}.$$

3.8. Cumulative hazard function (cf)

The GUW distribution is characterized by its cf, which is defined as follows:

$$cf(x) = -\log(suf(x)).$$

So, the cf of the GUW distribution is as follows:

$$cf(x) = \left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}.$$
3.9. Reverse hazard function (Rf)

The GUW distribution is characterized by its Rf:

$$Rf(x) = \frac{f(x)}{F(x)}.$$

Setting

$$A(x) = e^{-\left[\frac{\left(\frac{\beta x}{1-x}\right)^{\alpha}}{\lambda}\right]^{k}},$$

we obtain

$$Rf(x) = \frac{\alpha k \beta}{\lambda^{k}(1-x)^{2}}\left(\frac{\beta x}{1-x}\right)^{\alpha k-1}\frac{A(x)}{1-A(x)}.$$

3.10. Average absolute deviation

The average absolute deviation indicates how far, on average, each piece of data in a set is from the mean of that set. If we consider a GUW distribution with a mean of $\mu$, the average absolute deviation is calculated below:

$$mad(\mu) = E(|X-\mu|) \tag{14}$$
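\begin{align*}
mad(\mu) &= \int_{0}^{1}|x-\mu|\,f(x)\,dx\\
&= \int_{0}^{\mu}(-x+\mu)f(x)\,dx + \int_{\mu}^{1}(x-\mu)f(x)\,dx\\
&= \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\left[\int_{0}^{\mu}\left(-x^{k+j+1}+\mu x^{k+j}\right)dx + \int_{\mu}^{1}\left(x^{k+j+1}-\mu x^{k+j}\right)dx\right]\\
&= \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\left[\frac{1-2\mu^{k+j+2}}{k+j+2}+\frac{2\mu^{k+j+2}-\mu}{k+j+1}\right]\\
&= \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\,M_\mu(\sigma),
\end{align*}

where

$$M_\mu(\sigma) = \frac{1-2\mu^{k+j+2}}{k+j+2}+\frac{2\mu^{k+j+2}-\mu}{k+j+1}.$$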
3.11. Median absolute deviation (MD)

It is comparable to the MAD, except that instead of using the mean as a reference point, we use the median. If we have a GUW distribution with a median of $me$, the MD may be stated as follows:

\begin{align}
MD(me) &= E(|X-me|) \tag{15}\\
&= \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\times\left[\frac{1-2\,me^{k+j+2}}{k+j+2}+\frac{2\,me^{k+j+2}-me}{k+j+1}\right], \notag
\end{align}

where

$$M_{me}(\sigma) = \frac{1-2\,me^{k+j+2}}{k+j+2}+\frac{2\,me^{k+j+2}-me}{k+j+1}.$$
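Both deviation measures can also be approximated directly from their integral definitions (14) and (15). The sketch below is a numerical illustration under assumed parameter values; it uses a midpoint rule rather than the series in $T_n$, and the helper names are hypothetical.

```python
import math


def guw_pdf(x, alpha, beta, lam, k):
    """PDF of equation (2), for 0 < x < 1."""
    odds = beta * x / (1.0 - x)
    dens = alpha * k * beta / (lam ** k * (1.0 - x) ** 2) * odds ** (alpha * k - 1.0)
    return dens * math.exp(-((odds ** alpha / lam) ** k))


def guw_quantile(p, alpha, beta, lam, k):
    """Quantile function, used here to obtain the median."""
    g = lam ** (1.0 / alpha) * (-math.log(1.0 - p)) ** (1.0 / (alpha * k))
    return g / (beta + g)


def integrate01(fn, n=20000):
    """Midpoint rule on (0, 1)."""
    return sum(fn((i + 0.5) / n) for i in range(n)) / n


def abs_deviation(center, **p):
    """E|X - center| computed by numerical integration."""
    return integrate01(lambda x: abs(x - center) * guw_pdf(x, **p))


if __name__ == "__main__":
    # Illustrative parameter values (not taken from the paper).
    p = dict(alpha=2.0, beta=1.0, lam=1.0, k=1.5)
    mu = integrate01(lambda x: x * guw_pdf(x, **p))  # mean
    me = guw_quantile(0.5, **p)                      # median
    print("mad(mu):", abs_deviation(mu, **p))
    print("MD(me): ", abs_deviation(me, **p))
```

Because the median minimizes the expected absolute deviation, MD(me) should never exceed mad(µ), which gives a quick sanity check on the computed values.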
4. Actuarial measures

This section covers the theoretical and practical elements of several essential risk measures for the new distribution. These include the Value at Risk (VaR), the Tail Value at Risk (TVaR), the Tail Variance (TV), and the Tail Variance Premium (TVP).

4.0.1. VaR measure

The VaR of the GUW distribution is defined by:

$$VaR_q = \frac{\lambda^{\frac{1}{\alpha}}[-\ln(1-q)]^{\frac{1}{\alpha k}}}{\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-q)]^{\frac{1}{\alpha k}}}.$$

Proof. The VaR of a random variable is the quantile of
where

$$TV_q = \frac{1}{1-q}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\,IV'(\sigma) - (TVaR_q)^{2}$$

and

$$TVaR_q = \frac{1}{1-q}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{i=0}^{\alpha mk-1}\sum_{j=0}^{\infty}T_n\,IV(\sigma)$$
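The risk measures can be approximated numerically from the quantile function. The sketch below assumes the standard definitions TVaR_q = (1/(1−q)) ∫_q^1 Q(u) du and TV_q = (1/(1−q)) ∫_q^1 Q(u)² du − (TVaR_q)², which have the same structure as the expressions above; the terms IV(σ) and IV'(σ) are not evaluated here. Function names and parameter values are illustrative assumptions.

```python
import math


def guw_quantile(p, alpha, beta, lam, k):
    """Quantile function of the GUW distribution; VaR_q = Q(q)."""
    g = lam ** (1.0 / alpha) * (-math.log(1.0 - p)) ** (1.0 / (alpha * k))
    return g / (beta + g)


def value_at_risk(q, **p):
    """VaR at level q, i.e. the q-quantile."""
    return guw_quantile(q, **p)


def tail_measures(q, n=20000, **p):
    """Return (TVaR_q, TV_q) by averaging Q(u) and Q(u)^2 over u in (q, 1).

    Assumes the standard tail definitions; a midpoint rule keeps u < 1.
    """
    values = [guw_quantile(q + (1.0 - q) * (i + 0.5) / n, **p) for i in range(n)]
    tvar = sum(values) / n
    second_moment = sum(v * v for v in values) / n
    return tvar, second_moment - tvar ** 2


if __name__ == "__main__":
    # Illustrative parameter values (not taken from the paper).
    p = dict(alpha=2.0, beta=1.0, lam=1.0, k=1.5)
    for q in (0.90, 0.95, 0.99):
        tvar, tv = tail_measures(q, **p)
        print(q, value_at_risk(q, **p), tvar, tv)
```

Integrating over the probability scale u rather than over x avoids locating the integration limit VaR_q on the x-axis and keeps the integrand bounded in (0, 1).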
5. Estimation
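Let $x_1, x_2, \ldots, x_m$ be a random sample of size $m$ from the variable $X$. Employing the PDF provided in (2), the likelihood function may be expressed as follows:

$$L(\alpha,\beta,\lambda,k) = \prod_{j=1}^{m} f(x_j).$$

Taking logarithms, we obtain the log-likelihood:

\begin{align*}
\ell(\alpha,\beta,\lambda,k) &= \sum_{j=1}^{m}\ln\left[\frac{\alpha k \beta}{\lambda^{k}(1-x_j)^{2}}\left(\frac{\beta x_j}{1-x_j}\right)^{\alpha k-1} e^{-\left[\frac{\left(\frac{\beta x_j}{1-x_j}\right)^{\alpha}}{\lambda}\right]^{k}}\right]\\
&= \sum_{j=1}^{m}\ln(\alpha k) - k\sum_{j=1}^{m}\ln(\lambda) + \sum_{j=1}^{m}\ln(\beta) - 2\sum_{j=1}^{m}\ln(1-x_j) + (\alpha k-1)\sum_{j=1}^{m}\ln\left(\frac{\beta x_j}{1-x_j}\right) - \sum_{j=1}^{m}\left[\frac{\left(\frac{\beta x_j}{1-x_j}\right)^{\alpha}}{\lambda}\right]^{k}.
\end{align*}

Writing $u_j = \ln\left(\frac{\beta x_j}{1-x_j}\right)$ for brevity, the last term can equivalently be written as $\frac{1}{\lambda^{k}}\sum_{j=1}^{m}e^{k\alpha u_j}$.

The maximum likelihood estimators $\hat\alpha$, $\hat\beta$, $\hat\lambda$, and $\hat k$ are obtained by setting the first partial derivatives of $\ell$ with respect to each parameter equal to zero. These derivatives are shown below:

$$\frac{\partial \ell}{\partial \alpha} = \frac{m}{\alpha} + k\sum_{j=1}^{m}u_j - \frac{k}{\lambda^{k}}\sum_{j=1}^{m}u_j\,e^{k\alpha u_j},$$

$$\frac{\partial \ell}{\partial \beta} = \frac{m}{\beta} + \frac{m(\alpha k-1)}{\beta} - \frac{\alpha k\,\beta^{\alpha k-1}}{\lambda^{k}}\sum_{j=1}^{m}\left(\frac{x_j}{1-x_j}\right)^{\alpha k},$$

$$\frac{\partial \ell}{\partial k} = \frac{m}{k} - m\ln(\lambda) + \alpha\sum_{j=1}^{m}u_j - \frac{\alpha}{\lambda^{k}}\sum_{j=1}^{m}u_j\,e^{k\alpha u_j} + \ln(\lambda)\,e^{-k\ln(\lambda)}\sum_{j=1}^{m}e^{k\alpha u_j},$$

$$\frac{\partial \ell}{\partial \lambda} = -\frac{mk}{\lambda} + k\lambda^{-k-1}\sum_{j=1}^{m}e^{k\alpha u_j}.$$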
Now that we have established the explicit formulas for the partial derivatives of the log-likelihood with respect to each parameter, it becomes evident that solving the system analytically is not feasible. The complexity of the equations and the interdependence of the parameters prevent closed-form solutions, so numerical methods are required to approximate the solutions iteratively.

6. A new quantile regression model

Parametric quantile regression has recently gained popularity due to its robustness in modeling asymmetric data or data with extreme values. This type of regression is also effective in dealing with asymmetric and heavy-tailed response variables defined on the interval (0, 1). To implement these regressions, it is necessary to re-parameterize the probability density functions (PDFs) of the distribution in terms of quantiles, to get the quantile PDF ([29], [30], [31], [32], [4]).

To formulate the quantile regression model of the GUW distribution, we first make the parameter $\beta$ the subject of the quantile function of the GUW distribution. Then we substitute it into the CDF and PDF. Thus, after simplification, we obtain the quantile cumulative distribution function (QCDF) and quantile probability density function (QPDF) of the GUW distribution.

Setting $Q(p,\sigma) = \mu$:

$$\mu = \frac{\lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}}{\beta + \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}},$$

$$\mu\beta + \mu\,\lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}} = \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}},$$

$$\mu\beta = \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}(1-\mu),$$

so

$$\beta = G(p)\,\frac{1-\mu}{\mu},$$

where

$$G(p) = \lambda^{\frac{1}{\alpha}}[-\ln(1-p)]^{\frac{1}{\alpha k}}.$$

The QCDF and QPDF are then:

$$QF(y,\alpha,\beta,\lambda,\mu,k,p) = 1 - e^{-\left[\frac{\left(G(p)\left(\frac{1-\mu}{\mu}\right)\frac{y}{1-y}\right)^{\alpha}}{\lambda}\right]^{k}}, \quad y \in\, ]0;1[$$

$$Qf(y,\alpha,\beta,\lambda,k,\mu,p) = \frac{\alpha k\,G(p)\left(\frac{1-\mu}{\mu}\right)}{\lambda^{k}(1-y)^{2}}\left(G(p)\left(\frac{1-\mu}{\mu}\right)\frac{y}{1-y}\right)^{\alpha k-1} e^{-\left[\frac{\left(G(p)\left(\frac{1-\mu}{\mu}\right)\frac{y}{1-y}\right)^{\alpha}}{\lambda}\right]^{k}}, \quad y \in\, ]0;1[$$

where $\mu \in (0,1)$ and $p \in (0,1)$. Figures (15), (16), (17), and (18) show the plots of the QCDFs and QPDFs for different quantiles and parameter values. QPDFs come in many shapes, including left- and right-skewed, decreasing, increasing, symmetrical, J-shaped, and bathtub-shaped. This shows that the regression model developed from this PDF is flexible enough to deal with unit-interval data with such properties.
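The re-parameterization can be sketched in a few lines of Python. The code below substitutes β = G(p)(1 − µ)/µ into the CDF and PDF and checks the defining property that the QCDF evaluated at y = µ equals p. The function names and numerical values are assumptions made for illustration.

```python
import math


def g_of_p(p, alpha, lam, k):
    """G(p) = lam^(1/alpha) * [-ln(1 - p)]^(1/(alpha*k))."""
    return lam ** (1.0 / alpha) * (-math.log(1.0 - p)) ** (1.0 / (alpha * k))


def beta_from_quantile(mu, p, alpha, lam, k):
    """Value of beta for which the p-th quantile equals mu."""
    return g_of_p(p, alpha, lam, k) * (1.0 - mu) / mu


def qcdf(y, mu, p, alpha, lam, k):
    """Quantile-parameterized CDF (QCDF), 0 < y < 1."""
    odds = beta_from_quantile(mu, p, alpha, lam, k) * y / (1.0 - y)
    return 1.0 - math.exp(-((odds ** alpha / lam) ** k))


def qpdf(y, mu, p, alpha, lam, k):
    """Quantile-parameterized PDF (QPDF), 0 < y < 1."""
    beta = beta_from_quantile(mu, p, alpha, lam, k)
    odds = beta * y / (1.0 - y)
    dens = alpha * k * beta / (lam ** k * (1.0 - y) ** 2) * odds ** (alpha * k - 1.0)
    return dens * math.exp(-((odds ** alpha / lam) ** k))


if __name__ == "__main__":
    # Illustrative values (not taken from the paper).
    shape = dict(alpha=2.0, lam=1.3, k=1.5)
    for p in (0.25, 0.50, 0.75):
        print(p, qcdf(0.6, 0.6, p, **shape))  # second column reproduces p
```

Since G(p)^{αk}/λ^k = −ln(1 − p), the QCDF also simplifies to 1 − (1 − p)^{[(1−µ)y/(µ(1−y))]^{αk}}, which makes the property QF(µ) = p immediate.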
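6.1. Estimation of regression parameters

\begin{align}
\ell(\hat\alpha,\hat\beta,\hat\lambda,\hat k,\hat p) &= \sum_{i=1}^{n}\ln(\alpha k) - k\sum_{i=1}^{n}\ln(\lambda) + \sum_{i=1}^{n}\ln(G(p)) + \sum_{i=1}^{n}\ln\left(\frac{1-\mu_i}{\mu_i}\right) - 2\sum_{i=1}^{n}\ln(1-y_i) + (\alpha k-1)\sum_{i=1}^{n}\ln(G(p)) \tag{16}\\
&\quad + (\alpha k-1)\sum_{i=1}^{n}\ln\left(\frac{1-\mu_i}{\mu_i}\right) + (\alpha k-1)\sum_{i=1}^{n}\ln(y_i) - (\alpha k-1)\sum_{i=1}^{n}\ln(1-y_i) - \sum_{i=1}^{n}\left[\frac{\left(G(p)\left(\frac{1-\mu_i}{\mu_i}\right)\frac{y_i}{1-y_i}\right)^{\alpha}}{\lambda}\right]^{k} \tag{17}
\end{align}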
To get parameter estimates, we set the components of the score vector to zero and solve the resulting system of equations simultaneously. To fit the median regression, we put $p = 0.50$ in equation (17) and maximize the log-likelihood function. The standard errors of the parameter estimates are computed from the large-sample properties of the ML method. As per [33], the Fisher information matrix for parameter standard error estimation is:

$$I(\hat\eta) = -\left.\frac{\partial^{2}\ell(\eta\mid y)}{\partial\eta^{T}\partial\eta}\right|_{\eta=\hat\eta}$$

6.2. Regression modeling for educational data

In this section, we carry out an analysis of real data to compare the new regression with the Generalized Unit Half-Logistic Geometric (GUHLG) regression model. The data can be accessed via the link https://stats.oecd.org/index.aspx?DataSetCode=BLI [32]. They include three variables for 35 OECD countries: level of education (expressed as a percentage, $y$), homicide rate (as a ratio, $x_1$), and life satisfaction (measured by the mean score of the Cantril scale, also known as the Self-Anchoring Striving Scale, $x_2$).

The regression model is formulated as follows:

$$\mathrm{logit}(\mu_i) = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2$$

where $\mu_i$ represents the median for the GUW and GUHLG distributions. We calculate maximum likelihood estimates (MLE), associated standard errors, and estimated log-likelihood values for all models, as shown in table (1). This reveals that only the coefficients $\gamma_0$ and $\gamma_1$ of the GUW model are significant at the 0.05 threshold, while for the GUHLG model, none of the coefficients is significant. We also observe a negative relationship between the level of education (expressed as a percentage) and the country's homicide rate, but a positive relationship between the level of education and life satisfaction. These results suggest that an increase in life satisfaction is associated with an increase in the percentage of educational achievement, while an increase in the homicide rate corresponds to a decrease in the percentage of educational achievement.
                 GUW                               GUHLG
      Estimate   SE       p-value      Estimate     SE         p-value
γ0    12.8942    6.3630   0.0427*      -1.720182    4.454259   0.699
γ1    -2.3384    1.0181   0.0216*       0.010572    0.653999   0.987
γ2     0.2337    0.1547   0.1309       -0.001193    0.076603   0.988
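The median regression described in this section can be sketched as a negative log-likelihood built from the QPDF with a logit link. The code below is a minimal illustration, not the authors' implementation: the data arrays are hypothetical placeholders (not the OECD data), the parameter vector is arbitrary, and the function names are assumptions.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def qpdf(y, mu, p, alpha, lam, k):
    """Quantile-parameterized GUW density with beta = G(p) * (1 - mu) / mu."""
    g = lam ** (1.0 / alpha) * (-math.log(1.0 - p)) ** (1.0 / (alpha * k))
    beta = g * (1.0 - mu) / mu
    odds = beta * y / (1.0 - y)
    dens = alpha * k * beta / (lam ** k * (1.0 - y) ** 2) * odds ** (alpha * k - 1.0)
    return dens * math.exp(-((odds ** alpha / lam) ** k))


def neg_loglik(theta, y, x1, x2, p=0.5):
    """Negative log-likelihood; theta = (g0, g1, g2, alpha, lam, k)."""
    g0, g1, g2, alpha, lam, k = theta
    total = 0.0
    for yi, a, b in zip(y, x1, x2):
        mu_i = inv_logit(g0 + g1 * a + g2 * b)  # p-th quantile of y_i (median if p = 0.5)
        total -= math.log(qpdf(yi, mu_i, p, alpha, lam, k))
    return total


if __name__ == "__main__":
    # Hypothetical placeholder data, for illustration only.
    y = [0.62, 0.75, 0.81, 0.55, 0.90]
    x1 = [1.2, 0.8, 0.5, 2.0, 0.3]
    x2 = [6.1, 6.8, 7.2, 5.9, 7.5]
    theta0 = (0.0, -0.5, 0.2, 1.5, 1.0, 1.2)
    print("negative log-likelihood at theta0:", neg_loglik(theta0, y, x1, x2))
```

In practice this objective would be passed to a numerical optimizer, and standard errors would be derived from the Hessian at the optimum, in line with the information matrix given in Section 6.1.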
8 DATA HANDLING
Table 2: Values for mean, mean bias, and RMSE of simulations for GUW distribution.
Model α β λ k
Rayleigh 1.597854 - - -
8.2. Dataset II
Model α β λ k
Rayleigh 7.613136 - - -
10 FUTURE WORK
After examining tables (3), (4), (5), and (6), as well as figures (19) and (20), we can conclude that the GUW model fits datasets I and II better than the competing models. Its flexibility enables it to adapt to domains as varied as materials engineering and unemployment insurance data.