Ngesa Et Al 2014 A Flexible Random Effects Distribution in Disease Mapping Models
(2014) 48, 83–93
Key words: Spatial statistics, Disease mapping, Bayesian analysis, Random effects, Generalized
Gaussian distribution
Summary: Disease mapping has seen many applications in epidemiology and public health. The
basic model used in disease mapping is the Besag, York and Mollie model, which incorporates two
random effects, one which is spatially structured and the other random effect which is spatially un-
structured. The normality assumption on the spatially unstructured random effect is very common.
In this work, we investigate a more robust spatially unstructured random effect distribution by con-
sidering the symmetric generalized Gaussian distribution in the disease mapping problem. The dis-
tribution has the normal and Laplace distributions as special cases. The inferences under this model
are carried out under the Bayesian approach implemented in WinBUGS. The generalized Gaussian
distribution is introduced in WinBUGS using zero tricks. The usefulness of the proposed model is
investigated with a simulation study and applied to real data on tuberculosis in Kenya. We show
that the generalized Gaussian distribution can produce better results when the
normality assumption is violated because of high or low peakedness in the data. For the case
of data in which the random effects are truly normal, the generalized Gaussian distribution adjusts
to a normal distribution as dictated by the data itself.
AMS: 62H11, 62F15
84 NGESA, ACHIA & MWAMBI
1. Introduction
Disease mapping refers to the estimation and presentation of summary measures of spatially
observed health outcomes. The increased availability of georeferenced data and flexible
computational software has led to a rise in the application of disease mapping in epidemiology and
public health (Rezaeian, Dunn, St Leger and Appleby, 2007; Everitt and Dunn, 2011). Disease
mapping can be used to describe the geographical variation of diseases, identify clustering of
diseases and generate disease atlases. A number of statistical reviews of disease mapping are
available (Clayton and Bernardinelli, 1992; Smans and Esteve, 1997; Wakefield, Best and Waller,
2000; Wakefield, 2007; Manda, Feltbower and Gilthorpe, 2011).
The backbone model for univariate disease mapping is the Besag, York and Mollie (BYM) model
proposed by Besag, York and Mollie (1991). This model is a form of the generalized linear mixed ef-
fects model, with two random effects; a spatially unstructured random effect which is modelled using
a normal prior and a spatially structured random effect which is modelled using an intrinsic
conditional autoregressive (ICAR) prior. The normal distribution is used for the spatially
unstructured random effect mainly because of its computational simplicity, and the assumption of
normality for uncorrelated random effects is common. This assumption is sometimes incorrect,
because random effects can in fact be platykurtic, leptokurtic or skewed; see Box and Tiao (1973).
When the normality assumption is violated, other models that better suit the data at hand need to
be considered. The generalized Gaussian distribution can be used when the kurtosis deviates from
that of the normal distribution (kurtosis = 3) and when there is evidence of skewness in the data.
It generalizes the normal distribution to allow for these departures.
The generalized Gaussian distribution has two versions, both of which add a shape parameter to the
normal distribution. The first version includes the normal and Laplace distributions, and the
continuous uniform distribution arises naturally as a limiting case. All the distributions in this
family are symmetric. In the second version, the shape parameter is used to incorporate skewness in
the family of distributions: positive values of the shape parameter produce distributions that are
skewed to the left, while negative values lead to right-skewed distributions. In this work, we
concentrate on the symmetric version of the generalized Gaussian distribution. The generalized
Gaussian distribution, which we denote by GGD, has three parameters: the location parameter µ, the
scale parameter σ² and the shape parameter φ. The shape parameter dictates the amount of
peakedness or kurtosis.
This work is structured as follows: in section 2 we review the BYM model; in section 3 we
introduce the symmetric generalized Gaussian distribution and discuss its limiting distributions;
in section 4 we carry out a simulation to study the effect of misspecifying the random effects; in
section 5 we use the models discussed to analyze tuberculosis (TB) data from Kenya; and in
section 6 we give the discussion and conclusions.
Let $\lambda_i$ denote the unknown relative risk for region $i$ with respect to a standard
population. Also let $y_i$ denote the observed count of disease in region $i$ and $e_i$ denote the
expected count in the same region. The model assumes that the log of the relative risk of disease
can be broken down into a spatially structured component $u_i$ and a spatially unstructured
component $v_i$. This can be written mathematically as
\[ \log(\lambda_i) = u_i + v_i, \tag{2} \]
where $u_i$ and $v_i$ are random effects representing unobserved covariates: $u_i$ represents
variables that, if observed, would influence the spatial structure, while $v_i$ represents the
unobserved heterogeneity in region $i$. Besag et al. (1991) noted that in most cases one of the
random effects dominates the other. If $u$ is stronger than $v$, the estimated risk will show
spatial structure; if $v$ is stronger than $u$, the estimated means will be shrunk towards the
overall mean. Besag et al. (1991) assumed that $u$ and $v$ are independent with the following priors:
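\[ p(v \mid \tau) \propto \tau^{-n/2} \exp\left( -\frac{1}{2\tau} \sum_{i=1}^{n} v_i^2 \right), \tag{3} \]
and
\[ p(u \mid k) \propto k^{-n/2} \exp\left( -\frac{1}{2k} \sum_{i} \sum_{j \in N(i)} (u_i - u_j)^2 \right). \tag{4} \]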
Basically, equation (3) means that v, the spatially unstructured component is a white noise Gaussian
process with unknown variance τ, and equation (4) means that the spatially structured component
u, is a Gaussian Markov random field (GMRF) process with variance k, with n being the number of
regions under study.
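As a minimal sketch of how equations (3) and (4) can be evaluated, the following Python functions compute the two unnormalised log priors. The three-region map, the function names and the test values are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import math

def log_prior_unstructured(v, tau):
    """Unnormalised log of equation (3): white noise with variance tau."""
    n = len(v)
    return -0.5 * n * math.log(tau) - sum(x * x for x in v) / (2.0 * tau)

def log_prior_structured(u, k, neighbours):
    """Unnormalised log of equation (4), summing over i and over j in N(i)."""
    n = len(u)
    pairwise = sum((u[i] - u[j]) ** 2 for i in range(n) for j in neighbours[i])
    return -0.5 * n * math.log(k) - pairwise / (2.0 * k)

# Hypothetical map: three regions in a row (0 - 1 - 2).
neighbours = {0: [1], 1: [0, 2], 2: [1]}
smooth = [0.1, 0.1, 0.1]    # neighbouring regions have equal effects
rough = [0.5, -0.5, 0.5]    # neighbouring regions differ strongly

# The structured prior favours the smooth surface over the rough one.
print(log_prior_structured(smooth, 1.0, neighbours))
print(log_prior_structured(rough, 1.0, neighbours))
```

The structured prior depends on $u$ only through differences between neighbours, so adding a constant to every $u_i$ leaves it unchanged.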
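This implies that the conditional distribution of each $u_i$, given the rest, is
\[ (u_i \mid u_{-i}) \sim N\left( \frac{\sum_{j \in N(i)} u_j}{d_i}, \frac{k}{d_i} \right), \tag{5} \]
with
\[ E(u_i \mid u_{-i}) = \frac{\sum_{j \in N(i)} u_j}{d_i} \tag{6} \]
and
\[ \mathrm{Var}(u_i \mid u_{-i}) = \frac{k}{d_i}, \tag{7} \]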
where $N(i)$ and $d_i$ are respectively the set and the number of neighbours of region $i$. The
neighbourhood can be defined in terms of the Euclidean distance between the centroids of the
regions or by whether two regions share a border. This conditional distribution for $u$ is called
the intrinsic conditional autoregressive (ICAR) prior distribution. Besag et al. (1991) sampled the
posterior distribution using the Gibbs sampler, an McMC algorithm.
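The conditional mean and variance in equations (6) and (7) are simple to compute once a neighbourhood structure is fixed. The sketch below assumes a hypothetical three-region map with border-sharing neighbours; the function name and the numerical values are illustrative only.

```python
def icar_conditional(i, u, k, neighbours):
    """Mean and variance of u_i given all other u_j, equations (6) and (7)."""
    d_i = len(neighbours[i])                        # number of neighbours of region i
    mean = sum(u[j] for j in neighbours[i]) / d_i   # average of neighbouring effects
    variance = k / d_i                              # shrinks as d_i grows
    return mean, variance

# Hypothetical map: three regions in a row (0 - 1 - 2).
neighbours = {0: [1], 1: [0, 2], 2: [1]}
u = [0.2, 0.0, -0.4]

mean, variance = icar_conditional(1, u, 0.5, neighbours)
print(mean, variance)
```

A region with more neighbours has a smaller conditional variance, so its effect is tied more tightly to the local average.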
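Definition 1 A random variable $X$ is said to have a GGD if its probability density function is
given by
\[ f(x; \mu, \sigma, \phi) = \frac{1}{2\,\Gamma\!\left(1 + \frac{1}{\phi}\right) \zeta(\phi, \sigma)} \exp\left( -\left| \frac{x - \mu}{\zeta(\phi, \sigma)} \right|^{\phi} \right), \tag{8} \]
where $x, \mu \in \mathbb{R}$, $\sigma > 0$ and
\[ \zeta(\phi, \sigma) = \left[ \frac{\sigma^2\, \Gamma\!\left(\frac{1}{\phi}\right)}{\Gamma\!\left(\frac{3}{\phi}\right)} \right]^{1/2}. \]
In this expression $\zeta(\phi, \sigma)$ is a scaling factor.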
See Nadarajah (2005) for further discussions on statistical properties of this distribution.
Equation (9) is the Laplace probability density function with location parameter µ and scale param-
eter b.
Equation (10) is the probability density function of a normal random variable with mean µ and
variance σ².
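The two special cases can be checked numerically from the density in Definition 1. The sketch below uses only the Python standard library; the function names are illustrative, and the Laplace scale $b = \sigma / \sqrt{2}$ is the value obtained by setting $\phi = 1$ in $\zeta(\phi, \sigma)$.

```python
import math

def ggd_scale(phi, sigma):
    """Scaling factor zeta(phi, sigma) from Definition 1."""
    return math.sqrt(sigma ** 2 * math.gamma(1.0 / phi) / math.gamma(3.0 / phi))

def ggd_pdf(x, mu, sigma, phi):
    """Density of the symmetric generalized Gaussian distribution, equation (8)."""
    zeta = ggd_scale(phi, sigma)
    norm_const = 2.0 * math.gamma(1.0 + 1.0 / phi) * zeta
    return math.exp(-abs((x - mu) / zeta) ** phi) / norm_const

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def laplace_pdf(x, mu, b):
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

mu, sigma = 0.0, 1.5
for x in (-2.0, 0.0, 0.7):
    # phi = 2 should reproduce the normal density,
    # phi = 1 the Laplace density with b = sigma / sqrt(2).
    print(x,
          ggd_pdf(x, mu, sigma, 2.0) - normal_pdf(x, mu, sigma),
          ggd_pdf(x, mu, sigma, 1.0) - laplace_pdf(x, mu, sigma / math.sqrt(2.0)))
```

Both differences are zero up to floating-point error, which agrees with the statement that the normal and Laplace distributions are special cases of the GGD.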
4. Simulation
In this section, we carry out a simulation study to determine the effect of wrongly specifying the dis-
tribution of the random effect in a BYM model. Three scenarios were considered in the simulation.
In the first simulation, the data set is generated through a random effect v with a peaked
distribution (high kurtosis), as follows. Assume that there are 60 geographical regions, that Oi
is the number of disease cases observed in region i and that Ei is the corresponding expected
count in that region. Without loss of generality, we further assume that no covariates are
available for use.
4. Step 4: Calculate λ = E × θ
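One way such a data set could be generated is sketched below in Python. Only the number of regions (60) and the relation λ = E × θ come from the text; the range of the expected counts, the Laplace random effect (a peaked special case of the GGD), the link θ = exp(v), the value of sigma and all function names are assumptions made for illustration, not the authors' procedure.

```python
import math
import random

def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for moderate lam."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate_counts(n_regions=60, sigma=0.3, seed=1):
    rng = random.Random(seed)
    b = sigma / math.sqrt(2.0)    # Laplace scale that gives variance sigma^2
    # Assumed expected counts E_i, drawn uniformly between 5 and 50.
    expected = [rng.uniform(5.0, 50.0) for _ in range(n_regions)]
    # Peaked random effect: the difference of two Exp(1) draws is standard Laplace.
    v = [b * (rng.expovariate(1.0) - rng.expovariate(1.0)) for _ in range(n_regions)]
    theta = [math.exp(x) for x in v]                  # assumed link: theta_i = exp(v_i)
    lam = [e * t for e, t in zip(expected, theta)]    # Step 4: lambda = E x theta
    observed = [sample_poisson(rng, l) for l in lam]  # O_i ~ Poisson(lambda_i)
    return expected, v, lam, observed

expected, v, lam, observed = simulate_counts()
print(observed[:5])
```

Fixing the seed makes the simulated data set reproducible, which is convenient when the same data are fitted under different random effect distributions.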
We fitted two Bayesian hierarchical models for the data set. The models were specified based on
different assumptions on the random effects as follows:
\[ O_i \sim \mathrm{Poisson}(\mu_i) \tag{11} \]
and
\[ \log(\mu_i) = \log(E_i) + v_i \tag{12} \]
with
\[ O_i \sim \mathrm{Poisson}(\mu_i) \tag{14} \]
with
where O is the observed count of TB cases and E is the expected count of TB cases. Model
estimation was carried out using a Bayesian approach. All parameters in the models were assigned
prior distributions. In this analysis, a non-informative normal prior was assigned to the fixed
effect coefficient β0, the shape parameter φ was given a diffuse uniform prior, and the variance
parameters were assigned inverse gamma distributions. The models were implemented using WinBUGS
version 1.4 (Spiegelhalter, Thomas, Best and Lunn, 2007; Ntzoufras, 2011). For each model, 50,000
Markov chain Monte Carlo (McMC) iterations were run, with the initial 10,000 discarded as burn-in
and every tenth sample value kept thereafter. The remaining 4,000 iterations were used for
assessing convergence of the McMC and for parameter estimation. We assessed McMC convergence of
all model parameters by checking trace plots and autocorrelation plots of the McMC output; see
Gelman, Carlin, Stern and Rubin (2003). The models were compared using the Deviance
Information Criterion (DIC) as suggested by Spiegelhalter, Best, Carlin and Van Der Linde (2002).
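The burn-in and thinning scheme described above can be checked with a short Python sketch. The chain here is a stand-in list of iteration indices rather than real McMC output, and the function name is illustrative.

```python
def thin_chain(chain, burn_in, thin):
    """Drop the first burn_in draws, then keep every thin-th draw."""
    return chain[burn_in::thin]

n_iterations = 50_000
burn_in = 10_000
thin = 10

chain = list(range(n_iterations))   # placeholder for the sampled values
kept = thin_chain(chain, burn_in, thin)
print(len(kept))                    # 4000 retained draws
```

Thinning reduces the autocorrelation between retained draws at the cost of discarding most of the post-burn-in iterations.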
The best fitting model is the one with the smallest DIC value. In this analysis, the model in
which the unstructured heterogeneity follows the generalized Gaussian distribution was found to
perform slightly better than the other models considered in this study, as the DIC values in
Table 2 show. Figure 1 shows the spatial distribution of TB in Kenya based on this best fitting
model: a map of relative risk and its corresponding credible interval.
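For reference, the DIC of Spiegelhalter et al. (2002) is the posterior mean deviance plus the effective number of parameters pD, where pD is the posterior mean deviance minus the deviance at the posterior means. The Python sketch below computes it for a Poisson likelihood from posterior draws of the Poisson means. The toy counts and draws are invented for illustration, and evaluating the plug-in deviance at the posterior mean of µ is a simplifying assumption; WinBUGS makes its own choice of parameterisation for this step.

```python
import math

def poisson_deviance(observed, mu):
    """Deviance D = -2 * log-likelihood of independent Poisson counts."""
    loglik = sum(o * math.log(m) - m - math.lgamma(o + 1)
                 for o, m in zip(observed, mu))
    return -2.0 * loglik

def dic(observed, mu_draws):
    """Return (DIC, pD, mean deviance) from posterior draws of the Poisson means."""
    deviances = [poisson_deviance(observed, mu) for mu in mu_draws]
    mean_dev = sum(deviances) / len(deviances)
    n_draws = len(mu_draws)
    mu_bar = [sum(draw[i] for draw in mu_draws) / n_draws
              for i in range(len(observed))]
    p_d = mean_dev - poisson_deviance(observed, mu_bar)   # effective number of parameters
    return mean_dev + p_d, p_d, mean_dev

# Toy example: two regions, three hypothetical posterior draws.
observed = [3, 4]
mu_draws = [[2.0, 3.0], [4.0, 5.0], [3.0, 4.5]]
dic_value, p_d, mean_dev = dic(observed, mu_draws)
print(dic_value, p_d)
```

When two models are fitted to the same data, the one with the smaller DIC is preferred.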
Figure 1: TB relative risk map (a) and the corresponding 95% lower (b) and upper (c) credible
limit maps, produced by model 2.
6. Discussion
The routine framework for modelling correlated data is the generalized linear mixed effects
model, in which a random effect is incorporated. The usual assumption in the standard version
of this setup is that the between-subject variation is modelled with normally distributed random
effects. This assumption has been both challenged and supported by several authors (McCulloch and
Neuhaus, 2011; Litière, Alonso and Molenberghs, 2007; Litière, Alonso and Molenberghs, 2008).
Considerable work has been done recently on finding better fitting distributions. Magder and
Zeger (1996) proposed a smooth non-parametric maximum likelihood approach to modelling the random
effects. Verbeke and Lesaffre (1997) proposed using a mixture of normal distributions for the
random effects and carried out their estimation using the expectation maximization (EM)
algorithm. Zhang and Davidian (2001) proposed a semi-parametric linear mixed model in which the
random effects are assumed to have a smooth density represented by a semi-nonparametric truncated
series expansion. Ho and Hu (2008) used a finite mixture of normal distributions in a Bayesian
setting, with the number of components estimated automatically from the data.
In the disease mapping context, the same situation arises. The basic BYM model has two
components, one spatially structured and the other spatially unstructured. The spatially
unstructured component is usually modelled using the normal distribution. In this work we propose
the generalized Gaussian distribution as a random effect distribution to replace the
over-restrictive normal distribution for the unstructured heterogeneity. The special cases of the
generalized Gaussian distribution, including the normal and Laplace distributions, are presented.
The generalized Gaussian distribution has an extra parameter to allow for high and low peakedness
as dictated by the data.
The parameters in the models are estimated under Bayesian inference. The models were im-
plemented in WinBUGS. The generalized Gaussian distribution is not a standard distribution in the
WinBUGS software. We introduced this distribution in the software using zero tricks, see Appendix
B. The models were compared using simulation studies and again with a real data set.
In the simulation study, the effect of misspecifying the random effects was larger when the
normal distribution was used in place of the generalized Gaussian distribution than when the
generalized Gaussian distribution was used in place of the normal distribution. The generalized
Gaussian distribution shares the attractive properties of the normal distribution; in fact, the
normal distribution is a special case of it. When the random effect distribution departs from
normality because of peakedness, the generalized Gaussian distribution can capture this, which
the normal distribution cannot.
In the comparison on the real data set, the generalized Gaussian model performed better than the
normal model. This model was used to produce county-specific maps of the relative risk of TB in
Kenya. Such maps are critical in understanding disease epidemiology and in helping policy makers
to develop informed intervention programs and allocate scarce resources adequately.
One limitation of this model is that it only captures high and low peakedness departures from
the normal distribution. It assumes that the random effects are symmetric. This assumption can at
times also be wrong. More flexible random effects models, which can also capture skewness can be
investigated.
Appendices
A Tuberculosis data
Table 3: Number of TB cases reported for each county and the corresponding population size.
County           TB cases  Population    County         TB cases  Population
Baringo               572      461175    Mandera             993      286006
Bomet                 729      437321    Marsabit            989      194960
Bungoma              1343     1134381    Meru               2380     1221068
Busia                1262      618068    Migori             1974      746904
Elgeyo Marakwet       395      326798    Mombasa            5889      755867
Embu                 1145      497662    Muranga            1541      808488
Garissa              1000      480489    Nairobi           15979     2495170
Homa Bay             3159      829355    Nakuru             3413     1354899
Isiolo                611      107741    Nandi               720      659957
Kajiado               613      461174    Narok               610      604298
Kakamega             1979     1454722    Nyamira             763      548053
Kericho              1936      890544    Nyandarua           639      526742
Kiambu               2638     1523061    Nyeri              1536      722739
Kilifi               1521      923837    Samburu             340      163001
Kirinyaga             721      502243    Siaya              2034      790555
Kisii                2105     1052456    Taita Taveta        569      279951
Kisumu               4753      882705    Tana River          278      195965
Kitui                2166      908106    Tharaka-Nithi      1053      338616
Kwale                 945      559901    Trans Nzoia         876      652005
Laikipia              344      365759    Turkana            1340      516833
Lamu                  111       83985    Uasin Gishu        2384      707664
Machakos             2474     1005586    Vihiga              515      561538
Makueni              1119      856800    Wajir               743      377527
West Pokot            915      349857
likelihood of the prior distribution that we are interested in, the GGD, that is λ = −log(GGD(θ)).
This is the prior distribution that we wanted. The corresponding code in WinBUGS is
given below.
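The identity behind the zero trick can be checked numerically. The following Python sketch is not the WinBUGS listing referred to above; it only illustrates that a Poisson observation of zero with rate λ = −log(GGD(θ)) + C contributes a likelihood proportional to the GGD density. The constant C, which keeps the Poisson rate positive, and all function names are assumptions made for this illustration.

```python
import math

def ggd_log_pdf(x, mu, sigma, phi):
    """Log density of the symmetric GGD with the scaling factor zeta(phi, sigma)."""
    log_zeta = math.log(sigma) + 0.5 * (math.lgamma(1.0 / phi) - math.lgamma(3.0 / phi))
    zeta = math.exp(log_zeta)
    return (-math.log(2.0) - math.lgamma(1.0 + 1.0 / phi) - log_zeta
            - abs((x - mu) / zeta) ** phi)

def zero_trick_rate(x, mu, sigma, phi, const):
    """Poisson rate lambda = -log(GGD) + C used for the pseudo-observation 0."""
    return -ggd_log_pdf(x, mu, sigma, phi) + const

def poisson_pmf_at_zero(rate):
    """P(Y = 0) for Y ~ Poisson(rate)."""
    return math.exp(-rate)

x, mu, sigma, phi, const = 0.3, 0.0, 1.0, 1.5, 10.0
rate = zero_trick_rate(x, mu, sigma, phi, const)

# The Poisson likelihood of a zero equals the GGD density times exp(-C).
print(poisson_pmf_at_zero(rate))
print(math.exp(ggd_log_pdf(x, mu, sigma, phi)) * math.exp(-const))
```

Because exp(−C) does not depend on the parameters, it cancels in the posterior, so the sampler effectively uses the GGD as the prior for the random effect.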
References
Besag, J., York, J., and Mollie, A. (1991). Bayesian image restoration with two applications
in spatial statistics (with discussion). Ann Inst Stat Math., 43, 1–59.
Best, N., Thomas, A., Waller, L., Conlon, E., and Arnold, R. (1999). Bayesian models
for spatially correlated disease and exposure data. In Bayesian Statistics 6: Proceedings of the
Sixth Valencia International Meeting, volume 6. Oxford University Press, USA, pp. 131–156.
Box, G. and Tiao, G. (1973). Bayesian inference in statistical analysis. Addison-Wesley Pub.
Co.
Clayton, D. and Bernardinelli, L. (1992). Bayesian methods for mapping disease risk:
Geographical and environmental epidemiology, methods for small-area studies. Oxford University
Press, Oxford.
Everitt, B. and Dunn, G. (2011). Applied Multivariate Analysis. 2001. Arnold, London.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2003). Bayesian data analysis. Chapman
& Hall/CRC.
Ho, R. K. and Hu, I. (2008). Flexible modelling of random effects in linear mixed models: a
Bayesian approach. Computational Statistics & Data Analysis, 52 (3), 1347–1361.
Litière, S., Alonso, A., and Molenberghs, G. (2007). Type I and type II error under
random-effects misspecification in generalized linear mixed models. Biometrics, 63 (4), 1038–1044.
Litière, S., Alonso, A., and Molenberghs, G. (2008). The impact of a misspecified
random-effects distribution on the estimation and the performance of inferential procedures in
generalized linear mixed models. Statistics in Medicine, 27 (16), 3125–3144.
Magder, L. S. and Zeger, S. L. (1996). A smooth nonparametric estimate of a mixing
distribution using mixtures of Gaussians. Journal of the American Statistical Association, 91 (435),
1141–1151.
Manda, S. M., Feltbower, R. G., and Gilthorpe, M. S. (2011). Review and empirical
comparison of joint mapping of multiple diseases. Southern African Journal of Epidemiology
and Infection, 27 (4), 169–182.
McCulloch, C. E. and Neuhaus, J. M. (2011). Misspecifying the shape of a random effects
distribution: why getting it wrong may not matter. Statistical Science, 26 (3), 388–402.
Nadarajah, S. (2005). A generalized normal distribution. Journal of Applied Statistics, 32 (7),
685–694.
Ntzoufras, I. (2011). Bayesian modeling using WinBUGS, volume 698. Wiley.
Rezaeian, M., Dunn, G., St Leger, S., and Appleby, L. (2007). Geographical epidemiology,
spatial analysis and geographical information systems: a multidisciplinary glossary. J
Epidemiol Commun H, 61 (2), 98–102.
Smans, M. and Esteve, J. (1997). Practical approaches to disease mapping: Geographical and
Environmental Epidemiology, Methods for Small area studies. Oxford University Press.
Spiegelhalter, D., Best, N., Carlin, B., and Van Der Linde, A. (2002). Bayesian
measures of model complexity and fit. Journal of the Royal Statistical Society: Series B
(Statistical Methodology), 64 (4), 583–639.
Spiegelhalter, D., Thomas, A., Best, N., and Lunn, D. (2007). WinBUGS user manual
version 1.4, January 2003. Upgraded to version 1.4.3.
Verbeke, G. and Lesaffre, E. (1997). The effect of misspecifying the random-effects
distribution in linear mixed models for longitudinal data. Computational Statistics & Data Analysis,
23 (4), 541–556.
Wakefield, J. (2007). Disease mapping and spatial regression with count data. Biostatistics, 8 (2),
158–183.
Wakefield, J., Best, N., and Waller, L. (2000). Bayesian approaches to disease mapping,
Spatial epidemiology: methods and applications. Oxford: Oxford University Press.
Zhang, D. and Davidian, M. (2001). Linear mixed models with flexible distributions of random
effects for longitudinal data. Biometrics, 57 (3), 795–802.