
Permutation tests

Alejandra Cabaña
An example

The amount of water (in liters) obtained from 26 seeded clouds and 26 natural clouds in the same region/time has been recorded.

Do seeded clouds produce a significantly greater amount of rain?
[Figure: histograms of the amount of water from non-seeded clouds (noseed) and from seeded clouds (seed); density vs. liters, 0–140.]
Data are non-symmetric... should we trust the t-test?
If $X_i \sim N(\mu_X, \sigma_X^2)$ and $Y_i \sim N(\mu_Y, \sigma_Y^2)$, consider the statistic

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}}$$

where $\bar{X} = \frac{1}{n_X}\sum_{i=1}^{n_X} X_i$, $\bar{Y} = \frac{1}{n_Y}\sum_{i=1}^{n_Y} Y_i$, $S_X^2 = \frac{1}{n_X-1}\sum_{i=1}^{n_X}(X_i - \bar{X})^2$ and $S_Y^2 = \frac{1}{n_Y-1}\sum_{j=1}^{n_Y}(Y_j - \bar{Y})^2$.

If $\sigma_X^2 = \sigma_Y^2$ then $T \sim t_{n_X+n_Y-2}$, but otherwise the distribution of $T$ is unknown, and there are several classical approximations. If the data are non-normal, then the test based on $T$ and its approximated distribution need not be precise¹.

¹ Efron, 1977
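As a quick numerical illustration of this statistic, here is a minimal Python sketch (the lecture's own computations are in R; the function name `welch_t` and the toy data are illustrative, not from the slides):

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample statistic: (xbar - ybar) / sqrt(Sx^2/nx + Sy^2/ny),
    using the unbiased sample variances (divisor n - 1)."""
    nx, ny = len(x), len(y)
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / nx + variance(y) / ny)

# Toy data: the first sample has the smaller mean, so T is negative.
t = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0])
```

Note that no distributional assumption enters the computation of T itself; the assumptions only matter when a t distribution is used to calibrate it.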
x → non-seeded cloud data, y → seeded cloud data

> t.test(x,y, alternative="less")

Welch Two Sample t-test
data: x and y
t = -1.9351, df = 33.858, p-value = 0.03068
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
 -Inf -1.633874
sample estimates:
mean of x mean of y
 8.200123 21.159112
We can try to symmetrize the data
[Figure: histograms of log(noseed) and log(seed), and normal Q-Q plots of log(noseed) and log(seed).]


t.test(log(x),log(y),alternative="less")

Welch Two Sample t-test
data: log(x) and log(y)
t = -2.2602, df = 49.994, p-value = 0.0141
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
 -Inf -0.261692
sample estimates:
mean of x mean of y
 1.066551  2.078881

The conclusion remains roughly the same...


The problem in the previous example is that

• the sampling distribution is not known,

• the sample sizes are moderate,

• the question is "broad": are we interested in comparing means? dispersions? whole distributions (FX = FY)?

A reasonable option in this case is to perform a permutation test.


Permutation tests were among the very first statistical tests to be developed², but they were beyond the computing capacities of the 1930s. Permutation tests give a simple way to compute the sampling distribution of any test statistic whose distribution is invariant under permutations under the null hypothesis.

² Pitman (1937/38) developed exact permutation methods consistent with the Neyman–Pearson approach for the comparison of k samples and for bivariate correlation.
Permutation tests are also known as conditional tests.

• In a non-parametric context, we condition on the observed data: under the null hypothesis, all data have been obtained independently from a single distribution F. All permutations are equally likely, so we are performing a test conditioned on the observed values, without assuming anything about F.

• In a parametric context, we can condition on any sufficient statistic (the sample, for instance).

To estimate the sampling distribution of the test statistic, we randomly shuffle the exposures to make up a large enough sample of data sets.
            X1      X2      X3      Y1      Y2      X̄       M(X)    Ȳ = M(Y)
original    58.87   39.11   18.47   82.49   79.68   38.82   39.11   81.08
shuffle 1   39.11   82.49   79.68   58.87   18.47   67.09   79.68   38.67
shuffle 2   79.68   39.11   58.87   18.47   82.49   59.22   58.87   53.27
...

If the null hypothesis were true, the shuffled data sets should all have the same distribution, and thus statistics based on the new samples should look like the ones computed from the original data; otherwise they should look different.

• The permutation distribution of a test statistic T is obtained by repeatedly rearranging the observations.

• With two or more samples, all the observations are combined into a single large sample before rearranging them.

• There are no limitations upon the test statistic.
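Since the test statistic is unrestricted, a generic routine can simply take it as an argument. A Python sketch (function names and toy data are illustrative; the slides' own code is in R):

```python
import random

def perm_test(x, y, stat, B=999, seed=1):
    """Two-sample permutation test for an arbitrary statistic `stat`.

    Returns the left-tail p-value (1 + #{stat* <= stat_obs}) / (B + 1).
    """
    rng = random.Random(seed)
    z = list(x) + list(y)        # pool the two samples
    n = len(x)
    obs = stat(x, y)
    count = 0
    for _ in range(B):
        rng.shuffle(z)           # rearrange the pooled observations
        if stat(z[:n], z[n:]) <= obs:
            count += 1
    return (1 + count) / (B + 1)

def diff_means(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

# x clearly below y, so the left-tail p-value is small.
p = perm_test([1, 2, 3, 2, 1], [8, 9, 10, 9, 8], diff_means)
```

Any other statistic (difference of medians, a trimmed mean, ...) can be passed in unchanged; only `stat` varies.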


Obtaining p-values for the 2-sample problem

Let N = n + m and Z1, ..., ZN be the combined sample obtained by concatenating the X and Y samples.

1. From the joint sample Z, select a random subset of size n. Declare that the chosen elements belong to the X sample and the remaining m to the Y sample.
2. Compute the statistic of interest for this artificial pair of samples.
3. Repeat steps (1) and (2) a large number B of times.
4. From the B values computed above, extract an approximate p-value for the observed statistic (the one calculated with the original X and Y samples).
Back to the seeded clouds example: if we test H0 : µX = µY against H1 : µX < µY, we reject H0 when X̄ − Ȳ is small enough.

muestra=c(x,y)                    # pooled sample
rep=999                           # number of permutations B
original=mean(x)-mean(y)          # observed statistic
distrib=numeric(rep)
l=length(x)+length(y)
for(i in 1:rep){
  sam=sample(muestra,l)           # random rearrangement of the pooled sample
  newx=sam[1:length(x)]           # the first n values play the role of x
  newy=sam[(length(x)+1):l]       # the rest play the role of y
  distrib[i]=mean(newx)-mean(newy)
}
pval=sum(c(original,distrib)<=original)/(rep+1); pval
[1] 0.029

The t-test was not misleading! But this approach is more sound.
[Figure: permutation distribution of the difference in means (distrib) and of the difference in medians (mediana); frequency histograms.]
The permutation distribution

Before looking at more examples, let us see why this procedure works. We observe two independent samples

$X_1, \ldots, X_n \sim F_X$ and $Y_1, \ldots, Y_m \sim F_Y.$

Let us call Z the combined sample

$Z = \{X_1, \ldots, X_n, Y_1, \ldots, Y_m\},$

indexed by $\nu = \{1, 2, \ldots, n, n+1, \ldots, n+m\} = \{1, \ldots, N\}$, and $Z^* = (X^*, Y^*)$ a permutation (re-shuffling) of the original Z; that is, if $\pi$ is a permutation of the integers $\nu$, then $Z^*_i = Z_{\pi(i)}$, and we declare that the first n elements of $Z^*$ correspond to the $X^*$ sample and the remaining m to $Y^*$.

The number of possible partitions is $\binom{N}{n}$.

Under $H_0: F_X = F_Y$, any randomly chosen $Z^*$ has probability

$$\binom{N}{n}^{-1} = \frac{n!\,m!}{N!}$$

that is, if $F_X = F_Y$ all permutations are equally likely.
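For small samples the $\binom{N}{n}$ partitions can be enumerated exactly instead of sampled. A Python sketch using the toy numbers from the shuffling table (three X values, two Y values, so $\binom{5}{3} = 10$ equally likely partitions):

```python
from itertools import combinations
from math import comb

x = [58.87, 39.11, 18.47]      # toy X sample (n = 3)
y = [82.49, 79.68]             # toy Y sample (m = 2)
z = x + y
n, N = len(x), len(z)

# Every partition of Z into an X*-part of size n and a Y*-part of size m
# has the same probability n! m! / N! under H0.
diffs = [sum(z[i] for i in idx) / n -
         sum(z[i] for i in range(N) if i not in idx) / (N - n)
         for idx in combinations(range(N), n)]
```

With real sample sizes this enumeration quickly becomes infeasible, which is exactly why random permutations are used instead.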


If $\hat\theta(X, Y) = \hat\theta(Z, \nu)$ is a statistic, then the distribution of $\hat\theta^*$ (that is, the distribution of the replicates of the statistic) is

$$F_{\hat\theta^*}(t) = P(\hat\theta^* \le t) = \binom{N}{n}^{-1} \sum_{j=1}^{\binom{N}{n}} 1_{\{\hat\theta^*_j \le t\}}$$

If large values of $\hat\theta$ favour the alternative, we reject $H_0$ whenever $\hat\theta > c^*_{1-\alpha}$, where $c^*_{1-\alpha}$ is the $1-\alpha$ quantile of the distribution of $\hat\theta^*$. Similarly if the test is left-tailed or two-sided.
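The rejection rule can be read directly off the sorted replicates. A Python sketch (here the B = 999 replicates are stand-ins drawn from a normal distribution purely for illustration; in practice they would come from the permutations):

```python
import random

rng = random.Random(7)
B, alpha = 999, 0.05

# Stand-in permutation replicates of theta-hat* (normal draws for illustration).
theta_star = sorted(rng.gauss(0, 1) for _ in range(B))

# Empirical (1 - alpha) quantile: the (1 - alpha)(B + 1)-th order statistic.
c = theta_star[int((1 - alpha) * (B + 1)) - 1]

theta_obs = 2.5
reject = theta_obs > c          # right-tailed test: reject H0 when theta-hat > c*
```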
Computing the p-value of the test

If N is large (for the seeded clouds, $N! = 52! \approx 8.065818 \times 10^{67}$), it is excessive to consider all possible permutations, and we approximate the distribution and the p-values taking only B permutations³: "at least 99 and at most 999 random permutations should suffice".

The p-value of the test can be approximated by

$$\hat p = \frac{1 + \#\{\hat\theta^{(b)} \le \hat\theta\}}{B+1} = \frac{1 + \sum_{b=1}^{B} 1_{\{\hat\theta^{(b)} \le \hat\theta\}}}{B+1}$$

(sum(distrib<original)+1)/1000
[1] 0.029

Confidence interval for the true p: p ∈ (0.02891589, 0.02902617)

³ Davison and Hinkley (1997), Bootstrap Methods and their Application, p. 159, CUP.
Tests for H0 : F = G

From independent samples

$X_1, \ldots, X_n \sim F$ and $Y_1, \ldots, Y_m \sim G,$

the general null hypothesis $H_0: F = G$ can be tested using the Kolmogorov–Smirnov statistic

$$D = \sup_{1 \le i \le N} |F_n(z_i) - G_m(z_i)|,$$

rejecting $H_0$ for large values of D, and deciding what "large" means by means of permutations.

Alternatively, the Cramér–von Mises statistic can be used:

$$W = \frac{mn}{(m+n)^2} \left[ \sum_{i=1}^{n} (F_n(x_i) - G_m(x_i))^2 + \sum_{j=1}^{m} (F_n(y_j) - G_m(y_j))^2 \right]$$
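Deciding what "large" means by permutations takes only a few lines. A Python sketch (function names and toy data are illustrative; the slides use R for the actual clouds data):

```python
import random

def ecdf(sample, t):
    """Empirical cdf of `sample` evaluated at t."""
    return sum(v <= t for v in sample) / len(sample)

def ks_stat(x, y):
    """D = sup |F_n(z) - G_m(z)|, the sup taken over the pooled points."""
    return max(abs(ecdf(x, zi) - ecdf(y, zi)) for zi in x + y)

def ks_perm_pvalue(x, y, B=499, seed=3):
    """Calibrate D by permutations of the pooled sample."""
    rng = random.Random(seed)
    pooled, n = x + y, len(x)
    d_obs = ks_stat(x, y)
    count = 0
    for _ in range(B):
        rng.shuffle(pooled)
        if ks_stat(pooled[:n], pooled[n:]) >= d_obs:   # large D favours F != G
            count += 1
    return (1 + count) / (B + 1)

# Completely separated samples: D = 1, and few shuffles reproduce it.
p = ks_perm_pvalue([0.1, 0.2, 0.3, 0.4], [1.1, 1.2, 1.3, 1.4])
```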
[Figure: permutation distribution of the Kolmogorov–Smirnov statistic (distribks); frequency histogram.]

In this case, the p-value is 0.083, which is not particularly low.

The conclusion now is that there is no evidence that the distributions are different.
Another example

A diver has agreed with the boat crew that, in case of danger, he will transmit a binary message ...01010101010101...
When the diver does not transmit, the boat receives background noise

$\ldots, y_i, y_{i+1}, \ldots$

where the $y_i$ are independent Bernoulli r.v.'s with unknown parameter p. One particular day the boat receives the message

000101100101010100010101

Is this background noise or a cry for help?

A statistic for the null hypothesis H0: "the message is noise" is

$T(y) = \max(|y - I_1|, |y - I_2|)$

where y is the vector that contains the message, $I_1$ and $I_2$ are sequences (of the same length as the signal) with alternating 01 in $I_1$ and 10 in $I_2$, and $|y - I_k|$ counts the positions in which y and $I_k$ differ.

The total number of "ones", $S(y) = \sum y_i$, is a sufficient statistic for p. So, conditional on S(y) = 10, the vector

$Y \mid \{S(y) = 10\}$

is a random rearrangement of 10 ones and 14 zeroes, and we can do a permutation test:
> x=c(0,0,0,1,0,1,1,0,0,1,0,1,0,1,0,1,0,0,0,1,0,1,0,1)
> length(x); sum(x)
[1] 24
[1] 10
> Y1=rep(c(0,1),12)   # the template ...0101...
> Y2=rep(c(1,0),12)   # the template ...1010...
> tobs=max(sum(abs(x-Y1)),sum(abs(x-Y2)));tobs
[1] 20
> permu=function(){
+   newy=sample(x)    # random rearrangement of the 10 ones and 14 zeroes
+   t=max(sum(abs(newy-Y1)),sum(abs(newy-Y2)))
+   return(t)
+ }
> R=100000
> t=replicate(R,permu())
> unique(t)
[1] 14 16 12 18 22 20
> pval=sum(t>=tobs)/R;pval ## large T favours "signal": the message was not noise
Further examples

Suppose we have pairs

$(X_1, Y_1), \ldots, (X_n, Y_n)$

• Paired samples. If the null is that X and Y are interchangeable, once the test statistic is chosen, we interchange X and Y within each pair at random.

• Linear models, ANOVAs. Imagine we want to test whether X and Y are uncorrelated. Under H0, we can re-shuffle the Ys and would have equivalent pairs. Then choose an adequate statistic (Pearson's ρ, Kendall's τ, ...).
Tests for H0 : F = G

From independent samples

$X_1, \ldots, X_n \sim F$ and $Y_1, \ldots, Y_m \sim G,$

we want to test the null hypothesis $H_0: F = G$.

We can base our decision on the Kolmogorov–Smirnov statistic

$$D = \sup_{1 \le i \le N} |F_n(z_i) - G_m(z_i)|,$$

rejecting $H_0$ for large values of D and using a permutation test.

A Cramér–von Mises-type test can also be performed, with statistic

$$W = \frac{mn}{(m+n)^2} \left[ \sum_{i=1}^{n} (F_n(x_i) - G_m(x_i))^2 + \sum_{j=1}^{m} (F_n(y_j) - G_m(y_j))^2 \right]$$

See the chickwts example in Permutaciones.R.


A paired test

Suppose we have n pairs of observations from a paired experiment:

$(X_1, Y_1), \ldots, (X_n, Y_n)$

The null hypothesis is that X and Y are exchangeable within each pair.

• Once the appropriate statistic for the test has been chosen, we randomize the pairs:
• For each pair, we toss a fair coin; if it comes up heads we leave the pair as it is, otherwise we swap X and Y.
• We compare the value of the original statistic with the distribution obtained from the permutations.

See the shoes example in Permutaciones.R.
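The coin-tossing scheme translates directly into code: swapping X and Y within a pair just flips the sign of that pair's difference. A Python sketch (the statistic, names, and toy data are illustrative):

```python
import random

def paired_perm_test(x, y, B=999, seed=5):
    """Paired permutation test for exchangeability of X and Y within pairs.

    Statistic: mean of the within-pair differences. For each replicate a
    fair coin decides, pair by pair, whether to swap X and Y, which simply
    flips the sign of that pair's difference.
    """
    rng = random.Random(seed)
    d = [a - b for a, b in zip(x, y)]
    obs = sum(d) / len(d)
    count = 0
    for _ in range(B):
        flipped = [di if rng.random() < 0.5 else -di for di in d]
        if abs(sum(flipped) / len(flipped)) >= abs(obs):   # two-sided
            count += 1
    return (1 + count) / (B + 1)

# Consistently positive differences give a small two-sided p-value.
p = paired_perm_test([5.1, 6.0, 5.8, 6.2, 5.9, 6.1, 5.7, 6.3],
                     [4.0, 4.9, 4.7, 5.0, 4.8, 5.1, 4.6, 5.2])
```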
Correlation test

Suppose now that we have n pairs of observations

$(X_1, Y_1), \ldots, (X_n, Y_n)$

We want to test the null hypothesis H0 that X and Y are uncorrelated.
Under H0 we can shuffle the $X_i$ among themselves and would obtain equivalent pairs.
To carry out a permutation test we choose a statistic: Pearson's correlation coefficient ρ if we are looking for linear correlation, Kendall's τ, etc.

• We randomly permute the Y's, keeping the X's fixed.
• We compare the statistic observed in the original sample with the distribution obtained from the permutations.

See the aire example in Permutaciones.R.
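The recipe above, in a Python sketch with Pearson's ρ as the statistic (function names and toy data are illustrative; the course examples use R):

```python
import math
import random

def pearson_r(x, y):
    """Pearson's sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corr_perm_test(x, y, B=999, seed=11):
    """Permute the Y's with the X's fixed; two-sided p-value for rho = 0."""
    rng = random.Random(seed)
    r_obs = pearson_r(x, y)
    ys = list(y)
    count = 0
    for _ in range(B):
        rng.shuffle(ys)
        if abs(pearson_r(x, ys)) >= abs(r_obs):
            count += 1
    return (1 + count) / (B + 1)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9]   # nearly linear in x
p = corr_perm_test(x, y)
```

Replacing `pearson_r` with Kendall's τ (or any other dependence measure) changes nothing else in the procedure.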
Linear models and analysis of variance

If the residuals of a linear model or an ANOVA are not normal, we might be making wrong inferences about the model parameters and their possible interactions.
One way to avoid these problems is to use permutation tests. If the responses did not obey the model, we could permute them and would obtain the same results. This even allows us to perform the individual F-tests for each coefficient.

The lmPerm package has functions that perform these tests (exact or approximate), without our having to worry about carrying out the permutations between cells.
The functions are simply lmp() and aovp().
There is also glmPerm for generalized linear models.
See the lizard (ANOVA) and challenger (logistic regression) examples in Permutaciones.R.
Time series

Suppose we have n observations of a time series, $X_1, \ldots, X_n$.
Some tests concerning the series are easy to carry out with permutations. For example, a test for
H0: "the series is white noise", that is,

$H_0: X_i = \mu + Z_i$, with $Z_i$ iid and $E Z_i = 0$,

or a test to find the possible order of an AR(p) model.

It is enough that it be reasonable to assume that under H0 the observations are exchangeable, to choose a suitable statistic (Ljung-Box, the value of the acf, Durbin-Watson, etc.), and to proceed as usual: find the distribution of the statistic under permutations and compare the original value with the permuted distribution.
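A white-noise test with the lag-1 autocorrelation as statistic can be sketched as follows (Python; the choice of statistic and the toy series are illustrative, not from the slides):

```python
import random

def acf1(x):
    """Lag-1 sample autocorrelation."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def white_noise_perm_test(x, B=999, seed=42):
    """H0: the series is white noise, so the observations are exchangeable.

    Statistic: |lag-1 autocorrelation|; its permutation distribution under
    H0 is obtained by shuffling the series.
    """
    rng = random.Random(seed)
    obs = abs(acf1(x))
    z = list(x)
    count = 0
    for _ in range(B):
        rng.shuffle(z)
        if abs(acf1(z)) >= obs:
            count += 1
    return (1 + count) / (B + 1)

# A slowly oscillating, strongly autocorrelated toy series:
series = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4, 3, 2, 1]
p = white_noise_perm_test(series)
```

The same skeleton works with a Ljung-Box or Durbin-Watson statistic in place of `acf1`.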
