
Permutation tests

Alejandra Cabaña
An example

The amount of water (in liters) obtained from 26 seeded clouds and 26 natural clouds in the same region/time has been recorded.

Do seeded clouds produce a significantly greater amount of rain?
[Figure: histograms of the amount of water from non-seeded clouds (noseed) and from seeded clouds (seed); density vs. liters, 0–140.]
Data are non-symmetric... should we trust the t-test?
If $X_i \sim N(\mu_X, \sigma_X^2)$ and $Y_i \sim N(\mu_Y, \sigma_Y^2)$, consider the statistic

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}}$$

where $\bar{X} = \frac{1}{n_X}\sum_{i=1}^{n_X} X_i$, $\bar{Y} = \frac{1}{n_Y}\sum_{i=1}^{n_Y} Y_i$, $S_X^2 = \frac{1}{n_X-1}\sum_{i=1}^{n_X}(X_i - \bar{X})^2$ and $S_Y^2 = \frac{1}{n_Y-1}\sum_{j=1}^{n_Y}(Y_j - \bar{Y})^2$.

If $\sigma_X^2 = \sigma_Y^2$ then $T \sim t_{n_X+n_Y-2}$, but otherwise the distribution of $T$ is unknown, and there are several classical approximations. If the data are non-normal, then the test based on $T$ and its approximated distribution need not be precise¹.

¹ Efron, 1977
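As a quick numerical illustration of this statistic, here is a minimal Python sketch (the lecture's own computations are in R; the function name `welch_t` and the toy data are illustrative, not from the slides):

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample statistic: (xbar - ybar) / sqrt(Sx^2/nx + Sy^2/ny),
    using the unbiased sample variances (divisor n - 1)."""
    nx, ny = len(x), len(y)
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / nx + variance(y) / ny)

# Toy data: the first sample has the smaller mean, so T is negative.
t = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0])
```

Note that no distributional assumption enters the computation of T itself; the assumptions only matter when a t distribution is used to calibrate it.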
x → non-seeded cloud data, y → seeded cloud data

> t.test(x,y, alternative="less")

Welch Two Sample t-test
data: x and y
t = -1.9351, df = 33.858, p-value = 0.03068
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
 -Inf -1.633874
sample estimates:
mean of x mean of y
 8.200123 21.159112
We can try to symmetrize the data
[Figure: histograms of log(noseed) and log(seed), and normal Q-Q plots of log(noseed) and log(seed).]


t.test(log(x),log(y),alternative="less")

Welch Two Sample t-test
data: log(x) and log(y)
t = -2.2602, df = 49.994, p-value = 0.0141
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
 -Inf -0.261692
sample estimates:
mean of x mean of y
 1.066551  2.078881

The conclusion remains roughly the same...


The problem in the previous example is that

• the sampling distribution is not known,

• the sample sizes are moderate,

• the question is "broad": are we interested in comparing means? dispersions? whole distributions (FX = FY)?

A reasonable option in this case is to perform a permutation test.


Permutation tests were among the very first statistical tests to be developed², but they were beyond the computing capacities of the 1930s. Permutation tests give a simple way to compute the sampling distribution of any test statistic whose distribution is invariant under permutations under the null hypothesis.

² Pitman (1937/38) developed exact permutation methods consistent with the Neyman–Pearson approach for the comparison of k samples and for bivariate correlation.
Permutation tests are also known as conditional tests.

• In a non-parametric context, we condition on the observed data: under the null hypothesis, all data have been obtained independently from a single distribution F. All permutations are equally likely, so we are performing a test conditioned on the observed values, without assuming anything about F.

• In a parametric context, we can condition on any sufficient statistic (the sample, for instance).

To estimate the sampling distribution of the test statistic, we randomly shuffle the exposures to make up a large enough sample of data sets.
            X1      X2      X3      Y1      Y2      X̄       M(X)    Ȳ = M(Y)
original    58.87   39.11   18.47   82.49   79.68   38.82   39.11   81.08
shuffle 1   39.11   82.49   79.68   58.87   18.47   67.09   79.68   38.67
shuffle 2   79.68   39.11   58.87   18.47   82.49   59.22   58.87   53.27
...

If the null hypothesis were true, the shuffled data sets should all have the same distribution, and thus statistics based on the new samples should look like the ones computed from the original data; otherwise they should look different.

• The permutation distribution of a test statistic T is obtained by repeatedly rearranging the observations.

• With two or more samples, all the observations are combined into a single large sample before rearranging them.

• There are no limitations upon the test statistic.
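Since the test statistic is unrestricted, a generic routine can simply take it as an argument. A Python sketch (function names and toy data are illustrative; the slides' own code is in R):

```python
import random

def perm_test(x, y, stat, B=999, seed=1):
    """Two-sample permutation test for an arbitrary statistic `stat`.

    Returns the left-tail p-value (1 + #{stat* <= stat_obs}) / (B + 1).
    """
    rng = random.Random(seed)
    z = list(x) + list(y)        # pool the two samples
    n = len(x)
    obs = stat(x, y)
    count = 0
    for _ in range(B):
        rng.shuffle(z)           # rearrange the pooled observations
        if stat(z[:n], z[n:]) <= obs:
            count += 1
    return (1 + count) / (B + 1)

def diff_means(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

# x clearly below y, so the left-tail p-value is small.
p = perm_test([1, 2, 3, 2, 1], [8, 9, 10, 9, 8], diff_means)
```

Any other statistic (difference of medians, a trimmed mean, ...) can be passed in unchanged; only `stat` varies.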


Obtaining p-values for the 2-sample problem

Let N = n + m and Z1, ..., ZN be the combined sample obtained by concatenating the X and Y samples.

1. From the joint sample Z, select a random subset of size n. Declare that the chosen elements belong to the X sample and the remaining m to the Y sample.
2. Compute the statistic of interest for this artificial pair of samples.
3. Repeat steps (1) and (2) a large number B of times.
4. From the B values computed above, extract an approximate p-value for the observed statistic (the one calculated with the original X and Y samples).
Back to the seeded clouds example: if we test H0 : µX = µY against H1 : µX < µY, we reject H0 when X̄ − Ȳ is small enough.

muestra=c(x,y)                    # pooled sample
rep=999                           # number of permutations B
original=mean(x)-mean(y)          # observed statistic
distrib=numeric(rep)
l=length(x)+length(y)
for(i in 1:rep){
  sam=sample(muestra,l)           # random rearrangement of the pooled sample
  newx=sam[1:length(x)]           # the first n values play the role of x
  newy=sam[(length(x)+1):l]       # the rest play the role of y
  distrib[i]=mean(newx)-mean(newy)
}
pval=sum(c(original,distrib)<=original)/(rep+1); pval
[1] 0.029

The t-test was not misleading! But this approach is more sound.
[Figure: permutation distribution of the difference in means (distrib) and of the difference in medians (mediana); frequency histograms.]
The permutation distribution

Before looking at more examples, let us see why this procedure works. We observe two independent samples

$X_1, \ldots, X_n \sim F_X$ and $Y_1, \ldots, Y_m \sim F_Y.$

Let us call Z the combined sample

$Z = \{X_1, \ldots, X_n, Y_1, \ldots, Y_m\},$

indexed by $\nu = \{1, 2, \ldots, n, n+1, \ldots, n+m\} = \{1, \ldots, N\}$, and $Z^* = (X^*, Y^*)$ a permutation (re-shuffling) of the original Z; that is, if $\pi$ is a permutation of the integers $\nu$, then $Z^*_i = Z_{\pi(i)}$, and we declare that the first n elements of $Z^*$ correspond to the $X^*$ sample and the remaining m to $Y^*$.

The number of possible partitions is $\binom{N}{n}$.

Under $H_0: F_X = F_Y$, any randomly chosen $Z^*$ has probability

$$\binom{N}{n}^{-1} = \frac{n!\,m!}{N!}$$

that is, if $F_X = F_Y$ all permutations are equally likely.
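For small samples the $\binom{N}{n}$ partitions can be enumerated exactly instead of sampled. A Python sketch using the toy numbers from the shuffling table (three X values, two Y values, so $\binom{5}{3} = 10$ equally likely partitions):

```python
from itertools import combinations
from math import comb

x = [58.87, 39.11, 18.47]      # toy X sample (n = 3)
y = [82.49, 79.68]             # toy Y sample (m = 2)
z = x + y
n, N = len(x), len(z)

# Every partition of Z into an X*-part of size n and a Y*-part of size m
# has the same probability n! m! / N! under H0.
diffs = [sum(z[i] for i in idx) / n -
         sum(z[i] for i in range(N) if i not in idx) / (N - n)
         for idx in combinations(range(N), n)]
```

With real sample sizes this enumeration quickly becomes infeasible, which is exactly why random permutations are used instead.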


If $\hat\theta(X, Y) = \hat\theta(Z, \nu)$ is a statistic, then the distribution of $\hat\theta^*$ (that is, the distribution of the replicates of the statistic) is

$$F_{\hat\theta^*}(t) = P(\hat\theta^* \le t) = \binom{N}{n}^{-1} \sum_{j=1}^{\binom{N}{n}} 1_{\{\hat\theta^*_j \le t\}}$$

If large values of $\hat\theta$ favour the alternative, we reject $H_0$ whenever $\hat\theta > c^*_{1-\alpha}$, where $c^*_{1-\alpha}$ is the $1-\alpha$ quantile of the distribution of $\hat\theta^*$. Similarly if the test is left-tailed or two-sided.
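The rejection rule can be read directly off the sorted replicates. A Python sketch (here the B = 999 replicates are stand-ins drawn from a normal distribution purely for illustration; in practice they would come from the permutations):

```python
import random

rng = random.Random(7)
B, alpha = 999, 0.05

# Stand-in permutation replicates of theta-hat* (normal draws for illustration).
theta_star = sorted(rng.gauss(0, 1) for _ in range(B))

# Empirical (1 - alpha) quantile: the (1 - alpha)(B + 1)-th order statistic.
c = theta_star[int((1 - alpha) * (B + 1)) - 1]

theta_obs = 2.5
reject = theta_obs > c          # right-tailed test: reject H0 when theta-hat > c*
```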
Computing the p-value of the test

If N is large (for the seeded clouds, $N! = 52! \approx 8.065818 \times 10^{67}$), it is excessive to consider all possible permutations, and we approximate the distribution and the p-values taking only B permutations³: "at least 99 and at most 999 random permutations should suffice".

The p-value of the test can be approximated by

$$\hat p = \frac{1 + \#\{\hat\theta^{(b)} \le \hat\theta\}}{B+1} = \frac{1 + \sum_{b=1}^{B} 1_{\{\hat\theta^{(b)} \le \hat\theta\}}}{B+1}$$

(sum(distrib<original)+1)/1000
[1] 0.029

Confidence interval for the true p: p ∈ (0.02891589, 0.02902617)

³ Davison and Hinkley (1997), Bootstrap Methods and their Application, p. 159, CUP.
Tests for H0 : F = G

From independent samples

$X_1, \ldots, X_n \sim F$ and $Y_1, \ldots, Y_m \sim G,$

the general null hypothesis $H_0: F = G$ can be tested using the Kolmogorov–Smirnov statistic

$$D = \sup_{1 \le i \le N} |F_n(z_i) - G_m(z_i)|,$$

rejecting $H_0$ for large values of D, and deciding what "large" means by means of permutations.

Alternatively, the Cramér–von Mises statistic can be used:

$$W = \frac{mn}{(m+n)^2} \left[ \sum_{i=1}^{n} (F_n(x_i) - G_m(x_i))^2 + \sum_{j=1}^{m} (F_n(y_j) - G_m(y_j))^2 \right]$$
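Deciding what "large" means by permutations takes only a few lines. A Python sketch (function names and toy data are illustrative; the slides use R for the actual clouds data):

```python
import random

def ecdf(sample, t):
    """Empirical cdf of `sample` evaluated at t."""
    return sum(v <= t for v in sample) / len(sample)

def ks_stat(x, y):
    """D = sup |F_n(z) - G_m(z)|, the sup taken over the pooled points."""
    return max(abs(ecdf(x, zi) - ecdf(y, zi)) for zi in x + y)

def ks_perm_pvalue(x, y, B=499, seed=3):
    """Calibrate D by permutations of the pooled sample."""
    rng = random.Random(seed)
    pooled, n = x + y, len(x)
    d_obs = ks_stat(x, y)
    count = 0
    for _ in range(B):
        rng.shuffle(pooled)
        if ks_stat(pooled[:n], pooled[n:]) >= d_obs:   # large D favours F != G
            count += 1
    return (1 + count) / (B + 1)

# Completely separated samples: D = 1, and few shuffles reproduce it.
p = ks_perm_pvalue([0.1, 0.2, 0.3, 0.4], [1.1, 1.2, 1.3, 1.4])
```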
[Figure: permutation distribution of the Kolmogorov–Smirnov statistic (distribks); frequency histogram.]

In this case, the p-value is 0.083, which is not particularly low.

The conclusion now is that there is no evidence that the distributions are different.
Another example

A diver has agreed with the boat crew that, in case of danger, he will transmit a binary message ...01010101010101...
When the diver does not transmit, the boat receives background noise

$\ldots, y_i, y_{i+1}, \ldots$

where the $y_i$ are independent Bernoulli r.v.'s with unknown parameter p. One particular day the boat receives the message

000101100101010100010101

Is this background noise or a cry for help?

A statistic for the null hypothesis H0: "the message is noise" is

$T(y) = \max(|y - I_1|, |y - I_2|)$

where y is the vector that contains the message, $I_1$ and $I_2$ are sequences (of the same length as the signal) with alternating 01 in $I_1$ and 10 in $I_2$, and $|y - I_k|$ counts the positions in which y and $I_k$ differ.

The total number of "ones", $S(y) = \sum y_i$, is a sufficient statistic for p. So, conditional on S(y) = 10, the vector

$Y \mid \{S(y) = 10\}$

is a random rearrangement of 10 ones and 14 zeroes, and we can do a permutation test:
> x=c(0,0,0,1,0,1,1,0,0,1,0,1,0,1,0,1,0,0,0,1,0,1,0,1)
> length(x); sum(x)
[1] 24
[1] 10
> Y1=rep(c(0,1),12)   # the template ...0101...
> Y2=rep(c(1,0),12)   # the template ...1010...
> tobs=max(sum(abs(x-Y1)),sum(abs(x-Y2)));tobs
[1] 20
> permu=function(){
+   newy=sample(x)    # random rearrangement of the 10 ones and 14 zeroes
+   t=max(sum(abs(newy-Y1)),sum(abs(newy-Y2)))
+   return(t)
+ }
> R=100000
> t=replicate(R,permu())
> unique(t)
[1] 14 16 12 18 22 20
> pval=sum(t>=tobs)/R;pval ## large T favours "signal": the message was not noise
Further examples

Suppose we have pairs

$(X_1, Y_1), \ldots, (X_n, Y_n)$

• Paired samples. If the null is that X and Y are interchangeable, once the test statistic is chosen, we interchange X and Y within each pair at random.

• Linear models, ANOVAs. Imagine we want to test whether X and Y are uncorrelated. Under H0, we can re-shuffle the Ys and would have equivalent pairs. Then choose an adequate statistic (Pearson's ρ, Kendall's τ, ...).
Tests for H0 : F = G

From independent samples

$X_1, \ldots, X_n \sim F$ and $Y_1, \ldots, Y_m \sim G,$

we want to test the null hypothesis $H_0: F = G$.

We can base our decision on the Kolmogorov–Smirnov statistic

$$D = \sup_{1 \le i \le N} |F_n(z_i) - G_m(z_i)|,$$

rejecting $H_0$ for large values of D and using a permutation test.

A Cramér–von Mises-type test can also be performed, with statistic

$$W = \frac{mn}{(m+n)^2} \left[ \sum_{i=1}^{n} (F_n(x_i) - G_m(x_i))^2 + \sum_{j=1}^{m} (F_n(y_j) - G_m(y_j))^2 \right]$$

See the chickwts example in Permutaciones.R.


A paired test

Suppose we have n pairs of observations from a paired experiment:

$(X_1, Y_1), \ldots, (X_n, Y_n)$

The null hypothesis is that X and Y are exchangeable within each pair.

• Once the appropriate statistic for the test has been chosen, we randomize the pairs:
• For each pair, we toss a fair coin; if it comes up heads we leave the pair as it is, otherwise we swap X and Y.
• We compare the value of the original statistic with the distribution obtained from the permutations.

See the shoes example in Permutaciones.R.
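The coin-tossing scheme translates directly into code: swapping X and Y within a pair just flips the sign of that pair's difference. A Python sketch (the statistic, names, and toy data are illustrative):

```python
import random

def paired_perm_test(x, y, B=999, seed=5):
    """Paired permutation test for exchangeability of X and Y within pairs.

    Statistic: mean of the within-pair differences. For each replicate a
    fair coin decides, pair by pair, whether to swap X and Y, which simply
    flips the sign of that pair's difference.
    """
    rng = random.Random(seed)
    d = [a - b for a, b in zip(x, y)]
    obs = sum(d) / len(d)
    count = 0
    for _ in range(B):
        flipped = [di if rng.random() < 0.5 else -di for di in d]
        if abs(sum(flipped) / len(flipped)) >= abs(obs):   # two-sided
            count += 1
    return (1 + count) / (B + 1)

# Consistently positive differences give a small two-sided p-value.
p = paired_perm_test([5.1, 6.0, 5.8, 6.2, 5.9, 6.1, 5.7, 6.3],
                     [4.0, 4.9, 4.7, 5.0, 4.8, 5.1, 4.6, 5.2])
```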
Correlation test

Suppose now that we have n pairs of observations

$(X_1, Y_1), \ldots, (X_n, Y_n)$

We want to test the null hypothesis H0 that X and Y are uncorrelated.
Under H0 we can shuffle the $X_i$ among themselves and would obtain equivalent pairs.
To carry out a permutation test we choose a statistic: Pearson's correlation coefficient ρ if we are looking for linear correlation, Kendall's τ, etc.

• We randomly permute the Y's, keeping the X's fixed.
• We compare the statistic observed in the original sample with the distribution obtained from the permutations.

See the aire example in Permutaciones.R.
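The recipe above, in a Python sketch with Pearson's ρ as the statistic (function names and toy data are illustrative; the course examples use R):

```python
import math
import random

def pearson_r(x, y):
    """Pearson's sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corr_perm_test(x, y, B=999, seed=11):
    """Permute the Y's with the X's fixed; two-sided p-value for rho = 0."""
    rng = random.Random(seed)
    r_obs = pearson_r(x, y)
    ys = list(y)
    count = 0
    for _ in range(B):
        rng.shuffle(ys)
        if abs(pearson_r(x, ys)) >= abs(r_obs):
            count += 1
    return (1 + count) / (B + 1)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9]   # nearly linear in x
p = corr_perm_test(x, y)
```

Replacing `pearson_r` with Kendall's τ (or any other dependence measure) changes nothing else in the procedure.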
Linear models and analysis of variance

If the residuals of a linear model or an ANOVA are not normal, we might be making wrong inferences about the model parameters and their possible interactions.
One way to avoid these problems is to use permutation tests. If the responses did not obey the model, we could permute them and would obtain the same results. This even allows us to perform the individual F-tests for each coefficient.

The lmPerm package has functions that perform these tests (exact or approximate), without our having to worry about carrying out the permutations between cells.
The functions are simply lmp() and aovp().
There is also glmPerm for generalized linear models.
See the lizard (ANOVA) and challenger (logistic regression) examples in Permutaciones.R.
Time series

Suppose we have n observations of a time series, $X_1, \ldots, X_n$.
Some tests concerning the series are easy to carry out with permutations. For example, a test for
H0: "the series is white noise", that is,

$H_0: X_i = \mu + Z_i$, with $Z_i$ iid and $E Z_i = 0$,

or a test to find the possible order of an AR(p) model.

It is enough that it be reasonable to assume that under H0 the observations are exchangeable, to choose a suitable statistic (Ljung-Box, the value of the acf, Durbin-Watson, etc.), and to proceed as usual: find the distribution of the statistic under permutations and compare the original value with the permuted distribution.
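A white-noise test with the lag-1 autocorrelation as statistic can be sketched as follows (Python; the choice of statistic and the toy series are illustrative, not from the slides):

```python
import random

def acf1(x):
    """Lag-1 sample autocorrelation."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def white_noise_perm_test(x, B=999, seed=42):
    """H0: the series is white noise, so the observations are exchangeable.

    Statistic: |lag-1 autocorrelation|; its permutation distribution under
    H0 is obtained by shuffling the series.
    """
    rng = random.Random(seed)
    obs = abs(acf1(x))
    z = list(x)
    count = 0
    for _ in range(B):
        rng.shuffle(z)
        if abs(acf1(z)) >= obs:
            count += 1
    return (1 + count) / (B + 1)

# A slowly oscillating, strongly autocorrelated toy series:
series = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4, 3, 2, 1]
p = white_noise_perm_test(series)
```

The same skeleton works with a Ljung-Box or Durbin-Watson statistic in place of `acf1`.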
