Cosmological Parameter Estimation with Sequential Linear Simulation-based Inference
W. J. Handley
Kavli Institute for Cosmology, University of Cambridge,
Madingley Road, Cambridge, CB3 0HA
Key words: cosmology, Bayesian data analysis, simulation-based inference, CMB power spectrum
In Section III, several toy models are investigated using LSBI, and in Section III B the method is applied to parameter estimation for the Cosmic Microwave Background (CMB) power spectrum.

II. THEORY

A. The Linear Approximation

Let us consider a d-dimensional dataset D described by a model M with n parameters θ = {θi}. We assume a Gaussian prior θ|M ∼ N(µ, Σ) with known mean and covariance. The likelihood LD(θ) for an arbitrary model is generally intractable, but in this paper we approximate it as a homoscedastic Gaussian (thus neglecting any parameter dependence of the covariance matrix),

D|θ, M ∼ N(M(θ), C).    (1)

In general, one may obtain a numerical estimation of the likelihood through the simulator by learning the probability density of the pairs {θ^(i), D_sim^(i)} for a sufficient number of simulator runs. Here, we follow this strategy, except that the assumption of linearity in Eq. 2 avoids the need for machine-learning tools. Such a linear analysis applied to SBI is not available in the literature, although some recent works are similar, such as SELFI [20] or MOPED [21].

We first draw k parameter vectors {θ^(i)}; since we are estimating the likelihood, we may draw these from an arbitrary distribution that does not need to be the model prior π(θ). Then, for each θ^(i) the simulator is run to produce the corresponding data vector D^(i). We define the first- and second-order statistics

\bar{\theta} = \frac{1}{k}\sum_{i=1}^{k}\theta^{(i)}, \qquad \bar{D} = \frac{1}{k}\sum_{i=1}^{k} D^{(i)},    (8)
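As a concrete illustration of this step, the sketch below draws k parameter vectors, runs a simulator on each, and accumulates the sample means of Eq. 8 together with the sample covariances used later in Appendix A. The simulator here is a stand-in linear-Gaussian toy, not the lsbi package API.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 4, 50, 2500                      # parameters, data dimension, simulations

m_true = rng.standard_normal(d)
M_true = rng.standard_normal((d, n)) / d

def simulator(theta):
    """Stand-in for a black-box simulator returning a d-dimensional data vector."""
    return m_true + M_true @ theta + rng.standard_normal(d)

# draw parameters from an arbitrary sampling distribution (here the prior N(0, I_n))
thetas = rng.standard_normal((k, n))
datas = np.array([simulator(t) for t in thetas])

# first-order statistics (Eq. 8)
theta_bar = thetas.mean(axis=0)
D_bar = datas.mean(axis=0)

# second-order statistics (cf. Eqs. A1-A2)
dth, dD = thetas - theta_bar, datas - D_bar
Theta = dth.T @ dth / k                    # n x n parameter covariance
Delta = dD.T @ dD / k                      # d x d data covariance
Psi = dD.T @ dth / k                       # d x n cross-covariance
```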
The average can be computed by drawing N exact samples (m^(I), M^(I), C^(I)) from Eqs. 13–15, where N is large enough to guarantee convergence. For N > 1, the resulting posterior is a Gaussian mixture of N components. Since each sample is independent, the calculation can be parallelized, allowing a large N at little additional computational cost.

C. Sequential LSBI

As discussed in Section II A, the linear approximation is only applicable within a localized region surrounding the fiducial point θ0. Given that the prior distribution is typically broad, this condition is often not satisfied. Consequently, when dealing with non-linear models, LSBI truncates the model to its linear expansion, thereby computing the posterior distribution corresponding to an ‘effective’ linear likelihood function. This truncation results in a less constraining model, leading to a broader ‘effective’ posterior distribution relative to the ‘true’ posterior.

However, since the simulation parameters {θ^(i)} can be drawn from any non-singular distribution, independent of the prior, LSBI can be performed on a set of samples generated by simulations that are proximal to the observed data, i.e., a narrow distribution with θ0 near the true parameter values. A natural choice for this distribution is the ‘effective’ LSBI posterior. This leads to the concept of Sequential LSBI, wherein LSBI is iteratively applied to simulation samples drawn from the posterior distribution of the previous iteration, with the initial iteration corresponding to the prior distribution.

It is worth noting that this method suffers from two disadvantages compared to plain LSBI. Firstly, the algorithm is no longer amortized, as subsequent iterations depend on the choice of D_obs. Secondly, as the sampling distribution becomes narrower, Θ becomes smaller, resulting in a broader distribution for M. Thus, the value of N may need to be increased accordingly.

The evidence may be evaluated similarly. Thus, if the procedure is repeated for a different model M′, the Bayes ratio between the two models may be calculated,

B = ⟨p(D_obs|M)⟩_{m,M,C} / ⟨p(D_obs|M′)⟩_{m′,M′,C′}.    (20)

Nevertheless, this calculation is inefficient for large datasets, so a data compression algorithm is proposed in Appendix B, although it is not investigated further in this paper.

III. RESULTS

The LSBI procedure rests on two assumptions:

• the model M(θ) is approximated as a linear function of the parameters,
• the likelihood LD(θ) can be accurately approximated by a homoscedastic Gaussian distribution (Eq. 1).

In this section, we test the resilience of LSBI against deviations from these assumptions by applying the procedure to several toy models, as well as the CMB temperature power spectrum. These toy models were implemented with the help of the Python package lsbi, currently under development by W. J. Handley, and tools from the scipy library. To simulate the cosmological power spectrum data, we use the cmb_tt neural emulator from CosmoPowerJAX [23, 24].

In addition to LSBI, the parameter posteriors are also calculated via nested sampling with dynesty [25–27] for comparison. The plots were made with the software getdist [28]. Unless otherwise stated, the calculations in this section use broad uniform priors.

A. Toy Models

For simplicity, the prior on the parameters is a standard normal θ ∼ N(0, I_n), and the samples for the simulations are taken directly from this prior. To generate the model, we draw the entries of m from a standard normal, whereas the entries of M have mean 0 and standard deviation 1/d. The covariance C is drawn from W⁻¹(σ²I_d, d + 2), where σ = 0.5. The number of samples N taken from the posterior distributions of m, M, and C depends on the dataset’s dimensionality d; generally, we choose the highest value allowing for manageable computation time.

Our starting point is a 50-dimensional dataset with a quadratic model of n = 4 parameters,

M(θ) = m + Mθ + θ^T Q θ,    (21)

and Gaussian likelihood, where m and M are as above, and Q is an n × d × n matrix with entries drawn from N(0, 1/d). The noise is now Gaussian with covariance C. At each round, we perform LSBI with k = 2500 simulations to obtain an estimate for the posterior distribution, where the sampling distribution of the parameter sets {θ^(i)} is the posterior of the previous round. The posterior is calculated for a set of ‘observed’ data, which are determined by applying the model and noise to a set of ‘real’ parameters θ∗. We also calculate the KL divergence (DKL) between the prior and each posterior, which will help us determine the number of rounds of LSBI required to obtain convergent results. The posterior distribution is also computed using nested sampling for comparison.
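To make this procedure concrete, the following sketch implements a simplified version of the loop just described: the quadratic simulator of Eq. 21, k = 2500 simulations per round sampled from the previous round's posterior, and a Gaussian posterior update against the original prior. It is a sketch only: it replaces the paper's averaging over N posterior draws of (m, M, C) with point estimates of the effective linear model, so it does not propagate hyper-parameter uncertainty. The KL divergence of each posterior from the prior is also reported.

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(1)
n, d, k, sigma = 4, 50, 2500, 0.5

# toy model of Eq. 21: M(theta) = m + M theta + theta^T Q theta, noise covariance C
m = rng.standard_normal(d)
M = rng.normal(0.0, 1.0 / d, size=(d, n))
Q = rng.normal(0.0, np.sqrt(1.0 / d), size=(n, d, n))
C = invwishart(df=d + 2, scale=sigma**2 * np.eye(d)).rvs(random_state=rng)
L = np.linalg.cholesky(C)

def simulate(theta):
    mean = m + M @ theta + np.einsum("i,ijk,k->j", theta, Q, theta)
    return mean + L @ rng.standard_normal(d)

def gaussian_kl(mu_p, Sig_p, mu_q, Sig_q):
    """KL(P || Q) between multivariate Gaussians P = N(mu_p, Sig_p), Q = N(mu_q, Sig_q)."""
    Sq_inv = np.linalg.inv(Sig_q)
    diff = mu_q - mu_p
    return 0.5 * (np.trace(Sq_inv @ Sig_p) + diff @ Sq_inv @ diff - len(mu_p)
                  + np.log(np.linalg.det(Sig_q) / np.linalg.det(Sig_p)))

theta_star = rng.standard_normal(n)              # 'real' parameters
D_obs = simulate(theta_star)                     # 'observed' data

mu0, Sig0 = np.zeros(n), np.eye(n)               # standard-normal prior
mu_post, Sig_post = mu0.copy(), Sig0.copy()

for rnd in range(1, 6):
    # sampling distribution = posterior of the previous round (prior in round 1)
    thetas = rng.multivariate_normal(mu_post, Sig_post, size=k)
    datas = np.array([simulate(t) for t in thetas])

    # effective linear model fitted to the simulations
    tb, Db = thetas.mean(0), datas.mean(0)
    dth, dD = thetas - tb, datas - Db
    Theta = dth.T @ dth / k
    Psi = dD.T @ dth / k
    M_hat = Psi @ np.linalg.inv(Theta)
    m_hat = Db - M_hat @ tb
    C_hat = dD.T @ dD / k - M_hat @ Psi.T

    # Gaussian posterior for the effective linear-Gaussian likelihood and the original prior
    Cinv = np.linalg.inv(C_hat)
    Sig_post = np.linalg.inv(np.linalg.inv(Sig0) + M_hat.T @ Cinv @ M_hat)
    mu_post = Sig_post @ (np.linalg.inv(Sig0) @ mu0 + M_hat.T @ Cinv @ (D_obs - m_hat))

    print(f"round {rnd}: D_KL(post || prior) = {gaussian_kl(mu_post, Sig_post, mu0, Sig0):.2f}")
```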
Figure 1 illustrates the outcomes of these simulations. The first iteration of LSBI indeed produces an excessively broad posterior, which subsequent iterations rapidly improve upon. Figure 2 confirms that after four iterations, the Kullback-Leibler divergence between the prior and the LSBI posterior converges to that calculated via nested sampling, with no appreciable discrepancy, as expected for Gaussian noise.

FIG. 1. Prior and posterior distributions on the parameters for a 50-dimensional dataset described by a non-linear 4-parameter model with Gaussian error. Each round corresponds to the output of LSBI after k = 2500 simulations, where the sampling distribution of the parameter sets {θ^(i)} is the posterior of the previous round. The result of nested sampling is also shown. The dashed lines indicate the values of the ‘real’ parameters θ∗.

FIG. 2. DKL between the prior and posterior for each round of Sequential LSBI for the data displayed in Figure 1. The black line corresponds to the value computed via nested sampling; the estimated error is also shown as a gray band.

In addition, we test the performance of LSBI for non-Gaussian noise shapes. The cases considered are uniform, Student-t with d + 2 degrees of freedom, and asymmetric Laplacian noise. The model is also given by Eq. 21, and the uncertainty in these distributions is defined directly in terms of the model covariance C. The posterior distribution is also computed using nested sampling for the Laplacian and Student-t cases, whereas that of the uniform likelihood is obtained through Approximate Bayesian Computation (rejection ABC).

The one- and two-dimensional LSBI posteriors for the models with non-Gaussian error are shown in Figures 3 and 4. The results demonstrate that the posteriors converge to a stable solution after approximately 4 rounds of sequential LSBI. Furthermore, the final DKL between the prior and posterior distributions approaches the values obtained using the nested sampling / rejection ABC methods. Nevertheless, the distributions show some discrepancy, illustrating the fact that non-Gaussian noise can affect the accuracy of LSBI. The results are less satisfactory for Laplacian noise, as the DKL does not converge to a value within the error bars of the nested-sampling estimation. On the other hand, the lower value of the DKL estimated via rejection ABC for uniform noise, compared to the LSBI values after round 3, is probably due to the inaccuracy of ABC as a posterior estimation method.

We note that the distributions considered here have well-defined first and second moments, so they can be approximated by a Gaussian. Although not considered in this paper, there exist distributions with an undefined covariance, such as the Cauchy distribution (Student-t with one degree of freedom). In these cases, it has been checked that LSBI fails to predict a posterior, instead returning the original prior.

B. The CMB Power Spectrum

In this section, we test the performance of LSBI on a pseudo-realistic cosmological dataset. In the absence of generative Planck likelihoods, we produce the simulations through CosmoPowerJAX [23, 24], a library of machine-learning-based inference tools and emulators. In particular, we use the cmb_tt probe, which takes as input the six ΛCDM parameters: the Hubble constant H0, the densities of baryonic matter Ωb h² and cold dark matter Ωc h², where h = H0/100 km s⁻¹ Mpc⁻¹, the re-ionization optical depth τ, and the amplitude As and slope ns of the primordial power spectrum.
FIG. 3. Prior and posterior distributions on the parameters for a 50-dimensional dataset described by a linear 4-parameter model and non-Gaussian error with σ ≈ 0.5. The posteriors are computed from k = 106, 500, 2500, and 10000 samples drawn from the simulated likelihood. The dashed lines indicate the values of the underlying parameters θ∗; (left) uniform noise; (centre) Student-t noise; (right) asymmetric Laplacian noise. The posterior distribution is also computed using nested sampling for the Laplacian and Student-t cases, whereas that of the uniform likelihood is obtained through an Approximate Bayesian Computation (rejection ABC).
FIG. 4. Kullback-Leibler divergence between the prior and posterior on the parameters as a function of the number of simulations; (left) uniform noise; (center) Student-t noise; (right) asymmetric Laplacian noise. The black line corresponds to the value computed via nested sampling / rejection ABC; the estimated error is also shown as a gray band.
The output is the predicted CMB temperature power spectrum

C_{\ell} = \frac{1}{2\ell+1}\sum_{m=-\ell}^{\ell}|a_{\ell,m}|^{2}, \qquad 2 \leq \ell \leq 2058,    (22)

where a_{ℓ,m} are the coefficients of the harmonic expansion of the CMB signal. To these data, we add the standard scaled χ² noise,

\frac{2\ell+1}{\bar{C}_{\ell}+N_{\ell}}\, C_{\ell} \sim \chi^{2}(2\ell+1).    (23)
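For concreteness, this data-generation step can be sketched as below. The emulate_cmb_tt function is only a placeholder for the CosmoPowerJAX cmb_tt call (its exact interface is not reproduced here), and N_ell stands for an optional instrumental noise spectrum; the χ² sampling follows Eq. 23.

```python
import numpy as np

rng = np.random.default_rng(2)
ells = np.arange(2, 2059)                  # 2 <= ell <= 2058, as in Eq. 22

def emulate_cmb_tt(params):
    """Placeholder for the cmb_tt emulator (e.g. CosmoPowerJAX); should return C_ell on `ells`."""
    raise NotImplementedError("substitute the actual emulator call here")

def add_scaled_chi2_noise(C_ell, N_ell=0.0):
    """Draw a noisy spectrum according to Eq. 23: (2l+1) C_l / (C_l + N_l) ~ chi2(2l+1)."""
    dof = 2 * ells + 1
    return (C_ell + N_ell) * rng.chisquare(dof) / dof

# theta_star = ...                         # known 'real' ΛCDM parameters (cf. Appendix C)
# C_obs = add_scaled_chi2_noise(emulate_cmb_tt(theta_star))
```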
We apply several rounds of Sequential LSBI, each drawing k = 10⁴ simulations from the emulator, but keep N = 100, since a larger number is computationally unmanageable without parallelization. The parameter samples are drawn from the prior displayed in Eqs. C1 and C2 and, as before, the observed data are generated by running the simulator on a known set of parameters θ∗. The posterior is also obtained by nested sampling with dynesty.

The output of this calculation is shown in Figs. 6 and 7. The first figure displays the prior and rounds 1 and 2, while the second shows rounds 3 to 5; in both cases the nested sampling result is shown. It can be noted by eye that the posterior coincides well with the result of nested sampling after four to five rounds of LSBI. This suggests that, although the CMB power spectrum is not well approximated by a linear model at first instance, sequential LSBI succeeds at yielding a narrow sampling distribution about θ∗, thus iteratively approximating the correct posterior.

Figure 5 displays the evolution of the KL divergence up to 8 rounds of LSBI. This result provides further evidence that sequential LSBI converges after O(1) rounds, thus keeping the total number of simulations within O(10⁴). Nevertheless, we note that the DKL is slightly overestimated, suggesting that the LSBI posterior is slightly overconfident relative to the true distribution. Reproducing this calculation with smaller values of N, we have noticed that the overconfidence increases as N is decreased. Therefore, as discussed in Sections II B and II C,
FIG. 5. DKL between the prior and posterior for each round of Sequential LSBI for the simulated CMB data, displayed in Figures 6 and 7. The black line corresponds to the value computed via nested sampling; the estimated error of the nested sampling is also shown as a gray band.

IV. CONCLUSION

V. MATERIALS

The source code for the data and plots shown in this paper can be found at https://github.com/ngm29/astro-lsbi/tree/main.
FIG. 6. The plot displays the two-dimensional posterior distributions given by the first two rounds of sequential LSBI, where each round corresponds to the output of LSBI after k = 10⁴ simulations. The prior distribution and the result of nested sampling on the dataset (labeled "Ground Truth") are also shown. The dashed lines indicate the values of the ‘real’ parameters θ∗.
FIG. 7. The plot displays the two-dimensional posterior distributions given by rounds three through five of sequential LSBI, together with the result of nested sampling on the dataset (labeled "Ground Truth"). The dashed lines indicate the values of the ‘real’ parameters θ∗.
Appendix A

Consider a simulator S_M which emulates a model M that can be approximated linearly. We do not know the values of the hyper-parameters m, M, and C, but we may infer them by performing k independent simulations S = {(D_i, θ_i)}, where the {θ_i} may be drawn from an arbitrary distribution. Defining the sample covariances and means for the data and parameters,

\Theta = \frac{1}{k}\sum_{i}(\theta_i - \bar\theta)(\theta_i - \bar\theta)^{T}, \qquad \Delta = \frac{1}{k}\sum_{i}(D_i - \bar{D})(D_i - \bar{D})^{T},    (A1)

\Psi = \frac{1}{k}\sum_{i}(D_i - \bar{D})(\theta_i - \bar\theta)^{T}, \qquad \bar{D} = \frac{1}{k}\sum_{i} D_i, \qquad \bar\theta = \frac{1}{k}\sum_{i}\theta_i,    (A2)
then, after some algebra, the joint likelihood for the simulations can be written as

\log p(\{D_i\}\,|\,\{\theta_i\}, m, M, C) \equiv \sum_{i}\log p(D_i\,|\,\theta_i, m, M, C)
= -\frac{k}{2}\log|2\pi C| - \frac{1}{2}\sum_{i}(D_i - m - M\theta_i)^{T} C^{-1}(D_i - m - M\theta_i)
= -\frac{k}{2}\log|2\pi C| - \frac{1}{2}\big[m - (\bar{D} - M\bar\theta)\big]^{T}(C/k)^{-1}\big[m - (\bar{D} - M\bar\theta)\big]
\quad - \frac{k}{2}\,\mathrm{tr}\big[(M - \Psi\Theta^{-1})\,\Theta\,(M - \Psi\Theta^{-1})^{T} C^{-1}\big]
\quad - \frac{k}{2}\,\mathrm{tr}\big[(\Delta - \Psi\Theta^{-1}\Psi^{T})\,C^{-1}\big].
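Because the decomposition above is easy to get wrong, a quick numerical check is worthwhile. The sketch below compares the direct sum of Gaussian log-densities with the decomposed form on random data; note that it writes out the M-dependent trace term explicitly, which is required for the identity to hold.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
n, d, k = 3, 5, 200
m = rng.standard_normal(d)
M = rng.standard_normal((d, n))
A = rng.standard_normal((d, d))
C = A @ A.T + d * np.eye(d)

thetas = rng.standard_normal((k, n))
datas = np.array([rng.multivariate_normal(m + M @ t, C) for t in thetas])

tb, Db = thetas.mean(0), datas.mean(0)
dth, dD = thetas - tb, datas - Db
Theta, Delta, Psi = dth.T @ dth / k, dD.T @ dD / k, dD.T @ dth / k

# left-hand side: direct sum of Gaussian log-densities
lhs = sum(multivariate_normal(m + M @ t, C).logpdf(D) for t, D in zip(thetas, datas))

# right-hand side: the decomposed form
Cinv = np.linalg.inv(C)
r = m - (Db - M @ tb)
R = M - Psi @ np.linalg.inv(Theta)
rhs = (-0.5 * k * np.log(np.linalg.det(2 * np.pi * C))
       - 0.5 * k * r @ Cinv @ r
       - 0.5 * k * np.trace(R @ Theta @ R.T @ Cinv)
       - 0.5 * k * np.trace((Delta - Psi @ np.linalg.inv(Theta) @ Psi.T) @ Cinv))

assert np.isclose(lhs, rhs), (lhs, rhs)
```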
We choose the conjugate prior m|M, C, {θ^(i)} ∼ N(0, C), giving the posterior

p(m\,|\,M, C, S) = \frac{1}{\sqrt{(2\pi)^{d}\,|C/(k+1)|}}\exp\left\{-\frac{1}{2}\Big[m - \tfrac{k}{k+1}(\bar{D} - M\bar\theta)\Big]^{T}\big[C/(k+1)\big]^{-1}\Big[m - \tfrac{k}{k+1}(\bar{D} - M\bar\theta)\Big]\right\}.    (A3)
Hence, we find a new distribution, with m integrated out (the running evidence), which contains the trace term

-\frac{k}{2}\,\mathrm{tr}\big[(\widetilde\Delta - \widetilde\Psi\widetilde\Theta^{-1}\widetilde\Psi^{T})\,C^{-1}\big],

where

\widetilde\Theta \equiv \Theta + \tfrac{1}{k+1}\,\bar\theta\bar\theta^{T}, \qquad \widetilde\Psi \equiv \Psi + \tfrac{1}{k+1}\,\bar{D}\bar\theta^{T}, \qquad \widetilde\Delta \equiv \Delta + \tfrac{1}{k+1}\,\bar{D}\bar{D}^{T}.    (A4)
We choose the conjugate prior M|C, {θ^(i)} ∼ N(0, C, Θ⁻¹), giving the posterior

p(M\,|\,C, S) = \frac{1}{\sqrt{(2\pi)^{nd}\,|C/k|^{n}\,|\Theta_{*}^{-1}|^{d}}}\exp\left\{-\frac{1}{2}\,\mathrm{tr}\Big[\Theta_{*}\big(M - \widetilde\Psi\widetilde\Theta^{-1}\big)^{T}(C/k)^{-1}\big(M - \widetilde\Psi\widetilde\Theta^{-1}\big)\Big]\right\},    (A5)
where

\Theta_{*} \equiv \widetilde\Theta + \tfrac{1}{k}\,\Theta = \tfrac{k+1}{k}\,\Theta + \tfrac{1}{k+1}\,\bar\theta\bar\theta^{T}.    (A6)
p(C\,|\,S) = \frac{p(\{D_i\}\,|\,\{\theta_i\}, C)}{p(\{D_i\}\,|\,\{\theta_i\})}\; p\big(C\,|\,\{\theta^{(i)}\}\big),

giving

p(C\,|\,S) = \frac{|C|^{-(\nu+d+1)/2}}{N(S)}\exp\left\{-\frac{1}{2}\,\mathrm{tr}\Big[\Big(k\big[\widetilde\Delta - \widetilde\Psi\big(\Theta_{*}^{-1} + \widetilde\Theta^{-1} - \Theta^{-1}\big)\widetilde\Psi^{T}\big] + C_{0}\Big)C^{-1}\Big]\right\}.    (A7)
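The conditional posteriors (A3), (A5), and (A7) can be sampled in sequence (C, then M|C, then m|M,C), which is how one exact draw (m^(I), M^(I), C^(I)) referred to in the main text could be produced. The sketch below follows the reconstructed expressions above and additionally assumes the degrees-of-freedom update ν = ν0 + k, which should be checked against the full derivation; it is illustrative rather than a reference implementation.

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal, multivariate_normal

def sample_hyperparameters(Theta, Delta, Psi, theta_bar, D_bar, k, C0, nu0, rng):
    """One joint draw (m, M, C) from the conditional posteriors (A7), (A5), (A3).

    Assumes nu = nu0 + k for the inverse-Wishart degrees of freedom (not stated above).
    """
    # tilde and starred statistics of Eqs. (A4) and (A6)
    Theta_t = Theta + np.outer(theta_bar, theta_bar) / (k + 1)
    Psi_t = Psi + np.outer(D_bar, theta_bar) / (k + 1)
    Delta_t = Delta + np.outer(D_bar, D_bar) / (k + 1)
    Theta_s = Theta_t + Theta / k

    # C | S  ~  inverse-Wishart, Eq. (A7)
    inner = np.linalg.inv(Theta_s) + np.linalg.inv(Theta_t) - np.linalg.inv(Theta)
    scale = k * (Delta_t - Psi_t @ inner @ Psi_t.T) + C0
    C = invwishart(df=nu0 + k, scale=scale).rvs(random_state=rng)

    # M | C, S  ~  matrix normal, Eq. (A5)
    M = matrix_normal(mean=Psi_t @ np.linalg.inv(Theta_t),
                      rowcov=C / k, colcov=np.linalg.inv(Theta_s)).rvs(random_state=rng)

    # m | M, C, S  ~  Gaussian, Eq. (A3)
    m = multivariate_normal(k / (k + 1) * (D_bar - M @ theta_bar),
                            C / (k + 1)).rvs(random_state=rng)
    return m, M, C
```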
\log p(\{D_i\}\,|\,\{\theta_i\}) = \log\left[\frac{p(\{D_i\}\,|\,\{\theta_i\}, C)}{p(C\,|\,S)}\, p(C)\right]
= -\frac{kd}{2}\log\pi - \frac{d}{2}\log(k+1) - \frac{dn}{2}\log k + \log\Big\{\Gamma_{d}\big[\tfrac{\nu}{2}\big]\big/\Gamma_{d}\big[\tfrac{\nu_{0}}{2}\big]\Big\}
\quad + \frac{d}{2}\log|\widetilde\Theta^{-1}\Theta| - \frac{1}{2}\log\Big\{\big|k\big[\widetilde\Delta - \widetilde\Psi\big(\Theta_{*}^{-1} + \widetilde\Theta^{-1} - \Theta^{-1}\big)\widetilde\Psi^{T}\big] + C_{0}\big|^{\nu}\big/\,|C_{0}|^{\nu_{0}}\Big\}.
Finally, we can compute the total evidence for the simulations, where we assume that the parameter samples have been drawn from a Gaussian, θ^(i) ∼ N(θ̄, Θ):

\log p(S) = -\frac{kd}{2}\log\pi - \frac{d}{2}\log(k+1) - \frac{dn}{2}\log k + \log\Big\{\Gamma_{d}\big[\tfrac{\nu}{2}\big]\big/\Gamma_{d}\big[\tfrac{\nu_{0}}{2}\big]\Big\} - \frac{nk}{2}
\quad + \frac{d}{2}\log|\widetilde\Theta^{-1}\Theta| - \frac{k}{2}\log|2\pi\Theta| - \frac{1}{2}\log\Big\{\big|k\big[\widetilde\Delta - \widetilde\Psi\big(\Theta_{*}^{-1} + \widetilde\Theta^{-1} - \Theta^{-1}\big)\widetilde\Psi^{T}\big] + C_{0}\big|^{\nu}\big/\,|C_{0}|^{\nu_{0}}\Big\}.
Appendix B: Data Compression

If the implicit likelihood is inferred for a different model M′, the Bayes ratio between the two models may be calculated,

B = ⟨p(D_obs|M)⟩_{m,M,C} / ⟨p(D_obs|M′)⟩_{m′,M′,C′}.    (B1)

However, this calculation requires a d × d matrix to be inverted (see Eq. 6), which scales as O(d^α × N), where 2 < α ⩽ 3 depending on the algorithm used. To increase the computational efficiency, Alsing et al. [29, 30] and Heavens et al. [21] remark that for a homoscedastic Gaussian likelihood, the data may be mapped into a set of n summary statistics via the linear compression

D \mapsto M^{T} C^{-1}(D - m)    (B2)
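A minimal sketch of this compression: with a single estimate (or draw) of (m, M, C⁻¹), each d-dimensional data vector is mapped to n summaries, and the inverse covariance is computed only once.

```python
import numpy as np

def compress(D, m, M, C_inv):
    """Linear compression of Eq. (B2): d-dimensional data -> n summaries."""
    return M.T @ C_inv @ (D - m)

# C_inv = np.linalg.inv(C)                          # the one-time inversion
# summaries = np.array([compress(D_i, m, M, C_inv) for D_i in datas])
```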
without information loss. In our case, the Gaussian likelihood is an approximation, so we may only claim lossless compression to the extent that the approximation is good. The advantage of this method is that C⁻¹ may be drawn directly from the distribution

and thus, save for a one-time inversion of the scale matrix, the complexity of the compression is no more than O(n × d² × N).

We propose a slightly different data compression scheme,

The matrix Σ + Γ is only n × n, so if we assume that n ≪ d, this represents a potential reduction in computational complexity.
Appendix C: Priors
Prior for the ΛCDM parameters, θ_CMB = (Ωb h², Ωc h², τ, ln 10¹⁰As, ns, H0):

µ_CMB = (2.22 × 10⁻², 0.120, 6.66 × 10⁻², 3.05, 9.64 × 10⁻¹, 67.3),    (C1)

Σ_CMB = diag(1.05 × 10⁻³, 8.28 × 10⁻³, 3.47 × 10⁻², 1.47 × 10⁻¹, 2.64 × 10⁻², 3.38).    (C2)
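For reference, the prior of Eqs. C1-C2 can be written directly as arrays (ordering as in θ_CMB above, with Σ_CMB treated as the prior covariance); the sketch below draws parameter samples from it.

```python
import numpy as np

rng = np.random.default_rng(4)

# (Omega_b h^2, Omega_c h^2, tau, ln(10^10 A_s), n_s, H_0)
mu_cmb = np.array([2.22e-2, 0.120, 6.66e-2, 3.05, 0.964, 67.3])
Sigma_cmb = np.diag([1.05e-3, 8.28e-3, 3.47e-2, 1.47e-1, 2.64e-2, 3.38])

theta_samples = rng.multivariate_normal(mu_cmb, Sigma_cmb, size=10_000)
```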
[1] K. Cranmer, J. Brehmer, and G. Louppe, The frontier of simulation-based inference, Proceedings of the National Academy of Sciences 117, 30055 (2020).
[2] D. B. Rubin, Bayesianly justifiable and relevant frequency calculations for the applied statistician, The Annals of Statistics, 1151 (1984).
[3] P. Marjoram, J. Molitor, V. Plagnol, and S. Tavaré, Markov chain monte carlo without likelihoods, Proceedings of the National Academy of Sciences 100, 15324 (2003).
[4] S. A. Sisson, Y. Fan, and M. M. Tanaka, Sequential monte carlo without likelihoods, Proceedings of the National Academy of Sciences 104, 1760 (2007).
[5] G. Papamakarios and I. Murray, Fast ε-free inference of simulation models with bayesian conditional density estimation, Advances in neural information processing systems 29 (2016).
[6] J. Alsing, T. Charnock, S. Feeney, and B. Wandelt, Fast likelihood-free cosmology with neural density estimators and active learning, Monthly Notices of the Royal Astronomical Society 488, 4440 (2019).
[7] A. Cole, B. K. Miller, S. J. Witte, M. X. Cai, M. W. Grootes, F. Nattino, and C. Weniger, Fast and credible likelihood-free cosmology with truncated marginal neural ratio estimation, Journal of Cosmology and Astroparticle Physics 2022 (09), 004.
[8] P. Lemos, M. Cranmer, M. Abidi, C. Hahn, M. Eickenberg, E. Massara, D. Yallup, and S. Ho, Robust simulation-based inference in cosmology with bayesian neural networks, Machine Learning: Science and Technology 4, 01LT01 (2023).
[9] G. Papamakarios, Neural density estimation and likelihood-free inference, arXiv preprint arXiv:1910.13233 (2019).
[10] S. Dupourqué, N. Clerc, E. Pointecouteau, D. Eckert, S. Ettori, and F. Vazza, Investigating the turbulent hot gas in x-cop galaxy clusters, Astronomy & Astrophysics 673, A91 (2023).
[11] M. Gatti, N. Jeffrey, L. Whiteway, J. Williamson, B. Jain, V. Ajani, D. Anbajagane, G. Giannini, C. Zhou, A. Porredon, et al., Dark energy survey year 3 results: Simulation-based cosmological inference with wavelet harmonics, scattering transforms, and moments of weak lensing mass maps. validation on simulations, Physical Review D 109, 063534 (2024).
[12] M. Crisostomi, K. Dey, E. Barausse, and R. Trotta, Neural posterior estimation with guaranteed exact coverage: The ringdown of gw150914, Physical Review D 108, 044029 (2023).
[13] K. Christy, E. J. Baxter, and J. Kumar, Applying simulation-based inference to spectral and spatial information from the galactic center gamma-ray excess, arXiv preprint arXiv:2402.04549 (2024).
[14] J. Harnois-Deraps, S. Heydenreich, B. Giblin, N. Martinet, T. Troester, M. Asgari, P. Burger, T. Castro, K. Dolag, C. Heymans, et al., Kids-1000 and des-y1 combined: Cosmology from peak count statistics, arXiv preprint arXiv:2405.10312 (2024).
[15] B. Moser, T. Kacprzak, S. Fischbacher, A. Refregier, D. Grimm, and L. Tortorelli, Simulation-based inference of deep fields: galaxy population model and redshift distributions, Journal of Cosmology and Astroparticle Physics 2024 (05), 049.
[16] C. P. Novaes, L. Thiele, J. Armijo, S. Cheng, J. A. Cowell, G. A. Marques, E. G. Ferreira, M. Shirasaki, K. Osato, and J. Liu, Cosmology from hsc y1 weak lensing with combined higher-order statistics and simulation-based inference, arXiv preprint arXiv:2409.01301 (2024).
[17] S. Fischbacher, B. Moser, T. Kacprzak, J. Herbel, L. Tortorelli, U. Schmitt, A. Refregier, and A. Amara, galsbi: A python package for the galsbi galaxy population model, arXiv preprint arXiv:2412.08722 (2024).
[18] D. Castelvecchi, Can we open the black box of ai?, Nature News 538, 20 (2016).
[19] J. Hermans, A. Delaunoy, F. Rozet, A. Wehenkel, V. Begy, and G. Louppe, A trust crisis in simulation-based inference? your posterior approximations can be unfaithful, arXiv preprint arXiv:2110.06581 (2021).
[20] F. Leclercq, W. Enzi, J. Jasche, and A. Heavens, Primordial power spectrum and cosmology from black-box galaxy surveys, Monthly Notices of the Royal Astronomical Society 490, 4237 (2019).
[21] A. F. Heavens, R. Jimenez, and O. Lahav, Massive lossless data compression and multiple parameter estimation from galaxy spectra, Monthly Notices of the Royal Astronomical Society 317, 965 (2000).
[22] A. K. Gupta and D. K. Nagar, Matrix variate distributions (Chapman and Hall/CRC, 2018).
[23] D. Piras and A. S. Mancini, Cosmopower-jax: high-dimensional bayesian inference with differentiable cosmological emulators, arXiv preprint arXiv:2305.06347 (2023).
[24] A. Spurio Mancini, D. Piras, J. Alsing, B. Joachimi, and M. P. Hobson, Cosmopower: emulating cosmological power spectra for accelerated bayesian inference from next-generation surveys, Monthly Notices of the Royal Astronomical Society 511, 1771 (2022).
[25] J. S. Speagle, dynesty: a dynamic nested sampling package for estimating bayesian posteriors and evidences, Monthly Notices of the Royal Astronomical Society 493, 3132 (2020).
[26] S. Koposov, J. Speagle, K. Barbary, G. Ashton, E. Bennett, J. Buchner, C. Scheffler, B. Cook, C. Talbot, J. Guillochon, et al., joshspeagle/dynesty: v2.0.0, Zenodo (2022).
[27] E. Higson, W. Handley, M. Hobson, and A. Lasenby, Dynamic nested sampling: an improved algorithm for parameter estimation and evidence calculation, Statistics and Computing 29, 891 (2019).
[28] A. Lewis, Getdist: a python package for analysing monte carlo samples, arXiv preprint arXiv:1910.13970 (2019).
[29] J. Alsing and B. Wandelt, Generalized massive optimal data compression, Monthly Notices of the Royal Astronomical Society: Letters 476, L60 (2018).
[30] J. Alsing, B. Wandelt, and S. Feeney, Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology, Monthly Notices of the Royal Astronomical Society 477, 2874 (2018).