A pipeline for searching and fitting instrumental glitches in LISA data

Martina Muratore contact: [email protected] \addressii    Jonathan Gair \addressii    Olaf Hartwig \addressaeihan \addressunihan    Michael L. Katz \addres    Alexandre Toubiana \addressio \addrese
(May 26, 2025)
Abstract

Instrumental artefacts, such as glitches, can significantly compromise the scientific output of LISA. Our methodology employs advanced Bayesian techniques, including Reversible Jump Markov Chain Monte Carlo and parallel tempering to find and characterize glitches and astrophysical signals. The robustness of the pipeline is demonstrated through its ability to simultaneously handle diverse glitch morphologies and it is validated with a “Spritz”-type data set from the LISA Data Challenge. Our approach enables accurate inference on Massive Black Hole Binaries, while simultaneously characterizing both instrumental artefacts and noise. These results present a significant development in strategies for differentiating between instrumental noise and astrophysical signals, which will ultimately improve the accuracy and reliability of source population analyses with LISA.

I Introduction

The space-based gravitational wave detector, the Laser Interferometer Space Antenna (LISA) mission Amaro-Seoane et al. (2017); Colpi et al. (2024) was adopted as a mission by the European Space Agency in January 2024 and is scheduled for launch in 2035. One of its main objectives is the first direct observation of coalescing massive black holes binaries (MBHBs) with gravitational waves (GWs). Such observations will transform our understanding of the astrophysical processes leading to the formation of massive black holes and driving their coevolution with their host galaxies Gair et al. (2011); Sesana et al. (2011); Klein et al. (2016); Dayal et al. (2019); Barausse et al. (2020a); Toubiana et al. (2021); Chen et al. (2022); Fang and Yang (2023); Spadaro et al. (2025); Langen et al. (2024); Toubiana et al. (2024a). Moreover, these observations will reach signal-to-noise ratios (SNRs) of thousands, allowing for exquisite tests of general relativity Berti et al. (2006, 2016); Chamberlain and Yunes (2017); Barausse et al. (2020b); Maggio et al. (2020); Bhagwat et al. (2022); Corman et al. (2022); Toubiana et al. (2024b); Pitte et al. (2024). Finally, the possible observation of electromagnetic counterparts Mangiagli et al. (2022); Dong-Páez et al. (2023); Izquierdo-Villalba et al. (2024) would allow MBHBs to be used as “bright sirens” for the measurement of cosmological parameters, not only the Hubble constant, but possibly also the equation of state of dark energy, thanks to the wide redshift range of these observations Caprini and Tamanini (2016); Tamanini et al. (2016); Mangiagli et al. (2023). To achieve these science objectives, it is crucial to develop a robust data analysis pipeline for LISA.

Unlike current ground-based GW detectors such as LIGO LIGO Scientific Collaboration et al. (2015), VIRGO Acernese et al. (2015) and KAGRA Somiya (2012); Aso et al. (2013). LISA will be a signal-dominated detector with multiple GW signals overlapping in time and frequency. Indeed, in addition to MBHBs, signals from 107similar-toabsentsuperscript107\sim 10^{7}∼ 10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT Galactic binaries are expected to be present in the data stream, of which 104similar-toabsentsuperscript104\sim 10^{4}∼ 10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT will be individually resolved, and the rest will form a stochastic noise that will dominate over the instrumental noise around 1 mHz Nelemans et al. (2001); Ruiter et al. (2010); Korol et al. (2017); Lamberts et al. (2019); Korol et al. (2022); Toubiana et al. (2024c). The extreme-mass-ratio inspirals (EMRIs) of stellar-origin compact objects into massive black holes in the centres of galaxies will also generate signals in the LISA data that could last for the whole mission duration. The rate of EMRI events is very uncertain, ranging from a few to tens of thousands Gair et al. (2010); Amaro-Seoane (2018); Babak et al. (2017); Pan and Yang (2021); Seoane et al. (2024), and at the upper end of this range EMRIs could perhaps even form a stochastic foreground Bonetti and Sesana (2020); Pozzoli et al. (2023). Finally, the loudest and most massive of the stellar-origin black hole binary population being observed by ground-based detectors are also expected to be observable during their inspiral phase Sesana (2016); Wong et al. (2018); Gerosa et al. (2019); Toubiana et al. (2020); Buscicchio et al. (2021); Toubiana et al. (2022); Buscicchio et al. (2025) and a variety of cosmological backgrounds could significantly contribute to the total noise (see Auclair et al. (2023) and references therein). Moreover, we will not have a direct and independent measurement of the instrument noise Muratore et al. (2023), meaning that we will have to estimate its properties simultaneously with the parameters of the astrophysical sources. The simultaneous characterization of all the sources of multiple different types that are expected to be present in the LISA data, along with instrumental properties, is usually referred to as the LISA Global Fit Cornish and Crowder (2005); Littenberg and Cornish (2023); Katz et al. (2024); Strub et al. (2024); Deng et al. (2025).

When designing a methodology to characterize instrumental noise, it is crucial to avoid systematic errors arising from noise misinterpretation or mismodeling. The LISA precursor mission, LISA Pathfinder (LPF) Armano et al. (2024a), which shares technology with LISA, was affected by spurious signals of unknown origin, referred to as glitches. Since various types of glitches were detected in LPF Armano et al. (2022a), it is reasonable to expect that, unless they are fully understood and mitigated through hardware requirements, similar glitches may appear in LISA. The LISA hardware, particularly the Gravitational Reference Sensors (GRS), will be largely the same as in LPF, with only minor modifications to accommodate integration into the Movable Optical Sub-assembly (MOSA) and the LISA spacecraft Muratore (2021). Therefore, we anticipate encountering glitches similar to those seen in LPF—especially force impulses—which could impact LISA’s sensitivity.

Glitches are also commonly seen in the data collected by ground-based gravitational wave detectors Zevin et al. (2017). These glitches show a complex range of morphologies, in contrast to the glitches observed in LISA Pathfinder, which were all of a much simpler form Armano et al. (2022b). These differences most likely arise from the difference in the technology of the instruments and in their different operating environments. However, given the differences in the mission environment and duration from LISA and LPF, we should remain open to the possibility that LISA may encounter previously unobserved or more complex types of glitches. Regarding signals observed in ground-based detectors, these are typically very short in duration compared to their typical separation in time. For this reason, segments of data flagged as having glitches are normally excluded from scientific analysis. Some events, for example the first binary neutron star merger, GW170817 Abbott et al. (2017), were clearly identified as being astrophysical in origin, due to good data quality in one of the instruments, while having overlapping glitches in one or more of the other detectors. In those cases, the analysis either gated out the range of times when the glitch was present or fitted out the glitch using a flexible model Pankow et al. (2018). In the case of LISA, signal durations will be typically much longer than the average time between glitches, and so only the latter strategies will be available to us.

In this paper, we present a possible solution to address these types of artefacts—particularly force glitches acting on the test masses—within the framework of the Global Fit development for LISA. Our analysis is performed by using the open-source sampler Eryn taking advantage of its reversible jump (RJ) capabilities Karnesis et al. (2023); Katz and Roberts (2023). We demonstrate the feasibility of performing both searches and parameter estimation to fit multiple glitches of different shape, morphology and SNR along with instrumental noise and an astrophysical source, in this case an MBHB. With this methodology, we aim to contribute to the ongoing efforts to develop robust data analysis techniques for LISA, ensuring its capacity to distinguish between astrophysical signals and instrumental artefacts. We therefore show the following achievements:

  1. 1.

    Effective search, parameter estimation and fit of multiple glitches while recovering the LISA instrumental noise.

  2. 2.

    Methodology to distinguish between a glitch and an MBHB signal.

  3. 3.

    Preservation of parameter estimation accuracy for an MBHB signal in noisy data in the presence of one or multiple glitches, injected at various times—either before, after, or during the time of maximum overlap between the MBHB signal and the glitch.

This paper is organised as follows: in section II we describe the LPF main results in terms of measured and cataloged artefacts; in section III we introduce the models used in this paper for the instrumental noise, glitches, and the MBHB signal; in section IV we provide details on the Bayesian formalism used, including the likelihood and the sampling methodology; in section V, we present an example where we generate our own data containing only glitches and instrumental noise. This demonstrates the algorithm’s ability to identify and fit multiple artefacts with different SNRs at the same time, while distinguishing them from stationary instrumental noise. Then, in section VI, we analyze the impact of glitches with different SNRs on MBHB parameter estimation when left unmodeled and assess the recovery of both glitch and MBHB parameters when performing a joint fit. Finally, in section VII, we apply our methodology to a “Light-Spritz” Data Challenge LISA LDC Team (nd). We address it as “Light-Spritz” because in this analysis we ignore the non-stationarities caused by gaps and Galactic binaries present in the dataset as these are subjects of a follow up study. Additionally, we replace the MBHB signal included in this dataset with one generated using the PhenomHM London et al. (2018) waveform model from the BBHx software package Katz et al. (2023), in order to eliminate systematics arising from differences between the simulated signal and the model used for the fit. We conclude the paper in section IX with a summary of our main results and perspectives for future work.

II Glitches as instrumental artefacts

LPF successfully demonstrated the feasibility of maintaining free-falling test masses with residual acceleration noise below the stringent requirements for LISA Armano et al. (2018). Operating from 2016 to 2017, LPF achieved unprecedented noise levels, however it also showed that measured(-experimental) noise could deviate quite significantly with respect to the derived noise models. In particular, the measured test-mass (TM) acceleration noise at frequencies below 104superscript10410^{-4}10 start_POSTSUPERSCRIPT - 4 end_POSTSUPERSCRIPT is still not fully understood Armano et al. (2018, 2024b). Also, transient acceleration glitches were observed during the mission Sala (2023); Castelli (2020). They spanned a wide amplitude range - transferring impulses from femto-Newton seconds (fN·s) to nano-Newton seconds (nN·s) to the test-masses - and exhibited diverse durations, from seconds to hours. These glitches fell into two main categories: rapid transients in the interferometric readout and longer-lasting forces acting on the test masses. Glitches in the second category arose from spurious impulsive forces of unknown origin acting on the test masses, which occurred approximately once per day, following a Poisson distribution. These impulses exhibited a distinctive time evolution, with a peak followed by an exponential decay over several hours Armano et al. (2022a). A detailed analysis of these phenomena showed that the observed signals could be accurately described using a fitted model Sala (2023), with minimal residual deviations. The strength of these impulsive events can be quantified through their SNRs. Out of the 432 events detected, 98 occurred during ordinary runs, characterized by standard operational conditions (22 or 11 degrees Celsius). The SNR distribution of glitches that occurred during ordinary runs peaked around moderate values of 10. In contrast, more glitches (334) were observed during the cold runs performed in May 2017, during which the instrument was operated at about 0 degree. All glitches observed during the cold runs were short and intense, and had higher SNR values (see Figure 7.12 in Sala (2023)). We refer the reader to Sala (2023) for a more detailed discussion of the possible origin of this phenomenon.

In this paper, we focus primarily on glitches with morphology similar to those observed in the ordinary runs since LISA will operate at standard operational conditions. These exhibit a well-defined and predictable pattern. Indeed, the analysis of LPF glitches leveraged phenomenological modelling, as single or double exponentials with a certain decay time and amplitude Sala (2023). Recently, Baghi et al. (2022) has shown that also shapelet functions could be used to fit LPF glitches and therefore, characterise their properties. These models, even if not directly connected to physical models as the ones derived in Sala (2023); Castelli (2020), laid the groundwork for future efforts to mitigate noise artefacts in LISA that are agnostic to the physical properties of the signals, such as the glitch decay time and impulse(-amplitude), which determine their shape.

To conclude this section, we note that one of the key challenges in GW data analysis is to develop methods that can robustly separate astrophysical signals from instrumental artefacts. To address this, we first need to derive a fiducial glitch model to use in our analysis. Therefore, in the next section, we present the shapelet function used to model the glitches and we relate it to the physical glitch model used to fit the observed LPF glitches. We also describe the waveform model used to describe the MBHB signal and the Time-Delay Interferometry (TDI) technique, which is applied to the raw data as a pre-processing step to reduce laser frequency noise before parameter estimation can be performed.

III Time delay interferometry, glitches, Massive black hole binaries and noise models

III.1 Time-delay interferometry

TDI is used in LISA to suppress laser frequency noise, which would otherwise exceed the GW signal by eight orders of magnitude, due to the un-equality of the interferometer arms. The LISA mission comprises three identical spacecraft arranged in a triangular configuration, separated by 2.5 million kilometres and linked by six active laser connections. While the LISA satellite formation is designed to remain roughly equilateral, gravitational and orbital dynamics cause the arm lengths to fluctuate by approximately 1.5%percent1.51.5\%1.5 % annually, with relative velocity drifts between spacecraft reaching up to 10 m/s Amaro-Seoane et al. (2017); Colpi et al. (2024). These time-varying and unequal arm lengths introduce laser frequency noise into the measurement channels. TDI achieves laser frequency noise reduction by forming specific linear combinations of time-delayed inter-spacecraft phase measurements Shaddock et al. (2004); Armstrong et al. (1999). Many combinations can be constructed Muratore et al. (2020), but for this work, we use the Michelson TDI combinations denoted as X,Y𝑋𝑌X,Yitalic_X , italic_Y, and Z𝑍Zitalic_Z; also because these are the data streams available in “Spritz”. Generally, when we refer to first-generation TDI Shaddock et al. (2004), this means measurement combinations that are constructed to cancel laser frequency noise for a static, but unequal arm length, constellation. In contrast, for second-generation TDI, time-varying and unequal arm lengths are taken into account Shaddock et al. (2003); Vallisneri (2005). In this paper, while we consider both generations, we assume in both cases, equal and non-varying arm lengths. Specifically, the first-generation TDI, is used for generating, and analyzing, frequency data and modelling the noise and glitches in section V. Whereas, the second-generation TDI is used here for modelling the noise, signal, and glitches in the analysis of the “Light--Spritz” Data Challenge LISA LDC Team (nd) in section VII.

Note that by assuming equal arm lengths, uncorrelated noise, and identical optical metrology system (OMS) and TM acceleration noise terms, the combinations X,Y𝑋𝑌X,Yitalic_X , italic_Y, and Z𝑍Zitalic_Z can be further transformed into three noise-orthogonal channels A,E𝐴𝐸A,Eitalic_A , italic_E, and T𝑇Titalic_T. This orthogonality simplifies the likelihood computation - in the frequency domain - by not having cross-correlation terms between channels. The channels A𝐴Aitalic_A, E𝐸Eitalic_E and T𝑇Titalic_T Hartwig and Muratore (2022a); Prince et al. (2002) that will be used here are:

AA\displaystyle{\rm A}roman_A =ZX2,E=X2Y+Z6,Tformulae-sequenceabsentZX2EX2YZ6T\displaystyle=\frac{{\rm Z}-{\rm X}}{\sqrt{2}}\;,\hskip 3.0pt{\rm E}=\frac{{% \rm X}-2{\rm Y}+{\rm Z}}{\sqrt{6}}\;,\hskip 3.0pt{\rm T}= divide start_ARG roman_Z - roman_X end_ARG start_ARG square-root start_ARG 2 end_ARG end_ARG , roman_E = divide start_ARG roman_X - 2 roman_Y + roman_Z end_ARG start_ARG square-root start_ARG 6 end_ARG end_ARG , roman_T =X+2Y+Z3.absentX2YZ3\displaystyle=\frac{{\rm X}+2{\rm Y}+{\rm Z}}{\sqrt{3}}.= divide start_ARG roman_X + 2 roman_Y + roman_Z end_ARG start_ARG square-root start_ARG 3 end_ARG end_ARG . (1)

In the case of first generation TDI, the channel X𝑋Xitalic_X is defined as:

X1subscriptX1\displaystyle{\rm X_{1}}roman_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT =(1D12D21)×(η13+D13η31)absent1subscript𝐷12subscript𝐷21subscript𝜂13subscript𝐷13subscript𝜂31\displaystyle=(1-D_{12}D_{21})\times(\eta_{13}+D_{13}\eta_{31})= ( 1 - italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT ) × ( italic_η start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT + italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_η start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT )
(1D13D31)×(η12+D12η21),1subscript𝐷13subscript𝐷31subscript𝜂12subscript𝐷12subscript𝜂21\displaystyle\qquad-(1-D_{13}D_{31})\times(\eta_{12}+D_{12}\eta_{21})\;,- ( 1 - italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT ) × ( italic_η start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT + italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_η start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT ) , (2)

and the same variable in second generation is defined as:

X2subscriptX2\displaystyle{\rm X_{2}}roman_X start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT =(1D12D21D12D21D13D31\displaystyle=(1-D_{12}D_{21}-D_{12}D_{21}D_{13}D_{31}= ( 1 - italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT - italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT
+D13D31D12D21D12D21)×(η13+D13η31)\displaystyle\qquad\quad+D_{13}D_{31}D_{12}D_{21}D_{12}D_{21})\times(\eta_{13}% +D_{13}\eta_{31})+ italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT ) × ( italic_η start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT + italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_η start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT )
(1D13D31D13D31D12D21\displaystyle\qquad-(1-D_{13}D_{31}-D_{13}D_{31}D_{12}D_{21}- ( 1 - italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT - italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT
+D12D21D13D31D13D31)×(η12+D12η21)\displaystyle\qquad\quad+D_{12}D_{21}D_{13}D_{31}D_{13}D_{31})\times(\eta_{12}% +D_{12}\eta_{21})+ italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT italic_D start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT ) × ( italic_η start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT + italic_D start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT italic_η start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT )
. (3)

The Y𝑌Yitalic_Y and Z𝑍Zitalic_Z variables differ from X𝑋Xitalic_X by cyclic permutations of the three satellites. The operator Dijsubscript𝐷𝑖𝑗D_{ij}italic_D start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT is a delay operator, corresponding to a constant time shift by Lijsubscript𝐿𝑖𝑗L_{ij}italic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT and thus in frequency to multiplication by {Dij}=ei2πfLijsubscript𝐷𝑖𝑗superscript𝑒𝑖2𝜋𝑓subscript𝐿𝑖𝑗\mathcal{F}\{D_{ij}\}=e^{-i2\pi fL_{ij}}caligraphic_F { italic_D start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT } = italic_e start_POSTSUPERSCRIPT - italic_i 2 italic_π italic_f italic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT with Lijsubscript𝐿𝑖𝑗L_{ij}italic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT representing the light travel time for a beam received at spacecraft i𝑖iitalic_i and emitted from spacecraft j𝑗jitalic_j, in seconds Heinzel et al. (2011). In this work we will assume the six arm-lengths to be equal and constant and therefore Lij=Lsubscript𝐿𝑖𝑗𝐿L_{ij}=Litalic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_L. The ηijsubscript𝜂𝑖𝑗{\eta}_{ij}italic_η start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT are the so-called intermediary variables Hartwig (2021). They represent a synthesized interferometric test-mass to test-mass measurement, here expressed as a fractional frequency shift between the interfering laser beams. The first index i𝑖iitalic_i indicates the spacecraft where the measurement is performed at time t𝑡titalic_t, and the second index j𝑗jitalic_j indicates the distant spacecraft from which light was emitted at time tLij𝑡subscript𝐿𝑖𝑗t-L_{ij}italic_t - italic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT. The intermediary variables ηijsubscript𝜂𝑖𝑗{\eta}_{ij}italic_η start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT contain the sum of the contributions from GW signals, glitches and instrumental noises, and so the TDI transfer function applies linearly to these different components.

III.2 Shapelet model

In this section, we describe how glitches are modeled and the LISA response to them. We model glitches using the exponential shapelet functions following Baghi et al. (2022) and Bergé et al. (2019). This is a family of exponentially damped functions that are eigen-wavefunctions of the normalized 1D-hydrogen atom. They can be defined in time domain as follows:

ψn(t)=2tnetnLn11(2tn)Θ(t),subscript𝜓𝑛𝑡2𝑡𝑛superscript𝑒𝑡𝑛subscriptsuperscript𝐿1𝑛12𝑡𝑛Θ𝑡\psi_{n}(t)=2\frac{t}{n}e^{-\frac{t}{n}}L^{1}_{n-1}\bigg{(}2\frac{t}{n}\bigg{)% }\Theta(t),italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_t ) = 2 divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG italic_e start_POSTSUPERSCRIPT - divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG end_POSTSUPERSCRIPT italic_L start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n - 1 end_POSTSUBSCRIPT ( 2 divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG ) roman_Θ ( italic_t ) , (4)

with Lnαsuperscriptsubscript𝐿𝑛𝛼L_{n}^{\alpha}italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT denoting the generalized Laguerre polynomials. A glitch perturbation is then modeled by a linear combination of shapelets:

g(t)=i=0PAiψni(tτiβi),𝑔𝑡superscriptsubscript𝑖0𝑃subscript𝐴𝑖subscript𝜓subscript𝑛𝑖𝑡subscript𝜏𝑖subscript𝛽𝑖g(t)=\sum_{i=0}^{P}A_{i}\psi_{n_{i}}\bigg{(}\frac{t-\tau_{i}}{\beta_{i}}\bigg{% )},italic_g ( italic_t ) = ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) , (5)

with P+1𝑃1P+1italic_P + 1 being the number of shapelets. The parameters to be estimated for each shapelet component are the scale parameter βisubscript𝛽𝑖\beta_{i}italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, which represents the characteristic damping time, the starting (or injection) time of the shapelet τisubscript𝜏𝑖\tau_{i}italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, and the shapelet amplitude Aisubscript𝐴𝑖A_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. Note that Eq. (5) is given in units of acceleration, but considering that the phasemeter in LISA outputs phase or frequency, we must convert it to the appropriate units. To achieve this, we utilize the generating function (see Wikipedia contributors (nd) for reference and Appendix A.1) and integrate Eq. (5) -after substituting Eq. (4)- to derive the fractional frequency expression for a single exponential shapelet with n0=1subscript𝑛01n_{0}=1italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1 as:

Δνgν0(t,τ0,β0,A0)=2A0β0c(1et+τ0β0(t+β0τ0)β0)Θ(tτ0β0),Δsubscript𝜈𝑔subscript𝜈0𝑡subscript𝜏0subscript𝛽0subscript𝐴02subscript𝐴0subscript𝛽0𝑐1superscript𝑒𝑡subscript𝜏0subscript𝛽0𝑡subscript𝛽0subscript𝜏0subscript𝛽0Θ𝑡subscript𝜏0subscript𝛽0\begin{split}&\frac{\Delta\nu_{g}}{\nu_{0}}(t,\tau_{0},\beta_{0},A_{0})=\\ &\frac{2A_{0}\beta_{0}}{c}\bigg{(}1-\frac{e^{\frac{-t+\tau_{0}}{\beta_{0}}}(t+% \beta_{0}-\tau_{0})}{\beta_{0}}\bigg{)}\Theta\bigg{(}\frac{t-\tau_{0}}{\beta_{% 0}}\bigg{)},\end{split}start_ROW start_CELL end_CELL start_CELL divide start_ARG roman_Δ italic_ν start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_t , italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL divide start_ARG 2 italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG start_ARG italic_c end_ARG ( 1 - divide start_ARG italic_e start_POSTSUPERSCRIPT divide start_ARG - italic_t + italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG end_POSTSUPERSCRIPT ( italic_t + italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) end_ARG start_ARG italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ) roman_Θ ( divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ) , end_CELL end_ROW (6)

where we have considered that for our case α=1𝛼1\alpha=1italic_α = 1 Baghi et al. (2022). The shapelet model zero order depends thus on three parameters ϕ={τ0,A0,β0}italic-ϕsubscript𝜏0subscript𝐴0subscript𝛽0\phi=\{\tau_{0},A_{0},\beta_{0}\}italic_ϕ = { italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT }. We note that Eq. (6) is the same as the one derived in Baghi et al. (2022)111Private correspondence with the authors of Baghi et al. (2022) confirmed that there is a typo in Eqs. B2-B4 of their paper and the corrected expressions is consistent with what we present here.. This function is similar to a step function starting from a value of zero and increasing to:

limtΔνgν0(t,τ0,β0,A0)=2A0β0c.subscript𝑡Δsubscript𝜈𝑔subscript𝜈0𝑡subscript𝜏0subscript𝛽0subscript𝐴02subscript𝐴0subscript𝛽0𝑐\lim_{t\rightarrow\infty}\frac{\Delta\nu_{g}}{\nu_{0}}(t,\tau_{0},\beta_{0},A_% {0})=\frac{2A_{0}\beta_{0}}{c}.roman_lim start_POSTSUBSCRIPT italic_t → ∞ end_POSTSUBSCRIPT divide start_ARG roman_Δ italic_ν start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_t , italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = divide start_ARG 2 italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG start_ARG italic_c end_ARG . (7)

The shapelet model with i=0𝑖0i=0italic_i = 0 thus reduces to the way glitches were modeled in LPF. Although, since the most significant impact of glitches during the LPF mission was an effective overestimation of the measured LPF acceleration noise, they had been fitted in terms of acceleration (Eq. 7.1 in Sala (2023) and Appendix A.2) using the following model:

hg(t)=Δvτ2tetτΘ(t),t=tt0formulae-sequencesubscript𝑔𝑡Δ𝑣superscript𝜏2superscript𝑡superscript𝑒superscript𝑡𝜏Θsuperscript𝑡superscript𝑡𝑡subscript𝑡0h_{g}(t)=\frac{\Delta v}{\tau^{2}}t^{\prime}e^{\frac{-t^{\prime}}{\tau}}\Theta% (t^{\prime}),\qquad\qquad t^{\prime}=t-t_{0}italic_h start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t ) = divide start_ARG roman_Δ italic_v end_ARG start_ARG italic_τ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT divide start_ARG - italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_τ end_ARG end_POSTSUPERSCRIPT roman_Θ ( italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT (8)

which, integrated in time to get the fractional frequency change, gives:

0tsuperscriptsubscript0𝑡\displaystyle\int_{0}^{t}∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT hg(t)cdtsubscript𝑔superscript𝑡𝑐𝑑superscript𝑡\displaystyle\frac{h_{g}(t^{\prime})}{c}dt^{\prime}divide start_ARG italic_h start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_ARG start_ARG italic_c end_ARG italic_d italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT
=Δvc(et0tτ(tτ+t0)τ+1)Θ(tt0),absentΔ𝑣𝑐superscript𝑒subscript𝑡0𝑡𝜏𝑡𝜏subscript𝑡0𝜏1Θ𝑡subscript𝑡0\displaystyle\hskip 14.22636pt=\frac{\Delta v}{c}\left(\frac{e^{\frac{t_{0}-t}% {\tau}}(-t-\tau+t_{0})}{\tau}+1\right)\Theta(t-t_{0}),= divide start_ARG roman_Δ italic_v end_ARG start_ARG italic_c end_ARG ( divide start_ARG italic_e start_POSTSUPERSCRIPT divide start_ARG italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - italic_t end_ARG start_ARG italic_τ end_ARG end_POSTSUPERSCRIPT ( - italic_t - italic_τ + italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) end_ARG start_ARG italic_τ end_ARG + 1 ) roman_Θ ( italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) , (9)

ΔvΔ𝑣\Delta vroman_Δ italic_v is the total transferred impulse per unit mass, t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is the injection time and τ𝜏\tauitalic_τ is the decay time. Therefore, comparing the shapelet model (Eq. 6) with the glitch model (Eq. III.2), the shapelet amplitude A0subscript𝐴0A_{0}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT can be related to the glitch impulse ΔvΔ𝑣\Delta vroman_Δ italic_v accordingly to Δv=2A0β0Δ𝑣2subscript𝐴0subscript𝛽0\Delta v=2A_{0}\beta_{0}roman_Δ italic_v = 2 italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, while β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT equals the glitch τ𝜏\tauitalic_τ, and τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT equals the glitch injection time t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. These relationships will be important in sections V.2 and VIII, where we will use them to define the priors on the shapelet for fitting the glitches present in the data. Specifically, the prior for the shapelet amplitude will be chosen based on this relationship such that smaller amplitude values will be allowed compared to the nominal LPF glitches amplitude.

Refer to caption
Figure 1: Impulse versus decay time of LPF glitches at different SNRs computed considering the LISA noise power spectral densities and the three TDI channels A𝐴Aitalic_A, E𝐸Eitalic_E and T𝑇Titalic_T.

Figure 1 shows the distribution of glitch parameters observed for LPF glitches. It is clear that relatively short glitches have the highest SNR. This is because short glitches have the majority of the power concentrated in the middle of the LISA frequency band, whereas long glitches concentrate the power at very low frequencies, where LISA is less sensitive. We note that SNRs for LISA are higher with respect to LPF (Fig. 1 versus Fig. 8.8 in Sala (2023)) for two reasons: the first is the different acceleration noise transfer functions between LISA and LPF, in particular, LISA shows a factor 4444 enhancement for the X𝑋Xitalic_X channels in the TM acceleration noise power spectral density (PSD), the second is that we compute the SNR for LISA assuming the three TDI channels (A𝐴Aitalic_A, E𝐸Eitalic_E and T𝑇Titalic_T) such that the SNR squared is doubled with respect to that in a single channel222The TDI channel T𝑇Titalic_T, or in general any channel null to GWs, being less sensitive to test-mass acceleration noise is also less sensitive to acceleration glitches. Therefore, their contribution to the SNR is negligible.. We show in Figure 2 two examples of shapelets with i=0𝑖0i=0italic_i = 0 and i=9𝑖9i=9italic_i = 9.

Refer to caption
Refer to caption
Figure 2: Shapelets with n0=1subscript𝑛01n_{0}=1italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1 and n9=10subscript𝑛910n_{9}=10italic_n start_POSTSUBSCRIPT 9 end_POSTSUBSCRIPT = 10, where we have considered for both A0=5×107subscript𝐴05superscript107A_{0}=5\times 10^{-7}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 5 × 10 start_POSTSUPERSCRIPT - 7 end_POSTSUPERSCRIPT, β0=1000subscript𝛽01000\beta_{0}=1000italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1000 and τ0=1000subscript𝜏01000\tau_{0}=1000italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1000. For the high order shapelet we assume all Ai,βisubscript𝐴𝑖subscript𝛽𝑖A_{i},\beta_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and τisubscript𝜏𝑖\tau_{i}italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT to be equal to A0subscript𝐴0A_{0}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, respectively.

To conclude, the shapelet expression (Eq. 6) can be written directly in the frequency domain as follows333We have tested that using the analytical frequency domain function is sufficiently equivalent to performing a numerical fast Fourier transform of the time domain data.(see Appendix A.1):

Δν~gν0(f,τ0,β0,A0)=iA0β0e2ifπτ0cfπ(i2fπβ0)2.Δsubscript~𝜈𝑔subscript𝜈0𝑓subscript𝜏0subscript𝛽0subscript𝐴0𝑖subscript𝐴0subscript𝛽0superscript𝑒2𝑖𝑓𝜋subscript𝜏0𝑐𝑓𝜋superscript𝑖2𝑓𝜋subscript𝛽02\frac{\Delta\tilde{\nu}_{g}}{\nu_{0}}(f,\tau_{0},\beta_{0},A_{0})=\frac{iA_{0}% \beta_{0}e^{-2if\pi\tau_{0}}}{cf\pi(i-2f\pi\beta_{0})^{2}}.divide start_ARG roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_f , italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = divide start_ARG italic_i italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_f italic_π italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_c italic_f italic_π ( italic_i - 2 italic_f italic_π italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG . (10)

This expression agrees with the physical model used to fit glitches in LPF data but expressed in fractional frequency units (Eq. 7.3 in Sala (2023) and Appendix A.2).

III.3 TDI transfer function for glitches and shapelets in frequency domain

We can express the single link measurement η~ij(ω)subscript~𝜂𝑖𝑗𝜔\tilde{\eta}_{ij}(\omega)over~ start_ARG italic_η end_ARG start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( italic_ω ), in Eqs. III.1,3 as:

η~ij(f)=Δν~gji(f)e2iπfL+Δν~gij(f),subscript~𝜂𝑖𝑗𝑓Δsubscript~𝜈subscript𝑔𝑗𝑖𝑓superscript𝑒2𝑖𝜋𝑓𝐿Δsubscript~𝜈subscript𝑔𝑖𝑗𝑓\tilde{\eta}_{ij}(f)=\Delta\tilde{\nu}_{g_{ji}}{(f)}e^{-2i\pi fL}+\Delta\tilde% {\nu}_{g_{ij}}{(f)},over~ start_ARG italic_η end_ARG start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( italic_f ) = roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_f ) italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_π italic_f italic_L end_POSTSUPERSCRIPT + roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_f ) , (11)

where the terms Δν~gij(f)Δsubscript~𝜈subscript𝑔𝑖𝑗𝑓\Delta\tilde{\nu}_{g_{ij}}{(f)}roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_f ) indicates the contribution of a glitch (shapelet) acting on the test mass ij𝑖𝑗ijitalic_i italic_j and we have assumed constant and equal arm-lengths, Lij=Lsubscript𝐿𝑖𝑗𝐿L_{ij}=Litalic_L start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_L. Considering the expected glitch distributions from LISA Pathfinder, it is highly unlikely that multiple glitches happen simultaneously. We therefore assume that one of the two terms (Δνgij(f)Δsubscript𝜈subscript𝑔𝑖𝑗𝑓\Delta\nu_{g_{ij}}{(f)}roman_Δ italic_ν start_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_f ) or Δν~gji(f)Δsubscript~𝜈subscript𝑔𝑗𝑖𝑓\Delta\tilde{\nu}_{g_{ji}}{(f)}roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_f )) vanishes and that the other one contains a single glitch.

Considering now Eq. III.1, we can model the TDI applied to a glitch (shapelet) as a product between the glitch (shapelet) expression in the frequency domain (Eq. 10) and the transfer functions 𝒯X,Y,Zg(f)superscriptsubscript𝒯𝑋𝑌𝑍𝑔𝑓\mathcal{T}_{X,Y,Z}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) which take into account TDI, i.e., g~X,Y,Z(f)=𝒯X,Y,Zg(f)Δν~g(f)/ν0subscript~𝑔𝑋𝑌𝑍𝑓superscriptsubscript𝒯𝑋𝑌𝑍𝑔𝑓Δsubscript~𝜈𝑔𝑓subscript𝜈0\tilde{g}_{X,Y,Z}(f)=\mathcal{T}_{X,Y,Z}^{g}(f){\Delta\tilde{\nu}_{g}}(f)/\nu_% {0}over~ start_ARG italic_g end_ARG start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f ) = caligraphic_T start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_f ) / italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, where g~X,Y,Z(f)subscript~𝑔𝑋𝑌𝑍𝑓\tilde{g}_{X,Y,Z}(f)over~ start_ARG italic_g end_ARG start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f ) is the signal of a glitch (shapelet) in the TDI frequency domain data, where we have omitted the constant glitch parameters in the argument of the expression.

In this work we will always apply glitches on TM12𝑇subscript𝑀12TM_{12}italic_T italic_M start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT. Denoting by \mathcal{F}caligraphic_F the Fourier transform operator, and applying it to the delay operators gives D=e2iπfL𝐷superscript𝑒2𝑖𝜋𝑓𝐿\mathcal{F}{D}=e^{-2i\pi fL}caligraphic_F italic_D = italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_π italic_f italic_L end_POSTSUPERSCRIPT. The transfer function for the X,Y𝑋𝑌X,Yitalic_X , italic_Y and Z𝑍Zitalic_Z first generation TDI variables of a glitch on TM12𝑇subscript𝑀12TM_{12}italic_T italic_M start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT is:

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1+e8iLπfabsent1superscript𝑒8𝑖𝐿𝜋𝑓\displaystyle=-1+e^{-8iL\pi f}= - 1 + italic_e start_POSTSUPERSCRIPT - 8 italic_i italic_L italic_π italic_f end_POSTSUPERSCRIPT (12a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =4ie4iLπfsin(2Lπf)absent4𝑖superscript𝑒4𝑖𝐿𝜋𝑓2𝐿𝜋𝑓\displaystyle=4ie^{-4iL\pi f}\sin(2L\pi f)= 4 italic_i italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L italic_π italic_f end_POSTSUPERSCRIPT roman_sin ( 2 italic_L italic_π italic_f ) (12b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0.absent0\displaystyle=0.= 0 . (12c)

The same type of derivation can be done for second-generation TDI using Eq. 3:

𝒯X2g(f)superscriptsubscript𝒯subscript𝑋2𝑔𝑓\displaystyle\mathcal{T}_{X_{2}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =e16iLπf(1+e8iLπf)2absentsuperscript𝑒16𝑖𝐿𝜋𝑓superscript1superscript𝑒8𝑖𝐿𝜋𝑓2\displaystyle=e^{-16iL\pi f}(-1+e^{8iL\pi f})^{2}= italic_e start_POSTSUPERSCRIPT - 16 italic_i italic_L italic_π italic_f end_POSTSUPERSCRIPT ( - 1 + italic_e start_POSTSUPERSCRIPT 8 italic_i italic_L italic_π italic_f end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (13a)
𝒯Y2g(f)superscriptsubscript𝒯subscript𝑌2𝑔𝑓\displaystyle\mathcal{T}_{Y_{2}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =8e8iLπfsin(4Lπf)sin(2Lπf)absent8superscript𝑒8𝑖𝐿𝜋𝑓4𝐿𝜋𝑓2𝐿𝜋𝑓\displaystyle=8e^{-8iL\pi f}\sin(4L\pi f)\sin(2L\pi f)= 8 italic_e start_POSTSUPERSCRIPT - 8 italic_i italic_L italic_π italic_f end_POSTSUPERSCRIPT roman_sin ( 4 italic_L italic_π italic_f ) roman_sin ( 2 italic_L italic_π italic_f ) (13b)
𝒯Z2g(f)superscriptsubscript𝒯subscript𝑍2𝑔𝑓\displaystyle\mathcal{T}_{Z_{2}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0.absent0\displaystyle=0.= 0 . (13c)

Note that, in the case of multiple glitches happening at the same time in different test masses their effects sum up linearly. Expressions for the transfer functions for glitches occurring on the other test masses can be found in Appendix A.2.

III.4 Waveform model for Massive Black Hole Binaries

To model the full inspiral-merger-ringdown signal of a MBHB, we use the PhenomHM London et al. (2018) waveform model from the BBHx software package Katz et al. (2023). This is an analytic frequency-domain model for quasi-circular and spin-aligned binaries that includes harmonics beyond the dominant (2,2)22(2,2)( 2 , 2 ) mode. The contribution of these harmonics increases with the asymmetry of the system (i.e., unequal mass ratios) and the magnitude of the spins. We expect it to be possible to observe large mass ratios and highly spinning MBHBs with LISA Toubiana et al. (2021); Barausse et al. (2020a). Not including higher harmonics could result in severe biases Pitte et al. (2023). The GW signal of an MBHB, for the spin-aligned case, depends on 11 parameters θ={Mt,q,a1,a2,dL,φref,ι,λd,βd,ψ,tref}𝜃subscript𝑀𝑡𝑞subscript𝑎1subscript𝑎2subscript𝑑𝐿subscript𝜑𝑟𝑒𝑓𝜄subscript𝜆𝑑subscript𝛽𝑑𝜓subscript𝑡𝑟𝑒𝑓\theta=\{M_{t},q,a_{1},a_{2},d_{L},\varphi_{ref},\iota,\lambda_{d},\beta_{d},% \psi,t_{ref}\}italic_θ = { italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_q , italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT , italic_φ start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT , italic_ι , italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT , italic_ψ , italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT }. They can be split as:

  • Intrinsic Parameters:

    • m1subscript𝑚1m_{1}italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is the mass of the first black hole and m2subscript𝑚2m_{2}italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT of the second, with m2m1subscript𝑚2subscript𝑚1m_{2}\leq m_{1}italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT.

    • a1,a2subscript𝑎1subscript𝑎2a_{1},a_{2}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT: dimensionless spin magnitudes of the two black holes, aligned or anti-aligned with the orbital angular momentum.

  • Extrinsic Parameters:

    • dLsubscript𝑑𝐿d_{L}italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT: luminosity distance to the source, scaling the amplitude of the GW signal.

    • φrefsubscript𝜑𝑟𝑒𝑓\varphi_{ref}italic_φ start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT: reference phase at the merger.

    • ι𝜄\iotaitalic_ι: inclination, the angle between the orbital angular momentum and the line of sight, affecting the observed amplitude of higher-order modes.

    • λd,βdsubscript𝜆𝑑subscript𝛽𝑑\lambda_{d},\beta_{d}italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT: ecliptic longitude and latitude, describing the sky location of the source in the (LISA) detector reference frame.

    • ψ𝜓\psiitalic_ψ: polarization angle of the waveform, affecting the projection of the signal onto the detector.

    • trefsubscript𝑡𝑟𝑒𝑓t_{ref}italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT: reference time marking a characteristic point in the binary coalescence, such as the merger.

All extrinsic parameters discussed here are defined relative to the LISA constellation reference frame. We also introduce the total mass Mt=m1+m2subscript𝑀𝑡subscript𝑚1subscript𝑚2M_{t}=m_{1}+m_{2}italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and the mass ratio q=m2/m11𝑞subscript𝑚2subscript𝑚11q=m_{2}/m_{1}\leq 1italic_q = italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT / italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≤ 1.

The frequency-domain, source-frame waveform is expressed as Marsat and Baker (2018); Marsat et al. (2021):

h~lm(f)=Alm(f)eiϕlm(f),subscript~𝑙𝑚𝑓subscript𝐴𝑙𝑚𝑓superscript𝑒𝑖subscriptitalic-ϕ𝑙𝑚𝑓\tilde{h}_{lm}(f)=A_{lm}(f)e^{-i\phi_{lm}(f)},over~ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) = italic_A start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) italic_e start_POSTSUPERSCRIPT - italic_i italic_ϕ start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) end_POSTSUPERSCRIPT , (14)

where (l,m)𝑙𝑚(l,m)( italic_l , italic_m ) represent the harmonic mode indices. The TDI variables are obtained as

h~X,Y,Z(f)=lm𝒯X,Y,Zh(f,tlm(f))h~lm(f)dL.subscript~𝑋𝑌𝑍𝑓subscript𝑙𝑚subscriptsuperscript𝒯𝑋𝑌𝑍𝑓subscript𝑡𝑙𝑚𝑓subscript~𝑙𝑚𝑓subscript𝑑𝐿\tilde{h}_{X,Y,Z}(f)=\sum_{lm}\mathcal{T}^{h}_{X,Y,Z}(f,t_{lm}(f))\frac{\tilde% {h}_{lm}(f)}{d_{L}}.over~ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f ) = ∑ start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT caligraphic_T start_POSTSUPERSCRIPT italic_h end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f , italic_t start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) ) divide start_ARG over~ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) end_ARG start_ARG italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT end_ARG . (15)

where 𝒯h(f,tlm(f))superscript𝒯𝑓subscript𝑡𝑙𝑚𝑓\mathcal{T}^{h}(f,t_{lm}(f))caligraphic_T start_POSTSUPERSCRIPT italic_h end_POSTSUPERSCRIPT ( italic_f , italic_t start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) ) are the frequency-domain transfer functions detailed in Marsat and Baker (2018); Marsat et al. (2021). The transfer functions depend on the extrinsic parameters and vary both in time and frequency due to the LISA constellation’s orbital motion. They are evaluated at time tlm(f)subscript𝑡𝑙𝑚𝑓t_{lm}(f)italic_t start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ), which is given by the stationary-phase approximation:

tlm(f)=tref12πdϕlm(f)df.subscript𝑡𝑙𝑚𝑓subscript𝑡𝑟𝑒𝑓12𝜋𝑑subscriptitalic-ϕ𝑙𝑚𝑓𝑑𝑓t_{lm}(f)=t_{ref}-\frac{1}{2\pi}\frac{d\phi_{lm}(f)}{df}.italic_t start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) = italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 italic_π end_ARG divide start_ARG italic_d italic_ϕ start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) end_ARG start_ARG italic_d italic_f end_ARG . (16)

The phase-meter output (LISA observables) will be therefore the sum of GWs signals, h~X,Y,Z(f)subscript~𝑋𝑌𝑍𝑓\tilde{h}_{X,Y,Z}(f)over~ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f ), and glitches, g~X,Y,Z(f)subscript~𝑔𝑋𝑌𝑍𝑓\tilde{g}_{X,Y,Z}(f)over~ start_ARG italic_g end_ARG start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT ( italic_f ), propagated through TDI.

III.5 Noise models and definition of the signal to noise ratio

In this work, we assume that primary noise sources such as laser frequency noise, spacecraft longitudinal jitters, clock noises and tilt-to-length couplings have been corrected and only leave negligable residuals in the post-processed TDI variables Hartwig and Bayle (2021); Hartig et al. (2022); Chwalla et al. (2020). What remains are the so-called secondary noises. These remaining noises Quang Nam et al. (2023a) fall into two broad categories: residual acceleration noise of each TM and the OMS noise terms limiting the test-mass position readout. Noises inside these categories can have various physical origins, different levels and different transfer functions into the final TDI variables Quang Nam et al. (2023b). For the analysis that will follow in section V, we just consider two main components modeled as uncorrelated noise on each test-mass and uncorrelated noise in each inter-spacecraft optical link, with identical statistical properties for noises of the same kind. We will instead consider a slightly more elaborate noise model for analysing the “Light-Spritz” data set, in section VII.2.

We characterize these noise sources by their PSD and a corresponding transfer function that describes its propagation in the TDI variables. The TM acceleration noise PSD is:

STM(f)=(NTM)2(1+(0.4×103f)2)×(1+(f8×103)4),subscript𝑆𝑇𝑀𝑓superscriptsubscript𝑁𝑇𝑀21superscript0.4superscript103𝑓21superscript𝑓8superscript1034\begin{split}S_{TM}(f)=(N_{TM})^{2}&\left(1+\left(\frac{0.4\times 10^{-3}}{f}% \right)^{2}\right)\\ \times&\left(1+\left(\frac{f}{8\times 10^{-3}}\right)^{4}\right),\end{split}start_ROW start_CELL italic_S start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT ( italic_f ) = ( italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_CELL start_CELL ( 1 + ( divide start_ARG 0.4 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT end_ARG start_ARG italic_f end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) end_CELL end_ROW start_ROW start_CELL × end_CELL start_CELL ( 1 + ( divide start_ARG italic_f end_ARG start_ARG 8 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ) , end_CELL end_ROW (17)

with NTM=2.4 fm/s/2HzN_{TM}=$2.4\text{\,}\mathrm{f}\mathrm{m}\mathrm{/}\mathrm{s}\mathrm{{}^{2}}% \mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT = start_ARG 2.4 end_ARG start_ARG times end_ARG start_ARG roman_fm / roman_s start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / square-root start_ARG roman_Hz end_ARG end_ARG. The PSD of the displacement noise from the OMS is:

Soms(f)=(NOMS)2(1+(2.0×103f)4),subscript𝑆𝑜𝑚𝑠𝑓superscriptsubscript𝑁𝑂𝑀𝑆21superscript2.0superscript103𝑓4S_{oms}(f)=(N_{OMS})^{2}\left(1+\left(\frac{2.0\times 10^{-3}}{f}\right)^{4}% \right),italic_S start_POSTSUBSCRIPT italic_o italic_m italic_s end_POSTSUBSCRIPT ( italic_f ) = ( italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 1 + ( divide start_ARG 2.0 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT end_ARG start_ARG italic_f end_ARG ) start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ) , (18)

with NOMS=7.9 pm/Hzsubscript𝑁𝑂𝑀𝑆times7.9pmHzN_{OMS}=$7.9\text{\,}\mathrm{p}\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT = start_ARG 7.9 end_ARG start_ARG times end_ARG start_ARG roman_pm / square-root start_ARG roman_Hz end_ARG end_ARG. This overall OMS noise level is based on the values provided in the LDC tools working group (2022).

Note that the noise parameters, γ={NTM,NOMS}𝛾subscript𝑁𝑇𝑀subscript𝑁𝑂𝑀𝑆\gamma=\{N_{TM},N_{OMS}\}italic_γ = { italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT , italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT }, are expressed as equivalent acceleration noise in m/2s4/Hz\mathrm{m}\mathrm{{}^{2}}\mathrm{/}\mathrm{s}^{4}\mathrm{/}\mathrm{Hz}roman_m start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / roman_s start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT / roman_Hz or displacement noise in m2 Hz1timesmeter2hertz1{\mathrm{m}}^{2}\text{\,}{\mathrm{Hz}}^{-1}start_ARG power start_ARG roman_m end_ARG start_ARG 2 end_ARG end_ARG start_ARG times end_ARG start_ARG power start_ARG roman_Hz end_ARG start_ARG - 1 end_ARG end_ARG. For our analysis, we instead use units of fractional frequency deviations, to which we will convert them by multiplying the components with Caccffd=(1/(c2πf))2superscriptsubscript𝐶accffdsuperscript1𝑐2𝜋𝑓2C_{\mathrm{acc}}^{\mathrm{ffd}}=\left({1}/{(c2\pi f)}\right)^{2}italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT = ( 1 / ( italic_c 2 italic_π italic_f ) ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT or Cdispffd=(2πf/c)2superscriptsubscript𝐶dispffdsuperscript2𝜋𝑓𝑐2C_{\mathrm{disp}}^{\mathrm{ffd}}=\left({2\pi f}/{c}\right)^{2}italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT = ( 2 italic_π italic_f / italic_c ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, respectively. Here, c𝑐citalic_c is the speed of light in m s1timesmetersecond1\mathrm{m}\text{\,}{\mathrm{s}}^{-1}start_ARG roman_m end_ARG start_ARG times end_ARG start_ARG power start_ARG roman_s end_ARG start_ARG - 1 end_ARG end_ARG.

For the first generation TDI, the noise in the A,E𝐴𝐸A,Eitalic_A , italic_E and T𝑇Titalic_T channels are:

SA(f)=subscript𝑆𝐴𝑓absent\displaystyle S_{A}(f)=italic_S start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT ( italic_f ) = SE(f)=8sin2(2πLf)[\displaystyle S_{E}(f)=8\sin^{2}(2\pi Lf)\Big{[}italic_S start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ( italic_f ) = 8 roman_sin start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 2 italic_π italic_L italic_f ) [
2CaccffdSTM(3+2cos(2πLf)+cos(4πLf))2superscriptsubscript𝐶accffdsubscript𝑆𝑇𝑀322𝜋𝐿𝑓4𝜋𝐿𝑓\displaystyle 2C_{\mathrm{acc}}^{\mathrm{ffd}}S_{TM}(3+2\cos(2\pi Lf)+\cos(4% \pi Lf))2 italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT ( 3 + 2 roman_cos ( 2 italic_π italic_L italic_f ) + roman_cos ( 4 italic_π italic_L italic_f ) ) (19)
+CdispffdSoms(2+cos(2πLf))],\displaystyle+C_{\mathrm{disp}}^{\mathrm{ffd}}S_{\text{oms}}(2+\cos(2\pi Lf))% \Big{]},+ italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT oms end_POSTSUBSCRIPT ( 2 + roman_cos ( 2 italic_π italic_L italic_f ) ) ] ,

and:

ST(f)=subscript𝑆𝑇𝑓absent\displaystyle S_{T}(f)=italic_S start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ( italic_f ) = 16CdispffdSoms(1cos(2πLf))sin2(2πLf)16superscriptsubscript𝐶dispffdsubscript𝑆oms12𝜋𝐿𝑓superscript22𝜋𝐿𝑓\displaystyle 16C_{\mathrm{disp}}^{\mathrm{ffd}}S_{\text{oms}}(1-\cos(2\pi Lf)% )\sin^{2}(2\pi Lf)16 italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT oms end_POSTSUBSCRIPT ( 1 - roman_cos ( 2 italic_π italic_L italic_f ) ) roman_sin start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 2 italic_π italic_L italic_f )
+128CaccffdSTMsin2(2πLf)sin4(πLf).128superscriptsubscript𝐶accffdsubscript𝑆𝑇𝑀superscript22𝜋𝐿𝑓superscript4𝜋𝐿𝑓\displaystyle+128C_{\mathrm{acc}}^{\mathrm{ffd}}S_{TM}\sin^{2}(2\pi Lf)\sin^{4% }\left(\pi Lf\right).+ 128 italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT roman_sin start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 2 italic_π italic_L italic_f ) roman_sin start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ( italic_π italic_L italic_f ) . (20)

Where, again, L𝐿Litalic_L denotes the LISA arm length in seconds.

III.5.1 Definition of the signal to noise ratio

The SNRs of glitches and MBHB signals are computed based on the three TDI channels using the following formula (Eq. 119 in Tinto and Dhurandhar (2021) and Babak et al. (2021)):

SNR=𝑆𝑁𝑅absent\displaystyle SNR=italic_S italic_N italic_R = (4[0fmax|Q~(f)|2(|𝒯~A(f)|2SA(f)+|𝒯~E(f)|2SE(f)\displaystyle\Big{(}4\Re\Big{[}\int_{0}^{f_{\text{max}}}|\tilde{Q}(f)|^{2}\Big% {(}\frac{|\mathcal{\tilde{T}}_{A}(f)|^{2}}{S_{A}(f)}+\frac{|\mathcal{\tilde{T}% }_{E}(f)|^{2}}{S_{E}(f)}( 4 roman_ℜ [ ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f start_POSTSUBSCRIPT max end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | over~ start_ARG italic_Q end_ARG ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( divide start_ARG | over~ start_ARG caligraphic_T end_ARG start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_S start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT ( italic_f ) end_ARG + divide start_ARG | over~ start_ARG caligraphic_T end_ARG start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_S start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ( italic_f ) end_ARG
+|𝒯~T(f)|2ST(f))df])1/2.\displaystyle\quad+\frac{|\mathcal{\tilde{T}}_{T}(f)|^{2}}{S_{T}(f)}\Big{)}\,% \mathrm{d}f\Big{]}\Big{)}^{1/2}.+ divide start_ARG | over~ start_ARG caligraphic_T end_ARG start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_S start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ( italic_f ) end_ARG ) roman_d italic_f ] ) start_POSTSUPERSCRIPT 1 / 2 end_POSTSUPERSCRIPT . (21)

Note that the above formula is valid in this case as we assume the three TDI channels to be uncorrelated. Here 𝒯~A/E/T(f)subscript~𝒯𝐴𝐸𝑇𝑓\mathcal{\tilde{T}}_{A/E/T}(f)over~ start_ARG caligraphic_T end_ARG start_POSTSUBSCRIPT italic_A / italic_E / italic_T end_POSTSUBSCRIPT ( italic_f ) are the transfer functions of a GW signal or a glitch into the combinations A𝐴Aitalic_A, E𝐸Eitalic_E, and T𝑇Titalic_T and Q~(f)~𝑄𝑓\tilde{Q}(f)over~ start_ARG italic_Q end_ARG ( italic_f ) is the amplitude of the GW signal (h~lm(f)/dLsubscript~𝑙𝑚𝑓subscript𝑑𝐿\tilde{h}_{lm}(f)/d_{L}over~ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_l italic_m end_POSTSUBSCRIPT ( italic_f ) / italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT) or glitch (Δν~g(f)/ν0Δsubscript~𝜈𝑔𝑓subscript𝜈0\Delta\tilde{\nu}_{g}(f)/\nu_{0}roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_f ) / italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) in Fourier frequency domain, and SA/E/T(f)subscript𝑆𝐴𝐸𝑇𝑓S_{A/E/T}(f)italic_S start_POSTSUBSCRIPT italic_A / italic_E / italic_T end_POSTSUBSCRIPT ( italic_f ) are the spectral densities of the noises in the A𝐴Aitalic_A, E𝐸Eitalic_E, T𝑇Titalic_T combinations respectively. All these quantities have been defined in the previous sections.

IV Bayesian methodology and likelihood formulation

The Bayesian framework provides a powerful approach to the problem we aim to solve by treating the detection and parameter estimation of GW signals, instrumental artefacts and noise as an inference problem. In this context, the observed data, d(t)𝑑𝑡d(t)italic_d ( italic_t ), is modelled as the sum of contributions from the GW signal, h(t,θ)𝑡𝜃h(t,\theta)italic_h ( italic_t , italic_θ ), the glitch, g(t,ϕ)𝑔𝑡italic-ϕg(t,\phi)italic_g ( italic_t , italic_ϕ ), and the noise, n(t,γ)𝑛𝑡𝛾n(t,\gamma)italic_n ( italic_t , italic_γ ):

d(t,θ,ϕ,λ)=h(t,θ)+g(t,ϕ)+n(t,γ),𝑑𝑡𝜃italic-ϕ𝜆𝑡𝜃𝑔𝑡italic-ϕ𝑛𝑡𝛾d(t,\theta,\phi,\lambda)=h(t,\theta)+g(t,\phi)+n(t,\gamma),italic_d ( italic_t , italic_θ , italic_ϕ , italic_λ ) = italic_h ( italic_t , italic_θ ) + italic_g ( italic_t , italic_ϕ ) + italic_n ( italic_t , italic_γ ) , (22)

where θ,ϕ𝜃italic-ϕ\theta,\phiitalic_θ , italic_ϕ and γ𝛾\gammaitalic_γ represent the parameters of the GW signal, the glitch and the noise model, respectively. Denoting by μ=(θ,ϕ,γ)𝜇𝜃italic-ϕ𝛾\mu=(\theta,\phi,\gamma)italic_μ = ( italic_θ , italic_ϕ , italic_γ ) the set of all parameters, we write p(μ|d,M)𝑝conditional𝜇𝑑𝑀p(\mu|d,M)italic_p ( italic_μ | italic_d , italic_M ) the posterior distribution of the parameters. We note M𝑀Mitalic_M the assumed model, which includes M1subscript𝑀1M_{1}italic_M start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, our chosen shapelet model for the glitches, M2subscript𝑀2M_{2}italic_M start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, our model for the MBHB signal, and M3subscript𝑀3M_{3}italic_M start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT, our model for the instrumental noise. The posterior distribution p(μ|d,Mp(\mu|d,Mitalic_p ( italic_μ | italic_d , italic_M) is computed using Bayes’ theorem Sivia and Skilling (2006); Karnesis et al. (2023):

p(μ|d,M)=p(d|μ,M)p(μ|M)p(d|M).𝑝conditional𝜇𝑑𝑀𝑝conditional𝑑𝜇𝑀𝑝conditional𝜇𝑀𝑝conditional𝑑𝑀p(\mu|d,M)=\frac{p(d|\mu,M)\,p(\mu|M)}{p(d|M)}.italic_p ( italic_μ | italic_d , italic_M ) = divide start_ARG italic_p ( italic_d | italic_μ , italic_M ) italic_p ( italic_μ | italic_M ) end_ARG start_ARG italic_p ( italic_d | italic_M ) end_ARG . (23)

where:

  • p(d|μ,M)𝑝conditional𝑑𝜇𝑀p(d|\mu,M)italic_p ( italic_d | italic_μ , italic_M ) is the likelihood, quantifying the probability of observing data d𝑑ditalic_d under the models M1subscript𝑀1M_{1}italic_M start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, M2subscript𝑀2M_{2}italic_M start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and M3subscript𝑀3M_{3}italic_M start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT with parameters θ,ϕ𝜃italic-ϕ\theta,\phiitalic_θ , italic_ϕ, and γ𝛾\gammaitalic_γ;

  • p(μ|M)𝑝conditional𝜇𝑀p(\mu|M)italic_p ( italic_μ | italic_M ) is the prior distribution for the GW, glitch and noise parameters under the models M1subscript𝑀1M_{1}italic_M start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, M2subscript𝑀2M_{2}italic_M start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and M3subscript𝑀3M_{3}italic_M start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT: p(θ|M1)𝑝conditional𝜃subscript𝑀1p(\theta|M_{1})italic_p ( italic_θ | italic_M start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ), p(ϕ|M2)𝑝conditionalitalic-ϕsubscript𝑀2p(\phi|M_{2})italic_p ( italic_ϕ | italic_M start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) and p(γ|M3)𝑝conditional𝛾subscript𝑀3p(\gamma|M_{3})italic_p ( italic_γ | italic_M start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT );

  • p(d|M)𝑝conditional𝑑𝑀p(d|M)italic_p ( italic_d | italic_M ) is the evidence, which acts as a normalization constant for parameter estimation, but can also be used to compare different models M𝑀Mitalic_M.

For the problem at hand, we will use the Whittle likelihood, for which the log-likelihood in the frequency domain is:

log=12f[\displaystyle\log\mathcal{L}=-\frac{1}{2}\sum_{f}\Bigg{[}roman_log caligraphic_L = - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT [ 4Δf|d~(f)h~(f,θ)g~(f,ϕ)|2S(f,γ)4Δ𝑓superscript~𝑑𝑓~𝑓𝜃~𝑔𝑓italic-ϕ2𝑆𝑓𝛾\displaystyle 4\Delta f\frac{\left|\tilde{d}(f)-\tilde{h}(f,\theta)-\tilde{g}(% f,\phi)\right|^{2}}{S(f,\gamma)}4 roman_Δ italic_f divide start_ARG | over~ start_ARG italic_d end_ARG ( italic_f ) - over~ start_ARG italic_h end_ARG ( italic_f , italic_θ ) - over~ start_ARG italic_g end_ARG ( italic_f , italic_ϕ ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_S ( italic_f , italic_γ ) end_ARG
+logS(f,γ)],\displaystyle+\log S(f,\gamma)\Bigg{]},+ roman_log italic_S ( italic_f , italic_γ ) ] , (24)

where d~(f),h~(f,θ)~𝑑𝑓~𝑓𝜃\tilde{d}(f),\tilde{h}(f,\theta)over~ start_ARG italic_d end_ARG ( italic_f ) , over~ start_ARG italic_h end_ARG ( italic_f , italic_θ ) and g~(f,ϕ)~𝑔𝑓italic-ϕ\tilde{g}(f,\phi)over~ start_ARG italic_g end_ARG ( italic_f , italic_ϕ ) are the Fourier transforms of the same quantities appearing in Eq. 22; S(f,γ)𝑆𝑓𝛾S(f,\gamma)italic_S ( italic_f , italic_γ ) is the one-sided noise PSD, parameterized by γ𝛾\gammaitalic_γ; and ΔfΔ𝑓\Delta froman_Δ italic_f is the frequency resolution which is equal to 1/Tobs=1/(Ndt)1subscript𝑇obs1𝑁𝑑𝑡1/T_{\rm obs}=1/(Ndt)1 / italic_T start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT = 1 / ( italic_N italic_d italic_t ), with Tobssubscript𝑇obsT_{\rm obs}italic_T start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT the observation time, N𝑁Nitalic_N the length of the time series and dt𝑑𝑡dtitalic_d italic_t the inverse of the sampling rate.

The posterior can be sampled using advanced Bayesian inference techniques such as Markov Chain Monte Carlo (MCMC) Metropolis et al. (1953); Hastings (1970); Hitchcock (2003); Gilks et al. (1995). In this analysis, we wish to fit simultaneously an unknown number of glitches. Therefore, we use a Reversible Jump MCMC (RJ-MCMC) Green (1995) algorithm, a generalization of the Metropolis–Hastings algorithm that allows to vary the dimensionsality of the parameter space while sampling over the parameters. We use the RJ-MCMC implementation of Eryn Karnesis et al. (2023). Several walkers are run in parallel and exchange information periodically in order to increase the sampling efficiency. We also use parallel tempering Swendsen and Wang (1986); Hukushima and Nemoto (1996); Vousden et al. (2015) to improve the exploration of multi-modalities in the posterior distribution. During the run, a new point μsuperscript𝜇\mu^{\prime}italic_μ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is proposed from a point μ𝜇\muitalic_μ according to a proposal function, q(μ|μ)𝑞conditionalsuperscript𝜇𝜇q(\mu^{\prime}|\mu)italic_q ( italic_μ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_μ ), and is accepted with probability:

α=min(1,p(μ|d,M)q(μ|μ)p(μ|d,M)q(μ|μ)).𝛼1𝑝conditionalsuperscript𝜇𝑑𝑀𝑞conditional𝜇superscript𝜇𝑝conditional𝜇𝑑𝑀𝑞conditionalsuperscript𝜇𝜇\alpha=\min\left(1,\frac{p(\mu^{\prime}|d,M)\,q(\mu|\mu^{\prime})}{p(\mu|d,M)% \,q(\mu^{\prime}|\mu)}\right).italic_α = roman_min ( 1 , divide start_ARG italic_p ( italic_μ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_d , italic_M ) italic_q ( italic_μ | italic_μ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_ARG start_ARG italic_p ( italic_μ | italic_d , italic_M ) italic_q ( italic_μ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_μ ) end_ARG ) . (25)

In our analysis, we use different proposals for the parameters of the MBHB (θ𝜃\thetaitalic_θ), of the intrumental noise (γ𝛾\gammaitalic_γ) and of the glitches (ϕitalic-ϕ\phiitalic_ϕ). They are updated successively, using the Gibbs sampling implementation of Eryn Karnesis et al. (2023). In the Eryn ‘language’, branches represent different types of models used to fit the data. Each branch contains leaves, which correspond to individual instances of each model. In our case, leaves represent individual glitches, massive black hole binaries, or noise components, which form different branches Karnesis et al. (2023).

IV.1 Moves

In an RJ-MCMC sampler, there are two types of moves:

  • In-model moves: changes are made to the parameters, keeping the number of leaves in each branch fixed.

  • Out-of-model moves: the dimensionality is varied by adding or removing leaves, in one or several branches. For instance, in our case, adding or removing glitches.

In this work, only the number of glitches is varied, so that the latter type of moves applies only to their parameters. Detailed explanations of the various moves available in Eryn, as well as the most commonly used ones in the literature, can be found in Katz ; Karnesis et al. (2023). Here, we focus on the in-model moves specifically developed for glitch searches, MBHB parameter estimation, and noise parameter estimation. For the out-of-model move for glitches, we propose new parameters from the prior distribution during the search phase. Instead, during the final parameter estimation phase, as explained in the upcoming section, we propose sampling from a Gaussian distribution, whose covariance matrix is constructed from an initial estimate of the parameters of the detected glitches during the search phase.

IV.1.1 Glitch search and parameter estimation (in-model moves)

Group stretch move

In the stretch move new samples are generated by leveraging the distribution of points in the ensemble, using the formula:

Xk=Xj+z(XkXj),subscript𝑋𝑘subscript𝑋𝑗𝑧subscript𝑋𝑘subscript𝑋𝑗X_{k}=X_{j}+z(X_{k}-X_{j}),italic_X start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_z ( italic_X start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) , (26)

where Xksubscript𝑋𝑘X_{k}italic_X start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is the current sample, Xjsubscript𝑋𝑗X_{j}italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is another point in the ensemble, and z𝑧zitalic_z is a scaling factor drawn from a predefined distribution. In RJ-MCMC, however, the variable dimensionality of models complicates the direct use of ensemble-based proposals. To deal with the variable dimensionality of models the Group Stretch Move is used Karnesis et al. (2023) which builds upon the affine-invariant stretch proposal Goodman and Weare (2010). This group introduces ‘friends’, a static subset of reference points that are defined after a first burn of the sampler and as soon as the group move is used. These friends simulate the posterior distribution for a given model and remain fixed throughout multiple iterations. For instance, in our case, we group the estimated glitches into ‘friend groups’ according to their mean time value. The goal is to set Xjsubscript𝑋𝑗X_{j}italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT properly for each posterior mode such that Xjsubscript𝑋𝑗X_{j}italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and Xksubscript𝑋𝑘X_{k}italic_X start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT are always found on the same mode of the posterior. This approach improves efficiency by ensuring that proposed points correspond to the posterior distribution of the currently sampled glitch, rather than a different one.
The stretch move is used 50%percent\%% of the time, and in the implementation of this move, the set of ‘friends’ associated with each glitch is tracked, to improve the efficiency of the RJ-MCMC to add and remove glitches Karnesis et al. (2023); Katz and Roberts (2023).

Selected Covariance Group Move

The ‘Selected Covariance Group Move’ replaces the standard Gaussian move. In the context of nested models, the posterior distribution may display distinct modes, corresponding to different glitches in the data. This complexity makes it challenging to efficiently tune covariance matrices for each mode. The core idea behind this move is to group glitches based on their mean injection time values and compute corresponding distributions for glitch parameters using a Gaussian Mixture Model (GMM). This approach leverages covariance matrices specific to each glitch group. When glitches are clearly distinguishable, the group proposal becomes highly efficient, aided by the branch supplemental object in Eryn Katz et al. (2023) to maintain the association between glitches and their respective covariance matrices444The branch supplemental object tracks the friends’ information associated with each glitch. This is particularly needed in the presence of RJ-MCMC and parallel tempering, where we have swaps in temperature or birth/death of new glitches.. Note that we leverage the “fix friends” function from Eryn when a proposed birth of a new glitch is accepted. This function assigns a covariance matrix (or friends) to the new glitch from the precomputed friends group.

The computation(-update) of the covariance matrices for this move is done as follows:

  1. 1.

    Cluster specification: every 500 iterations, the fraction of samples with zero leaves is computed with respect to the total number of leaves Karnesis et al. (2023). If this fraction is less than 0.5, then a GMM is initialised with several modes equal to the maximum allowed number of glitch leaves, allowing it to model complex distributions as a combination of multiple Gaussian distributions. We assume that each component has a distinct, full-rank covariance matrix. So each distinct glitch-cluster estimated from the samples corresponds to a potential glitch, with the total number of possible clusters constrained by the maximum number of glitches allowed by the problem’s dimensionality.

  2. 2.

    Covariance Calculation: based on the identified number of clusters, a GMM is fitted to the data. The GMM provides the mean and covariance for each cluster, capturing the local structure of the posterior around each glitch555In a GMM, each Gaussian component does not necessarily correspond to an independent cluster. If a posterior distribution is not very Gaussian in shape, more than one component may be needed to approximate it. Therefore, identifying a GMM component is not the same as identifying a true cluster represented by the posterior of a single glitch. However, for the problem analyzed here, we have verified that the glitch posteriors are sufficiently Gaussian and well-separated, so our assumptions remain valid. Dividing a true cluster into more sub-clusters than are present in the data leads to proposal steps that are smaller than necessary. This can result in a high acceptance rate but a low effective sample size, meaning the sampler requires more steps to converge. To improve sampling efficiency, this proposal method should be refined in future work. One development we have already begun testing is using x-means instead of k-means to automatically determine the optimal number of GMM components. . These parameters are then utilized in the in-model move, enabling it to propose moves that are well-aligned with the posterior’s structure. This approach ensures the sampler’s efficiency, particularly when the shape of the posterior varies significantly between glitches, warranting the use of distinct covariance matrices for each glitch. By iteratively estimating the covariance matrices from the samples via a GMM, and adapting the sampler to the posterior structure, this methodology establishes a robust framework to be used in case of glitch searches. Note that, adaptation is applied for the first 50000 iterations and then stopped to preserve detailed balance. The covariance group move is used 50%percent\%% of the time.

IV.1.2 Glitch search and parameter estimation (out-of model moves)

During the parameter estimation phase, we incorporate a move proposal mechanism based on detected glitches. Specifically, 70%percent7070\%70 % of the time, the glitch parameters are drawn from a mixture of independent Gaussian distributions describing the shapelet parameter τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, ln(A0)𝑙𝑛subscript𝐴0ln(A_{0})italic_l italic_n ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. Each parameter is sampled from a multivariate normal distribution with the corresponding mean and covariance of the previously detected shapelets. We propose from the prior distribution the other 30%percent3030\%30 % of the time.

IV.1.3 Noise parameter estimation (in-model moves)

We consider both the stretch and Gaussian proposal moves Katz . Initially, the Gaussian move uses a diagonal covariance matrix with small entries (following a suggestion from the Eryn tutorial). Then, the covariance matrix is re-estimated using the noise samples every 500 iterations and the adaptation is done for the first 50000 iterations to preserve detailed balance. The stretch move is used 85%percent\%% of the time while the Gaussian move 15%percent\%%.

IV.1.4 Massive black hole binaries parameter estimation (in-model moves)

Following Ref. Katz (2022), we use a stretch proposal described above, which is used for approximately 80%percent8080\%80 % of proposals, and a sky-mode hopping proposal. We also add a Gaussian move proposal used 5%percent55\%5 % of the time with a small covariance matrix. This matrix is then recursively estimated from the MBHB samples with the same cadence used to estimate the noise covariance matrix. We notice that this additional move helps the convergence of the algorithm in possible cases where Xksubscript𝑋𝑘X_{k}italic_X start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT and Xjsubscript𝑋𝑗X_{j}italic_X start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (Eq. 26) are very close in value as they will spread out slowly otherwise if we only use the stretch move. Instead, the covariance Gaussian move will speed this up.

Regarding the sky-mode proposal, when sampling in the LISA reference frame Marsat et al. (2021), as we do here, there exist eight distinct sky modes in terms of the extrinsic parameters. These consist of four longitudinal modes at:

{λd+(0,1,2,3)π/2,ψ+(0,1,2,3)π/2}subscript𝜆𝑑0123𝜋2𝜓0123𝜋2\{\lambda_{d}+(0,1,2,3)\pi/2,\psi+(0,1,2,3)\pi/2\}{ italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT + ( 0 , 1 , 2 , 3 ) italic_π / 2 , italic_ψ + ( 0 , 1 , 2 , 3 ) italic_π / 2 }

and two latitudinal modes:

{±βd,±cosι,±cosψ}.plus-or-minussubscript𝛽𝑑plus-or-minus𝜄plus-or-minus𝜓\{\pm\beta_{d},\pm\cos\iota,\pm\cos\psi\}.{ ± italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT , ± roman_cos italic_ι , ± roman_cos italic_ψ } .

Three varieties of this proposal are used:

  • One is to move the walker to a mode drawn randomly with replacement from all eight sky modes. This is used for approximately 4%percent44\%4 % of proposals.

  • Proposals to change just the longitudinal or latitudinal mode are used approximately 3%percent33\%3 % and 8%percent88\%8 % of the time, respectively.

The relative percentages of the sky-mode hopping proposals reflect that, at higher frequencies, the eight modes reduce to just the two latitudinal modes at the proper ecliptic longitude. In this situation, longitudinal mode-hopping proposals are not efficient.

V Testing the reversible jump with simulated data

To test our pipeline, we generate mock data by injecting glitches—extracted from the LPF catalogue Armano et al. (2022b) and shown in Fig. 1—into a noise-only dataset. For this test, we do not include any astrophysical signals, as our goal is to investigate the ability of the RJ-MCMC to identify glitches of varying shapes and SNRs, and to distinguish them from noise in order to establish an approximate threshold below which such glitches are consistent with instrumental noise. In Section VI, we also quantify the impact of three fiducial glitches—with low, medium, and high SNRs of 21, 72, and 1544, respectively—on the parameter estimation of an MBHB.

V.1 Data generation

We generate 30 days of LISA data where seven glitches are injected according to the amplitudes and damping times extracted from the LPF catalogue Armano et al. (2022b), and are randomly distributed over these 30 days. In LPF the glitches were observed to have a Poisson distribution with about one glitch per test mass per day. We use a slightly lower rate here to simplify the computational burden as we do not believe this will affect our results qualitatively. Moreover, we allow also for glitches injected very close in time to show the ability of the methodology to distinguish between such glitches. Table 1 reports the parameters of the injected glitches as well as the corresponding number indicating the position of the glitch in the LPF catalogue (the run index is in accordance with LPF numbering).
We model the injected glitches using the single exponential model reported in section III Eq. III.2, but we use shapelets with n0=1subscript𝑛01n_{0}=1italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1 to fit for them.

Refer to caption
Figure 3: Amplitude spectral density of the TDI channels A,E𝐴𝐸A,Eitalic_A , italic_E and T𝑇Titalic_T when seven glitches are injected on TM12𝑇subscript𝑀12TM_{12}italic_T italic_M start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT at different times.

We simulate LISA noise directly in the frequency domain following Muratore et al. (2024). Assuming that the real-time series n(t)𝑛𝑡n(t)italic_n ( italic_t ) follows a stationary, zero-mean, Gaussian and ergodic random process, the Fourier transform of the noise, n~(f)~𝑛𝑓\tilde{n}(f)over~ start_ARG italic_n end_ARG ( italic_f ) is normally distributed with zero-mean and variance given by the one-sided power spectral density:

<n~(f)n~(f)>expectationsuperscript~𝑛superscript𝑓~𝑛𝑓\displaystyle<\tilde{n}^{*}(f^{\prime})\tilde{n}(f)>< over~ start_ARG italic_n end_ARG start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( italic_f start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) over~ start_ARG italic_n end_ARG ( italic_f ) > =12Sn(f)δ(ff),absent12subscript𝑆𝑛𝑓𝛿𝑓superscript𝑓\displaystyle=\frac{1}{2}S_{n}(f)\delta(f-f^{\prime}),= divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_f ) italic_δ ( italic_f - italic_f start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , (27a)
<n~(f)n~(f)>expectation~𝑛superscript𝑓~𝑛𝑓\displaystyle<\tilde{n}(f^{\prime})\tilde{n}(f)>< over~ start_ARG italic_n end_ARG ( italic_f start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) over~ start_ARG italic_n end_ARG ( italic_f ) > =0,absent0\displaystyle=0,= 0 , (27b)

where <>absent<>< > denotes the expectation value over the data-generating process. For a finite grid of frequencies, defining nk=n~(fk)subscript𝑛𝑘~𝑛subscript𝑓𝑘n_{k}=\tilde{n}(f_{k})italic_n start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = over~ start_ARG italic_n end_ARG ( italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ), Eq. (27a) can be written as

<n~kn~j>expectationsubscriptsuperscript~𝑛𝑘subscript~𝑛𝑗\displaystyle<\tilde{n}^{*}_{k}\tilde{n}_{j}>< over~ start_ARG italic_n end_ARG start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT over~ start_ARG italic_n end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > =T2Sn(fk)δjk.absent𝑇2subscript𝑆𝑛subscript𝑓𝑘subscript𝛿𝑗𝑘\displaystyle=\frac{T}{2}S_{n}(f_{k})\delta_{jk}.= divide start_ARG italic_T end_ARG start_ARG 2 end_ARG italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) italic_δ start_POSTSUBSCRIPT italic_j italic_k end_POSTSUBSCRIPT . (28)

Therefore the variance of the real and imaginary parts of the noise is given by (where the real and imaginary part are uncorrelated):

<[n~k]2>=<[n~k]2>=T4Sn(fk)=σk2.<\Re[\tilde{n}_{k}]^{2}>=<\Im[\tilde{n}_{k}]^{2}>=\frac{T}{4}S_{n}(f_{k})=% \sigma_{k}^{2}\,.< roman_ℜ [ over~ start_ARG italic_n end_ARG start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT > = < roman_ℑ [ over~ start_ARG italic_n end_ARG start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT > = divide start_ARG italic_T end_ARG start_ARG 4 end_ARG italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) = italic_σ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (29)

The noise in the Fourier domain is then generated as:

n~k=σk(Xreal+iXimag),subscript~𝑛𝑘subscript𝜎𝑘subscript𝑋real𝑖subscript𝑋imag\tilde{n}_{k}=\sigma_{k}\cdot\left(X_{\text{real}}+iX_{\text{imag}}\right),over~ start_ARG italic_n end_ARG start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = italic_σ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⋅ ( italic_X start_POSTSUBSCRIPT real end_POSTSUBSCRIPT + italic_i italic_X start_POSTSUBSCRIPT imag end_POSTSUBSCRIPT ) , (30)

where Xreal𝒩(0,1)similar-tosubscript𝑋real𝒩01X_{\text{real}}\sim\mathcal{N}(0,1)italic_X start_POSTSUBSCRIPT real end_POSTSUBSCRIPT ∼ caligraphic_N ( 0 , 1 ) and Ximag𝒩(0,1)similar-tosubscript𝑋imag𝒩01X_{\text{imag}}\sim\mathcal{N}(0,1)italic_X start_POSTSUBSCRIPT imag end_POSTSUBSCRIPT ∼ caligraphic_N ( 0 , 1 ) are independent normally distributed random variables with mean 0 and variance 1.

For this example the noise PSD is given by the sum of the TM acceleration and OMS noises which follow the models described in section III.5. The assumed value for the TM acceleration noise is NTM2=(2.4×1015ms2Hz1/2)2superscriptsubscript𝑁𝑇𝑀2superscript2.4superscript1015msuperscripts2superscriptHz122N_{TM}^{2}=\left(2.4\times 10^{-15}~{}\textrm{m}\,\textrm{s}^{-2}\textrm{Hz}^{% -1/2}\right)^{2}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( 2.4 × 10 start_POSTSUPERSCRIPT - 15 end_POSTSUPERSCRIPT m s start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT Hz start_POSTSUPERSCRIPT - 1 / 2 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT and for the OMS noise NOMS2=(7.9pm/Hz)2superscriptsubscript𝑁𝑂𝑀𝑆2superscript7.9pmHz2N_{OMS}^{2}=\left(7.9~{}\textrm{pm}/\sqrt{\textrm{Hz}}\right)^{2}italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( 7.9 pm / square-root start_ARG Hz end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT.

Figure 3 shows the Amplitude Spectral Density (ASD) in the TDI channels resulting from the injected glitches. The channels A𝐴Aitalic_A and E𝐸Eitalic_E show an excess of power with respect to the reference noise model in nominal conditions, shown in black. This excess of power is mild in the null channel T𝑇Titalic_T, as anticipated in Muratore (2021). The reason is that for null channels, acceleration-glitches have a mild effect due to the OMS and TM acceleration noise transfer functions of these channels. In particular, the latter shows a suppression at low frequencies with respect to the OMS such that the OMS noise dominates for these channels at all frequencies Muratore et al. (2023).

Glitch 𝒕𝟎subscript𝒕0t_{0}bold_italic_t start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT [𝒔]delimited-[]𝒔[s]bold_[ bold_italic_s bold_] 𝐥𝐧(𝚫𝒗/[𝒎𝒔𝟐])𝚫𝒗delimited-[]𝒎superscript𝒔2\ln(\Delta v/[ms^{-2}])bold_ln bold_( bold_Δ bold_italic_v bold_/ bold_[ bold_italic_m bold_italic_s start_POSTSUPERSCRIPT bold_- bold_2 end_POSTSUPERSCRIPT bold_] bold_) 𝝉𝝉\taubold_italic_τ [𝒔]delimited-[]𝒔[s]bold_[ bold_italic_s bold_] SNR
#3 2.97×1052.97superscript1052.97\times 10^{5}2.97 × 10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT 25.525.5-25.5- 25.5 184.47184.47184.47184.47 72727272
#118 7.50×1057.50superscript1057.50\times 10^{5}7.50 × 10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT 25.525.5-25.5- 25.5 4.234.234.234.23 230230230230
#90 1.25×1061.25superscript1061.25\times 10^{6}1.25 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT 24.424.4-24.4- 24.4 43.2143.2143.2143.21 479479479479
#72 1.75×1061.75superscript1061.75\times 10^{6}1.75 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT 24.524.5-24.5- 24.5 1.861.861.861.86 637637637637
#179 2.20×1062.20superscript1062.20\times 10^{6}2.20 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT 23.923.9-23.9- 23.9 4.774.774.774.77 1172117211721172
#248 2.25×1062.25superscript1062.25\times 10^{6}2.25 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT 23.623.6-23.6- 23.6 13.9613.9613.9613.96 1544154415441544
#41 2.50×1062.50superscript1062.50\times 10^{6}2.50 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT 28.328.3-28.3- 28.3 5.575.575.575.57 21212121
Table 1: Parameters t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, ln(Δv)Δ𝑣\ln(\Delta v)roman_ln ( roman_Δ italic_v ), τ𝜏\tauitalic_τ and SNR for each injected glitch. For each glitch we also indicate the corresponding number (##\##) from the LPF catalogue.

V.2 Data analysis

We analyze a complete 30 days dataset in a single run. Using the shapelet model with n0=1subscript𝑛01n_{0}=1italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1, we fit for glitches allowing up to 10 glitches in the data, also including the possibility of no glitches. The noise shape is considered fixed, while the two amplitude parameters of the OMS and TM noise, NOMSsubscript𝑁𝑂𝑀𝑆N_{OMS}italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT and NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT, vary within the uniform prior range specified in Table 2. We also use uniform priors for the time of injection (τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) and the decay time (β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) of the shapelet, whereas for the amplitude (A0subscript𝐴0A_{0}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) of the shapelet we consider a uniform prior in natural log amplitude. In this way, we avoid sampling parameters with vastly different numerical scales—such as amplitudes on the order of 1012superscript101210^{-12}10 start_POSTSUPERSCRIPT - 12 end_POSTSUPERSCRIPT and injection times around 104106similar-tosuperscript104superscript10610^{4}\sim 10^{6}10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ∼ 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT. This choice tends to give more weight to low-SNR glitches. The goal here is to ensure that the sampler spends enough time in this region of parameter space, where most glitches occur. Figure 1 illustrates this relationship by showing the impulse, duration, and SNRs of the glitches in the LPF catalog. The analysis takes about 2 days using 70 walkers and 10 temperatures, running on a single A100 GPU. We expect that this timing can be improved in the future.

Parameter Lower Bound Upper Bound
τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] 0 2592000
ln(A0/[ms2])𝑙𝑛subscript𝐴0delimited-[]𝑚superscript𝑠2ln(A_{0}/[ms^{-2}])italic_l italic_n ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT / [ italic_m italic_s start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ] ) -35 -20
β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] 1 1e5
NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT [ms2Hz1/2𝑚superscript𝑠2𝐻superscript𝑧12ms^{-2}Hz^{-1/2}italic_m italic_s start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT italic_H italic_z start_POSTSUPERSCRIPT - 1 / 2 end_POSTSUPERSCRIPT] (7e-12) (8e-12)
NOMSsubscript𝑁𝑂𝑀𝑆N_{OMS}italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT [mHz1/2𝑚𝐻superscript𝑧12mHz^{-1/2}italic_m italic_H italic_z start_POSTSUPERSCRIPT - 1 / 2 end_POSTSUPERSCRIPT] (2e-15) (3e-15)
Table 2: Table of priors used for shapelet and noise parameters.

Figure 4 reports the posterior for the parameters of the shapelets used to fit the injected glitches.

Refer to caption
Refer to caption
Figure 4: Upper plot: Posterior distributions of the shapelet parameters τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, ln(A0)subscript𝐴0\ln(A_{0})roman_ln ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ), and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT (plotted as log(β)𝛽\log(\beta)roman_log ( italic_β ) for better visualisation) for the seven injected glitches, fitted using first-order shapelets. Lower plot: Zoom-in on the 2D posteriors of ln(A0)subscript𝐴0\ln(A_{0})roman_ln ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT (plotted as log(β0)subscript𝛽0\log(\beta_{0})roman_log ( italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT )), overlaid with SNR contours of the injected glitches.

The parameters of the injected glitches (Table 1) — converted to shapelet parameters according to τ0=t0subscript𝜏0subscript𝑡0\tau_{0}=t_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, A0=Δv2β0subscript𝐴0Δ𝑣2subscript𝛽0A_{0}=\frac{\Delta v}{2\beta_{0}}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = divide start_ARG roman_Δ italic_v end_ARG start_ARG 2 italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG and β0=τsubscript𝛽0𝜏\beta_{0}=\tauitalic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_τ — are indicated within the Figure.

Refer to caption
Figure 5: Posterior distribution of the amplitudes, NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT and NOMSsubscript𝑁𝑂𝑀𝑆N_{OMS}italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT, for the test mass acceleration noise and optical metrology system noise
Refer to caption
Figure 6: Amplitude spectral density of the TDI A,E𝐴𝐸A,Eitalic_A , italic_E and T𝑇Titalic_T corresponding to random draws from the posterior distribution of the amplitudes noise parameters of test mass acceleration noise NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT, and optical metrology noise NOMSsubscript𝑁𝑂𝑀𝑆N_{OMS}italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT, after glitch-fitting is performed

It can be observed in Fig. 4 that the parameters of the glitch with the lowest SNR (21) cannot be recovered, and that the posteriors of the two glitches that are close in time overlap with each other such that in the 1D plot we observe a fifth bigger Gaussian for the time of injection τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The failure to recover the low-SNR glitch is likely due to the fact that the search did not detect it, possibly because proposed points rarely fall within the correct region of parameter space. This issue is probably linked to the choice of a uniform prior on β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT (along with its lower amplitude), which turns out to be too broad relative to the τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT values for some of the injected glitches —despite this choice being consistent with the general LPF decay time distribution shown in Fig. 1. To verify this hypothesis, we repeated the analysis using a logarithmic prior on β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and we observed that the low-SNR glitch is better constrained666Additionally, to exclude the possibility that the low-SNR glitch is consistent with noise, we have performed another test in which we fixed the number of glitches to seven, initialized the sampler at the true parameters and allowed the sampler to adjust slightly. Once the parameters stabilized, we restarted the run with RJ turned on, allowing the sampler to remove glitches. The sampler did not remove any of the glitches once they were found, indicating that—at least under the specific settings used in this example—this low SNR glitch is not consistent with noise.. However, it is important to underline that the recovery of the two noise parameters is nonetheless unbiased, as shown in Fig. 5. Therefore, the recovered PSD is well-constrained (Fig. 6) and closely aligns with the PSD model used to simulate the data, as illustrated in Fig. 6. In the lower plot of Fig. 4, we also observe a larger number of fictitious low-amplitude glitches (“ghost glitches”) arising from random fluctuations in the stationary component of the noise. These glitches have SNRs 24less-than-or-similar-toabsent24\lesssim 24≲ 24. We also plot the glitches from LPF with SNR<15𝑆𝑁𝑅15SNR<15italic_S italic_N italic_R < 15 and 15<SNR<3015𝑆𝑁𝑅3015<SNR<3015 < italic_S italic_N italic_R < 30. Notably, the low-SNR glitch with SNR=21𝑆𝑁𝑅21SNR=21italic_S italic_N italic_R = 21, which was not fitted, lies within the region of the latter. This suggests that glitches with parameters in the same region as those glitches are consistent with noise, given our choice of prior.

Finally, Fig. 7 shows the number of estimated glitches in the data at zero temperature. Even though seven glitches were injected, the sampler prefers to have eight glitches, which means adding two “ghost glitches”, since one of the injected ones is not recovered. The key difference between the “ghost glitches” and the missing one is that the amplitudes of the former are so low that the associated sampled value of β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT have no impact on the likelihood, and so, if such signals have significant prior support, they will readily be added and removed by the RJ step of the sampler.
The amplitude of the missing glitch is small enough to evade recovery, but not so low that the associated τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT have no impact on the likelihood, thus glitches with a similar amplitude are less likely to be accepted. Therefore, only values of τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT in the right portion of the parameter space give high enough likelihood. To conclude, the addition of “ghost glitches” is the result of allowing for low-amplitude glitches in the prior, which can be used to fit random excursions in the instrumental noise. As these fictitious glitches do not absorb much of the power in the data, this does not impact noise estimation, but nonetheless there are techniques to mitigate this effect. One approach, used for example in Littenberg et al. (2020) in the context of including white dwarf binaries in the LISA Global Fit, is to use a stricter prior on the SNR of the white dwarf binaries that are proposed to be added.

To estimate the lowest SNR glitch that can be distinguished from noise, we considered different scenarios in which the #41#41\#41# 41 glitch, the low SNR one that was not recovered in our analysis, was replaced by other glitches of increasing SNRs, starting from SNR 20 to 65. We concluded that, for the noise model used in this study, glitches with an SNR of a few tens cannot be distinguished from noise. In the next section, we show that such glitches do not introduce significant biases in the parameter estimation of MBHB. Biases start to appear from SNRs of 60similar-toabsent60\sim 60∼ 60. This observation could also be used to define a prior for glitches based on their SNRs to help mitigate the confusion noise from the low-SNR sources discussed in the preceding paragraph.

Refer to caption
Figure 7: Distribution of the number of glitches at zero temperature estimated by the dynamical parameter estimation procedure. The true value is marked with a dashed black line.

VI Effect of glitches on parameter estimation of a Massive Black Holes Binary

We consider now three different cases where we inject three glitches with SNRs of 21, 72 and 1544 respectively. The first glitch corresponds to #41#41\#41# 41, the second to #3#3\#3# 3 and the third to #248#248\#248# 248 of the LPF catalogue. The parameters of these glitches are reported in the Table 1. We inject one glitch at a time, substituting it with another in each subsequent example. We simulate for all the three cases a fiducial MBHB signal at redshift 4, with SNR 1152, in noisy data. The choice of the MBHB is driven by Fig. 3.5 of Colpi et al. (2024) which shows the typical SNRs and redshift of MBHBs sources expected in LISA as well as Fig. 3 in Amaro-Seoane et al. (2017).

The MBHB priors used for this analysis are given in Table  3. Uniform distributions are used on parameters {ln(Mt),q,a1,a2,dL,φref,λd,ψ,tref}𝑙𝑛subscript𝑀𝑡𝑞subscript𝑎1subscript𝑎2subscript𝑑𝐿subscript𝜑𝑟𝑒𝑓subscript𝜆𝑑𝜓subscript𝑡𝑟𝑒𝑓\{ln(M_{t}),q,a_{1},a_{2},d_{L},\varphi_{ref},\lambda_{d},\psi,t_{ref}\}{ italic_l italic_n ( italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) , italic_q , italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT , italic_φ start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT , italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT , italic_ψ , italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT }. The prior on total mass is natural log-uniform. For the inclination, we use a uniform prior in cosι𝜄\cos\iotaroman_cos italic_ι between -1 and 1, while the prior on the ecliptic latitude is uniform in sinβdsubscript𝛽𝑑\sin\beta_{d}roman_sin italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT.

Parameter True Lower Bound Upper Bound
ln(Mt/M)𝑙𝑛subscript𝑀𝑡subscript𝑀direct-productln(M_{t}/M_{\odot})italic_l italic_n ( italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT / italic_M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT ) 14.48 ln(105superscript10510^{5}10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT) ln(108superscript10810^{8}10 start_POSTSUPERSCRIPT 8 end_POSTSUPERSCRIPT)
q𝑞qitalic_q 0.4628 0.01 0.999
a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT 0.7474 -1 1
a2subscript𝑎2a_{2}italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT 0.8388 -1 1
dL[Gpc]subscript𝑑𝐿delimited-[]𝐺𝑝𝑐d_{L}[Gpc]italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT [ italic_G italic_p italic_c ] 37 0.01 1000100010001000
φrefsubscript𝜑𝑟𝑒𝑓\varphi_{ref}italic_φ start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] 1.194 0 2π𝜋\piitalic_π
cosι𝜄\cos\iotaroman_cos italic_ι 0.5 -1 1
λdsubscript𝜆𝑑\lambda_{d}italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] 1.288 0 2 π𝜋\piitalic_π
sinβdsubscript𝛽𝑑\sin\beta_{d}roman_sin italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT -0.298 -1 1
ψ𝜓\psiitalic_ψ [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] π/6𝜋6\pi/6italic_π / 6 0 π𝜋\piitalic_π
tref[s]subscript𝑡𝑟𝑒𝑓delimited-[]𝑠t_{ref}[s]italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT [ italic_s ] 2.628×1062.628superscript1062.628\times 10^{6}2.628 × 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT trefsubscript𝑡𝑟𝑒𝑓t_{ref}italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT - 100 trefsubscript𝑡𝑟𝑒𝑓t_{ref}italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT + 100
Table 3: True values of the MBHB parameters together with the prior range used for parameter estimations. We assume a flat prior on the specified range. Values of the masses are given in the detector frame.

We choose the injection time for each of the three glitches to maximize the impact of the glitch on the MBHB parameter estimation, following the procedure described in Spadaro et al. (2023). The idea is to maximize the overlap between the glitch and the MBHB signal, defined as:

𝒪(h1,h2)=(h1|h2)(h1|h1)(h2|h2).𝒪subscript1subscript2conditionalsubscript1subscript2conditionalsubscript1subscript1conditionalsubscript2subscript2\mathcal{O}(h_{1},h_{2})=\frac{(h_{1}|h_{2})}{\sqrt{(h_{1}|h_{1})(h_{2}|h_{2})% }}.caligraphic_O ( italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = divide start_ARG ( italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT | italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG start_ARG square-root start_ARG ( italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT | italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ( italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT | italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG end_ARG . (31)

Here, h1subscript1h_{1}italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT stands for the MBHB signal and h2subscript2h_{2}italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT for the glitch. Defining ΔtΔ𝑡\Delta troman_Δ italic_t the difference between the MBHB merger time and the glitch injection time, we found that the best match for the glitch with SNR 21 is at |Δt|=6.80361Δ𝑡6.80361|\Delta t|=6.80361| roman_Δ italic_t | = 6.80361 minutes before the merger, for the glitch with SNR 72 it is at |Δt|=25.2338Δ𝑡25.2338|\Delta t|=25.2338| roman_Δ italic_t | = 25.2338 minutes before the merger, and for the glitch with SNR 1544 at |Δt|=11.5665Δ𝑡11.5665|\Delta t|=11.5665| roman_Δ italic_t | = 11.5665 minutes before the merger. Figure 8 shows the variation of the overlap versus time for the three glitches, where we have fixed the amplitudes and decay time of the glitches while varying the injection time. The dotted vertical lines indicate the injection times, with respect to the merger, at which the overlap is maximized for each of the three glitches. We show in Fig. 9 the three glitches injected at the maximum overlap time and the fiducial MBHB.

Note that the frequency of the MBHB evolves over time, and the maximum impact of a glitch is expected when its frequency overlaps with the instantaneous frequency of the MBHB signal. Acceleration glitches typically occur at frequencies on the order of 103superscript10310^{-3}10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT Hz, which aligns more closely with the early inspiral phase. However, if the MBHB merger occurred at lower frequencies—as in the case studied in Spadaro et al. (2023)—the glitch would instead have the strongest overlap during the merger phase.

Refer to caption
Figure 8: Overlap between the MBHB and glitch signals as a function of the glitch injection time relative to the merger, where the decay time and amplitude of the glitches are kept fixed. The red, blue, and green solid curves are for the three glitches with SNRs of 21, 72 and 1544. The GW signal is fixed to that of our fiducial MBHB. Dashed vertical lines with matching colours denote the times that maximize the overlap.
Refer to caption
Figure 9: Fiducial MBHB waveform along with the glitches injected at their maximum overlap time.

The posteriors shown in Fig. 10 and Fig. 11 illustrate the effect of the presence of these different glitches if we attempt to recover the parameters of this MBHB without simultaneously fitting for both the artefact and the GW source. For comparison, the posteriors of the MBHB parameters obtained from data containing no glitches are reported in both plots, and the true values of the parameters are indicated by dashed black line. In the case of the SNR 21 glitch, shown in Fig 10, the glitch has less impact in the intrinsic parameters with respect to the impact we see in the extrinsic ones, as the shift in the posterior is less than the width of the posterior. The bias starts to be significant for the glitch with SNR 72 for several of the parameters, e.g. the total mass, the reference phase at merger and the ecliptic longitude and latitude. For the SNR 1544 glitch shown in Fig. 11, the two posteriors are completely disjoint, with all parameters significantly biased. To help the reader visualise the effect of not fitting such a glitch, we report only the posteriors for the intrinsic parameters. It is important to specify that the noise model has been fixed for these analyses and we expect that variations in the noise model would allow to partially absorb the impact of these glitches making the MBHB posterior more consistent with the injected values.

Let us recall that the injection times of the glitches were chosen specifically to maximize their impact on the parameter estimation of the MBHB. Their effect is expected to be smaller if they occur during the early inspiral. This point is further discussed in Appendix B, where we present posteriors illustrating how the presence of the SNR 1544 glitch affects parameter estimation for the same MBHB, depending on its injection time. However, since the coalescence is the loudest part of MBHB signals and largely drives parameter estimation, failing to captures glitches occuring near to it would severely jeopardize the science objectives of LISA. For instance, biases in sky localization and distance—such as those shown in Fig. 11 when not accounting for the glitch—could prevent the detection of an electromagnetic counterpart and lead to misidentification of the host galaxy, thereby compromising cosmological measurements based on bright or dark siren methods Gair et al. (2023); Caprini and Tamanini (2016); Tamanini et al. (2016); Mangiagli et al. (2023). This issue also extends to applications in fundamental physics, which rely on detecting extremely small deviations from general relativity. In this context, even small biases can lead to false detections and must be mitigated as much as possible Kwok et al. (2022); Gupta et al. (2025). Our analysis shows that, to avoid such pitfalls, glitches with SNRs of several tens or higher will need to be included in the model fitted to the data. This threshold also corresponds to the point at which glitches first become identifiable in the data, as discussed in the previous paragraph.

Refer to caption
Figure 10: Posteriors of a MBHB with SNR 1522 compared to two cases: a glitch of SNR 21 is injected at the maximum match but not fitted, a glitch of SNR 72 is injected at the maximum match but not fitted.
Refer to caption
Figure 11: Posteriors of the intrinsic parameters of an MBHB with SNR 1152 in the case where a glitch of SNR 1544 is injected -and not fitted - at the time maximising its match with the injected signal. We have shifted the parameters relative to the injected values. The fact that the posterior do not peak at 0 illustrates the bias caused by the glitch. For the sake of clarity we do show the corresponding posteriors for the only-MBHB case, as the statistical error is much smaller than the systematic bias observed in the figure.

In Fig. 12, we now show the posterior on the parameters of the MBHB signal obtained when simultaneously fitting for the glitch, and compare this to the posterior obtained in the absence of the glitch. We observe that by fitting the glitch, the MBHB posteriors are no longer biased, and the posteriors for the (only-)MBHB case overlap with those where the glitch has been accounted for, indicating that the best strategy is to fit for both. For completeness, we also show posteriors for the fitted glitch parameters in Fig. 13.

To conclude the section, we also investigated the effect of injecting glitches (#41#41\#41# 41) and (#3#3\#3# 3) into MBHB with a SNR of 320, in order to reproduce the example studied in Spadaro et al. (2023). The posteriors plot are reported in the Appendix C. For the short-duration glitch with SNR 21 (#41#41\#41# 41), we find results consistent with those reported there: the inferred parameters remain largely unbiased. In contrast, glitch (#3#3\#3# 3), with an SNR of 72, though also short in duration, introduces a more significant bias.

Refer to caption
Figure 12: Posteriors for the parameters of an MBHB with SNR 1152 obtained from data containing only the MBHB, compared to those obtained from data containing a glitch of SNR 1544, when the latter is also fitted.
Refer to caption
Figure 13: Posteriors on the parameters of the first order shapelet used to fit the injected glitch (#248#248\#248# 248) with SNR 1544.

VII “(Light-)Spritz” data set

In this section we describe the “Spritz” data set and the strategy we have used to analyze it. We describe the mock data in section VII.1 and the noise model used for this analysis in section VII.2, since it differs from the noise that was used to generate and analyse the data in the previous section VI. Finally, in Section VIII, we present the methodology used for the initial search and parameter estimation of glitches, noise, and MBHBs, followed by the subsequent analysis of the full dataset to refine the posterior estimates of the noise, gravitational wave signal, and artifacts.

VII.1 Introduction

Full details of the “Spritz” data-sets can be found in LDC (2022). The “Spritz” data release comprised three main data sets, of which we focus on the first one, which contains a single loud MBHB and three glitches as visible in Fig. 14 which shows the dataset in the time-frequency domain, and Fig. 15 that shows it in the frequency domain. The injected glitches in “Spritz” are referred to as double decay exponential glitches Sala (2023). All the data sets are realized with TDI X,Y𝑋𝑌X,Yitalic_X , italic_Y and Z𝑍Zitalic_Z second generation assuming Keplerian orbits.

Refer to caption
Figure 14: Time-frequency representation of the “Light-Spritz” data set.
Refer to caption
Figure 15: Fourier domain representation of the “Light-Spritz” data set.

The dataset also features more realistic instrumental noise (see section VII.2), a background of Galactic binaries and data gaps. Auxiliary data included in the dataset allows individual components to be perfectly subtracted, allowing analysts to focus on selected aspects of the dataset. Since we focus on the detection of glitches and their impact on MBHB parameter estimation, we removed the gaps and Galactic binaries, and for this reason, we renamed the modified data set as “Light-Spritz”.
Furthermore, the version of BBHx we are using for this paper (version 0.0.0) does not support the response model used to generate the MBHB in “Spritz”. The “Spritz” data was generated using the PhenomD waveform model Husa et al. (2016); Khan et al. (2016), with the LISA response evaluated for Keplerian orbits (time varying arm-lengths), but our version of BBHx does not include time varying arm-lengths response. Therefore, to rule out any systematic errors from this waveform mismatch Dhani et al. (2024), we replace the original “Spritz” injected signal with one with the same parameters (see Table 3 in sec. VI), but generated using the PhenomHM waveform (instead of PhenomD) with an equal arm response. We plan to include a more realistic treatment of the LISA constellation to compute the response, in a follow-up study that will analyse the full data set.

Regarding glitch modelling, Ref. Muratore (2021) showed that the mismatch between equal- and unequal/time-varying-arm models does not significantly impact glitch parameter reconstruction. For the noise, however, using equal, unequal, or time-varying arm-length models affects the high-frequency range, where the zeros of the TDI appear. To avoid related issues, we restrict our analysis to frequencies up to 2×102 Hztimes2E-2hertz2\text{\times}{10}^{-2}\text{\,}\mathrm{Hz}start_ARG start_ARG 2 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 2 end_ARG end_ARG end_ARG start_ARG times end_ARG start_ARG roman_Hz end_ARG Hartwig and Muratore (2022b), which does not alter the outcome of our analysis. Additionally, the null channel T𝑇Titalic_T cannot be well approximated using the equal-arm-length model Muratore et al. (2022); Adams and Cornish (2010). For these reasons, we perform the analysis using only the two sensitive channels, A𝐴Aitalic_A and E𝐸Eitalic_E, employing second-generation TDI under the assumption of equal arm lengths, for which sensitivity to discrepancies in arm-length modelling is negligible. For consistency, the MBHB signal is also computed using second-generation TDI variables, assuming constant and equal arm lengths.

As discussed in Sec. IV, we analyse the data in the frequency domain. However, converting the time-domain data provided in “Light-Spritz” to the frequency domain by using a fast-Fourier transform (FFT) without any pre-processing causes significant artefacts in the form of spectral leakage Hartwig (2021), linked to significant high-frequency content that leaks into the low frequency part of the “Light-Spritz” data set. We mitigate this effect by applying a low-pass filter to the data before taking the FFT. Another approach would be to apply a window function to the data before taking the FFT. This has the downside of introducing correlations between adjacent frequency bins, potentially biasing the parameter estimation using our diagonal likelihood model. Instead, by filtering the data, we obtain a diagonal likelihood (see Appendix E). Concretely, we use a first-order Butterworth filter with a cut-off frequency of 1×103 Hztimes1E-3hertz1\text{\times}{10}^{-3}\text{\,}\mathrm{Hz}start_ARG start_ARG 1 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 3 end_ARG end_ARG end_ARG start_ARG times end_ARG start_ARG roman_Hz end_ARG using the scipy.signal.butter function and then use the scipy.signal.filtfilt function to apply the filter twice in opposite time directions, such that it has zero impact on the phase of the underlying data (i.e., it does not cause any time-shift of the data). Moreover, in order to not bias our analysis, we apply a frequency domain model of the filter response to the noise, glitches and GW signal templates used in the likelihood as well. Due to the double application of the filter, the Fourier transforms appearing in the likelihood are scaled by |H(f)|2superscript𝐻𝑓2|H(f)|^{2}| italic_H ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, whereas PSDs are scaled by |H(f)|4superscript𝐻𝑓4|H(f)|^{4}| italic_H ( italic_f ) | start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT, where H(f)𝐻𝑓H(f)italic_H ( italic_f ) is the complex filter transfer function:

log=absent\displaystyle\log\mathcal{L}=roman_log caligraphic_L =
12f[4Δf|d~(f)h~(f,θ)|H(f)|2g~(f,ϕ)|H(f)|2|2S(f,λ)|H(f)|4\displaystyle-\frac{1}{2}\sum_{f}\Bigg{[}4\Delta f\,\frac{\left|\tilde{d}(f)-% \tilde{h}(f,\theta)|H(f)|^{2}-\tilde{g}(f,\phi)|H(f)|^{2}\right|^{2}}{S(f,% \lambda)|H(f)|^{4}}- divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT [ 4 roman_Δ italic_f divide start_ARG | over~ start_ARG italic_d end_ARG ( italic_f ) - over~ start_ARG italic_h end_ARG ( italic_f , italic_θ ) | italic_H ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - over~ start_ARG italic_g end_ARG ( italic_f , italic_ϕ ) | italic_H ( italic_f ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_S ( italic_f , italic_λ ) | italic_H ( italic_f ) | start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT end_ARG
+log{S(f,λ)|H(f)|4}].\displaystyle\qquad\qquad+\log\left\{S(f,\lambda)|H(f)|^{4}\right\}\Bigg{]}.+ roman_log { italic_S ( italic_f , italic_λ ) | italic_H ( italic_f ) | start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT } ] . (32)

VII.2 Noise model

To study – and model – just the noise in the “Light-Spritz” dataset, we further removed the glitch and MBHB components described in section VII.1. We then estimated the spectra using the scipy.signal.welch method with the following parameters: window=(’kaiser’, 10), nperseg=10713, noverlap=5356, detrend=’linear’. The LDC manual for “Spritz” LDC (2020) suggests that laser, backlink, acceleration and readout noises were enabled in the simulation. However, we find that the corresponding noise model provided as part of the LDC software suite (working group (2022), v.1.2.4) does not perfectly describe the “Spritz” simulated data.

To accurately fit the noise of the data set, we cross-checked against the default values for the noise sources included in the instrument simulation at the time Bayle et al. (2022) and consequently included the following noise components, expressed as PSDs:

  • Inter-spacecraft interferometer OMS noise,
    SOMSISI=(NOMSISI)2superscriptsubscript𝑆OMSISIsuperscriptsuperscriptsubscript𝑁𝑂𝑀𝑆𝐼𝑆𝐼2S_{\mathrm{OMS}}^{\mathrm{ISI}}=\left(N_{OMS}^{ISI}\right)^{2}italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ISI end_POSTSUPERSCRIPT = ( italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I italic_S italic_I end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (white),
    with NOMSISI=6.35 pm/Hzsuperscriptsubscript𝑁𝑂𝑀𝑆𝐼𝑆𝐼times6.35pmHzN_{OMS}^{ISI}=$6.35\text{\,}\mathrm{p}\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I italic_S italic_I end_POSTSUPERSCRIPT = start_ARG 6.35 end_ARG start_ARG times end_ARG start_ARG roman_pm / square-root start_ARG roman_Hz end_ARG end_ARG

  • Reference interferometer OMS noise,
    SOMSRFI=(NOMSRSI)2superscriptsubscript𝑆OMSRFIsuperscriptsuperscriptsubscript𝑁𝑂𝑀𝑆𝑅𝑆𝐼2S_{\mathrm{OMS}}^{\mathrm{RFI}}=\left(N_{OMS}^{RSI}\right)^{2}italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI end_POSTSUPERSCRIPT = ( italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_R italic_S italic_I end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (white),
    with NOMSRSI=3.32 pm/Hzsuperscriptsubscript𝑁𝑂𝑀𝑆𝑅𝑆𝐼times3.32pmHzN_{OMS}^{RSI}=$3.32\text{\,}\mathrm{p}\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_R italic_S italic_I end_POSTSUPERSCRIPT = start_ARG 3.32 end_ARG start_ARG times end_ARG start_ARG roman_pm / square-root start_ARG roman_Hz end_ARG end_ARG

  • Test-mass interferometer OMS noise,
    SOMSTMI=(NOMSTMI)2superscriptsubscript𝑆OMSTMIsuperscriptsuperscriptsubscript𝑁𝑂𝑀𝑆𝑇𝑀𝐼2S_{\mathrm{OMS}}^{\mathrm{TMI}}=\left(N_{OMS}^{TMI}\right)^{2}italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TMI end_POSTSUPERSCRIPT = ( italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T italic_M italic_I end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (white),
    with NOMSTMI=1.42 pm/Hzsuperscriptsubscript𝑁𝑂𝑀𝑆𝑇𝑀𝐼times1.42pmHzN_{OMS}^{TMI}=$1.42\text{\,}\mathrm{p}\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_O italic_M italic_S end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T italic_M italic_I end_POSTSUPERSCRIPT = start_ARG 1.42 end_ARG start_ARG times end_ARG start_ARG roman_pm / square-root start_ARG roman_Hz end_ARG end_ARG

  • TM acceleration noise,
    STM=(NTM)2(1+(0.4×103f)2)subscript𝑆TMsuperscriptsubscript𝑁𝑇𝑀21superscript0.4superscript103𝑓2S_{\mathrm{TM}}=\left(N_{TM}\right)^{2}\cdot\left(1+\left(\frac{0.4\times 10^{% -3}}{f}\right)^{2}\right)italic_S start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT = ( italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ ( 1 + ( divide start_ARG 0.4 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT end_ARG start_ARG italic_f end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ),
    with NTM=2.4 fm/s/2HzN_{TM}=$2.4\text{\,}\mathrm{f}\mathrm{m}\mathrm{/}\mathrm{s}\mathrm{{}^{2}}% \mathrm{/}\sqrt{\mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT = start_ARG 2.4 end_ARG start_ARG times end_ARG start_ARG roman_fm / roman_s start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / square-root start_ARG roman_Hz end_ARG end_ARG

  • RFI and TMI backlink noise,
    SBLRFI/TMI=(NBLRFI/TMI)2(1+(2×103f)4)superscriptsubscript𝑆BLRFITMIsuperscriptsuperscriptsubscript𝑁𝐵𝐿RFITMI21superscript2superscript103𝑓4S_{\mathrm{BL}}^{\mathrm{RFI/TMI}}=\left(N_{BL}^{\mathrm{RFI/TMI}}\right)^{2}% \cdot\left(1+\left(\frac{2\times 10^{-3}}{f}\right)^{4}\right)italic_S start_POSTSUBSCRIPT roman_BL end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI / roman_TMI end_POSTSUPERSCRIPT = ( italic_N start_POSTSUBSCRIPT italic_B italic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI / roman_TMI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ ( 1 + ( divide start_ARG 2 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT end_ARG start_ARG italic_f end_ARG ) start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ),
    with NBLRFI/TMI=3 pm/Hzsuperscriptsubscript𝑁𝐵𝐿RFITMItimes3pmHzN_{BL}^{\mathrm{RFI/TMI}}=$3\text{\,}\mathrm{p}\mathrm{m}\mathrm{/}\sqrt{% \mathrm{Hz}}$italic_N start_POSTSUBSCRIPT italic_B italic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI / roman_TMI end_POSTSUPERSCRIPT = start_ARG 3 end_ARG start_ARG times end_ARG start_ARG roman_pm / square-root start_ARG roman_Hz end_ARG end_ARG.

For the analysis, we define the following combinations of the noise parameters to estimate the effective noise levels:

γ={N1[(NOMSISI)2+(NOMSRFI)2+(NBLRFI/TMI)2]1/2,N2[(NOMSTMI)2+(NBLRFI/TMI)2]1/2,NTM,𝛾casessubscript𝑁1superscriptdelimited-[]superscriptsuperscriptsubscript𝑁OMSISI2superscriptsuperscriptsubscript𝑁OMSRFI2superscriptsuperscriptsubscript𝑁BLRFITMI212otherwisesubscript𝑁2superscriptdelimited-[]superscriptsuperscriptsubscript𝑁OMSTMI2superscriptsuperscriptsubscript𝑁BLRFITMI212otherwisesubscript𝑁TMotherwise\gamma=\begin{cases}N_{1}\equiv\left[(N_{\mathrm{OMS}}^{\mathrm{ISI}})^{2}+(N_% {\mathrm{OMS}}^{\mathrm{RFI}})^{2}+(N_{\mathrm{BL}}^{\mathrm{RFI/TMI}})^{2}% \right]^{1/2},\\ N_{2}\equiv\left[(N_{\mathrm{OMS}}^{\mathrm{TMI}})^{2}+(N_{\mathrm{BL}}^{% \mathrm{RFI/TMI}})^{2}\right]^{1/2},\\ N_{\mathrm{TM}},\end{cases}italic_γ = { start_ROW start_CELL italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≡ [ ( italic_N start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ISI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_N start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_N start_POSTSUBSCRIPT roman_BL end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI / roman_TMI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] start_POSTSUPERSCRIPT 1 / 2 end_POSTSUPERSCRIPT , end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≡ [ ( italic_N start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TMI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_N start_POSTSUBSCRIPT roman_BL end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI / roman_TMI end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] start_POSTSUPERSCRIPT 1 / 2 end_POSTSUPERSCRIPT , end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_N start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT , end_CELL start_CELL end_CELL end_ROW (33)

with N1=3.32×1012subscript𝑁13.32superscript1012N_{1}=3.32\times 10^{-12}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 3.32 × 10 start_POSTSUPERSCRIPT - 12 end_POSTSUPERSCRIPT, N2=7.78×1012subscript𝑁27.78superscript1012N_{2}=7.78\times 10^{-12}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 7.78 × 10 start_POSTSUPERSCRIPT - 12 end_POSTSUPERSCRIPT, and NTM=2.4×1015subscript𝑁TM2.4superscript1015N_{\mathrm{TM}}=2.4\times 10^{-15}italic_N start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT = 2.4 × 10 start_POSTSUPERSCRIPT - 15 end_POSTSUPERSCRIPT. As in section III.5, we first express the noises as equivalent displacement noise in m2 Hz1timesmeter2hertz1{\mathrm{m}}^{2}\text{\,}{\mathrm{Hz}}^{-1}start_ARG power start_ARG roman_m end_ARG start_ARG 2 end_ARG end_ARG start_ARG times end_ARG start_ARG power start_ARG roman_Hz end_ARG start_ARG - 1 end_ARG end_ARG or acceleration noise in m/2s4/Hz\mathrm{m}\mathrm{{}^{2}}\mathrm{/}\mathrm{s}^{4}\mathrm{/}\mathrm{Hz}roman_m start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / roman_s start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT / roman_Hz. We then convert them to units of fractional frequency fluctuations as used in the LDC by applying the factors Cdispffd=(2πf/c)2superscriptsubscript𝐶dispffdsuperscript2𝜋𝑓𝑐2C_{\mathrm{disp}}^{\mathrm{ffd}}=\left({2\pi f}/{c}\right)^{2}italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT = ( 2 italic_π italic_f / italic_c ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT and Caccffd=(c2πf)2superscriptsubscript𝐶accffdsuperscript𝑐2𝜋𝑓2C_{\mathrm{acc}}^{\mathrm{ffd}}=\left({c2\pi f}\right)^{-2}italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT = ( italic_c 2 italic_π italic_f ) start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT, where c𝑐citalic_c is again the speed of light.

Fig. 16 shows the ratio between the estimated spectra (the PSD computed from the “Spritz” only-noise data), the model included with the LDC tools, and the model we used. We see artefacts around the zeros of the TDI transfer function for both plots, owing to the use of an equal arm approximation in the model we are using to fit the noise. For our model, we estimated the armlength to be used in the equal arm model as an average over all six time-varying armlengths, evaluating to L/c8.323s𝐿𝑐8.323𝑠L/c\approx 8.323sitalic_L / italic_c ≈ 8.323 italic_s. Whereas we used the default value set in the LDC tools for the LDC model. This causes slightly increased artefacts for the LDC model around the zeros. We also observe that the LDC model slightly underestimates the noise in the upper half of the frequency band.

Refer to caption
Figure 16: Comparison between the power spectral density of the “Spritz” noise data set with the noise model reported in the LDC manual.

To conclude, some of these noises have different TDI transfer functions, which are derived in Quang Nam et al. (2023b) for TDI X,Y,Z𝑋𝑌𝑍X,Y,Zitalic_X , italic_Y , italic_Z as well as A,E,T𝐴𝐸𝑇A,E,Titalic_A , italic_E , italic_T. For our work, we only need the transfer functions for A𝐴Aitalic_A and E𝐸Eitalic_E, which turn out to be identical under our assumptions. We summarize them here for convenience. They include a common TDI factor for second generation TDI:

Cxx=16sin2(2πfL)sin2(4πfL).subscript𝐶𝑥𝑥16superscript22𝜋𝑓𝐿superscript24𝜋𝑓𝐿C_{xx}=16\sin^{2}(2\pi fL)\sin^{2}(4\pi fL).italic_C start_POSTSUBSCRIPT italic_x italic_x end_POSTSUBSCRIPT = 16 roman_sin start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 2 italic_π italic_f italic_L ) roman_sin start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 4 italic_π italic_f italic_L ) . (34)

Using this, we have

|HISI/RFI (OMS)AA|2superscriptsuperscriptsubscript𝐻ISI/RFI (OMS)𝐴𝐴2\displaystyle|H_{\text{ISI/RFI (OMS)}}^{AA}|^{2}| italic_H start_POSTSUBSCRIPT ISI/RFI (OMS) end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT =2Cxx(2+cos(2πfL))absent2subscript𝐶𝑥𝑥22𝜋𝑓𝐿\displaystyle=2C_{xx}\left(2+\cos(2\pi fL)\right)= 2 italic_C start_POSTSUBSCRIPT italic_x italic_x end_POSTSUBSCRIPT ( 2 + roman_cos ( 2 italic_π italic_f italic_L ) ) (35)
|HTMI (OMS)AA\displaystyle|H_{\text{TMI (OMS)}}^{AA}| italic_H start_POSTSUBSCRIPT TMI (OMS) end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT |2=\displaystyle|^{2}=| start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT =
Cxx(3+2cos(2πfL)+cos(4πfL))subscript𝐶𝑥𝑥322𝜋𝑓𝐿4𝜋𝑓𝐿\displaystyle C_{xx}\left(3+2\cos(2\pi fL)+\cos(4\pi fL)\right)italic_C start_POSTSUBSCRIPT italic_x italic_x end_POSTSUBSCRIPT ( 3 + 2 roman_cos ( 2 italic_π italic_f italic_L ) + roman_cos ( 4 italic_π italic_f italic_L ) ) (36)
|HTMAA|2superscriptsuperscriptsubscript𝐻TM𝐴𝐴2\displaystyle|H_{\text{TM}}^{AA}|^{2}| italic_H start_POSTSUBSCRIPT TM end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT =4|HTMI (OMS)AA|2,absent4superscriptsuperscriptsubscript𝐻TMI (OMS)𝐴𝐴2\displaystyle=4\cdot|H_{\text{TMI (OMS)}}^{AA}|^{2},= 4 ⋅ | italic_H start_POSTSUBSCRIPT TMI (OMS) end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (37)

such that the total noise can be described by

SAAtotal=SEEtotal=Cdispffd[|HTMI(OMS)AA|2(SOMSTMI+SBLTMI)+|HISI/RFI(OMS)AA|2(SOMSISI+SOMSRFI+SBLRFI)]+Caccffd|HTMAA|2STM.superscriptsubscript𝑆𝐴𝐴totalsuperscriptsubscript𝑆𝐸𝐸totalsuperscriptsubscript𝐶dispffddelimited-[]superscriptsuperscriptsubscript𝐻TMIOMS𝐴𝐴2superscriptsubscript𝑆OMSTMIsuperscriptsubscript𝑆BLTMIsuperscriptsuperscriptsubscript𝐻ISIRFIOMS𝐴𝐴2superscriptsubscript𝑆OMSISIsuperscriptsubscript𝑆OMSRFIsuperscriptsubscript𝑆BLRFIsuperscriptsubscript𝐶accffdsuperscriptsuperscriptsubscript𝐻TM𝐴𝐴2subscript𝑆TM\begin{split}S_{AA}^{\mathrm{total}}&=S_{EE}^{\mathrm{total}}=C_{\mathrm{disp}% }^{\mathrm{ffd}}\Big{[}|H_{\mathrm{TMI(OMS)}}^{AA}|^{2}\left(S_{\mathrm{OMS}}^% {\mathrm{TMI}}+S_{\mathrm{BL}}^{\mathrm{TMI}}\right)\\ &+|H_{\mathrm{ISI/RFI(OMS)}}^{AA}|^{2}\left(S_{\mathrm{OMS}}^{\mathrm{ISI}}+S_% {\mathrm{OMS}}^{\mathrm{RFI}}+S_{\mathrm{BL}}^{\mathrm{RFI}}\right)\Big{]}\\ &+C_{\mathrm{acc}}^{\mathrm{ffd}}|H_{\mathrm{TM}}^{AA}|^{2}S_{\mathrm{TM}}.% \end{split}start_ROW start_CELL italic_S start_POSTSUBSCRIPT italic_A italic_A end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_total end_POSTSUPERSCRIPT end_CELL start_CELL = italic_S start_POSTSUBSCRIPT italic_E italic_E end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_total end_POSTSUPERSCRIPT = italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT [ | italic_H start_POSTSUBSCRIPT roman_TMI ( roman_OMS ) end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TMI end_POSTSUPERSCRIPT + italic_S start_POSTSUBSCRIPT roman_BL end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TMI end_POSTSUPERSCRIPT ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL + | italic_H start_POSTSUBSCRIPT roman_ISI / roman_RFI ( roman_OMS ) end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ISI end_POSTSUPERSCRIPT + italic_S start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI end_POSTSUPERSCRIPT + italic_S start_POSTSUBSCRIPT roman_BL end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_RFI end_POSTSUPERSCRIPT ) ] end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL + italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT | italic_H start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A italic_A end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT . end_CELL end_ROW (38)

VIII Analysis of the data set

The search-phase algorithm is visualized in Fig. 17 and explained in the subsection VIII.1, while the parameter estimation one is visualized in Fig. 18 and explained in section VIII.2.

Refer to caption
Figure 17: Schematic view of the search algorithm.
Refer to caption
Figure 18: Schematic view of the parameter estimation algorithm.

VIII.1 Search phase

We divide the time series into segments of approximately one day, with a 20%percent2020\%20 % overlap between adjacent segments. We therefore segment the “Light-Spritz” dataset into 29 stretches of data777We start the counting of the stretches of data from number 0. and we regenerate a time grid spanning from t0=0 ssubscript𝑡0times0secondt_{0}=$0\text{\,}\mathrm{s}$italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = start_ARG 0 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG to tend=2 675 890 ssubscript𝑡𝑒𝑛𝑑times2675890st_{end}=$2\,675\,890\text{\,}\mathrm{s}$italic_t start_POSTSUBSCRIPT italic_e italic_n italic_d end_POSTSUBSCRIPT = start_ARG 2 675 890 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG. We consider broad prior ranges for the glitch parameters, allowing the distribution of A0subscript𝐴0A_{0}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to extend beyond what is observed in LPF (Fig. 1), aiming to remain as agnostic as possible. The priors and the corresponding true values on the glitch and noise parameters for the search phase are reported in Table 4. The true value of the amplitude and decay times for the glitch parameters are taken from the “Spritz” LDC manual LISA LDC Team (nd), whereas the injection times are not explicitly given. In particular the amplitude ΔvΔ𝑣\Delta vroman_Δ italic_v is 2.20 pm s1times2.20timespicometersecond12.20\text{\,}\mathrm{pm}\text{\,}{\mathrm{s}}^{-1}start_ARG 2.20 end_ARG start_ARG times end_ARG start_ARG start_ARG roman_pm end_ARG start_ARG times end_ARG start_ARG power start_ARG roman_s end_ARG start_ARG - 1 end_ARG end_ARG end_ARG, while the two decay times are τ1=10 ssubscript𝜏1times10second\tau_{1}=$10\text{\,}\mathrm{s}$italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = start_ARG 10 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG and τ2=11 ssubscript𝜏2times11second\tau_{2}=$11\text{\,}\mathrm{s}$italic_τ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = start_ARG 11 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG. Our search is split in two phases.

Parameter True Value Lower Bound Upper Bound
τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] t-start t-end
ln(A0/[ms2])subscript𝐴0delimited-[]𝑚superscript𝑠2\ln(A_{0}/[m\,s^{-2}])roman_ln ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT / [ italic_m italic_s start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ] ) -27.58 -35 -20
β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] 10.5 1 1×1051E51\text{\times}{10}^{5}start_ARG 1 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG 5 end_ARG end_ARG
N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT [m/HzmHz\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / square-root start_ARG roman_Hz end_ARG] 7.77×10127.77E-127.77\text{\times}{10}^{-12}start_ARG 7.77 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 7×10127E-127\text{\times}{10}^{-12}start_ARG 7 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 8×10128E-128\text{\times}{10}^{-12}start_ARG 8 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG
N2subscript𝑁2N_{2}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT [m/HzmHz\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / square-root start_ARG roman_Hz end_ARG] 3.32×10123.32E-123.32\text{\times}{10}^{-12}start_ARG 3.32 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 2.5×10122.5E-122.5\text{\times}{10}^{-12}start_ARG 2.5 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 3.5×10123.5E-123.5\text{\times}{10}^{-12}start_ARG 3.5 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG
NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT [m/s/2Hz\mathrm{m}\mathrm{/}\mathrm{s}\mathrm{{}^{2}}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / roman_s start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / square-root start_ARG roman_Hz end_ARG] 2.4×10152.4E-152.4\text{\times}{10}^{-15}start_ARG 2.4 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 15 end_ARG end_ARG 2×10152E-152\text{\times}{10}^{-15}start_ARG 2 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 15 end_ARG end_ARG 3×10153E-153\text{\times}{10}^{-15}start_ARG 3 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 15 end_ARG end_ARG
Table 4: Table of priors for the glitch and noise parameters along with the true values. The variables ‘t-start’ and ‘t-end’ change depending on the stretch considered. Note that the following relationships hold between a double decay exponential glitch and the shapelet model used to fit it: A0=Δv11+τ2/τ1subscript𝐴0Δ𝑣11subscript𝜏2subscript𝜏1A_{0}=\Delta v\frac{1}{1+\tau_{2}/\tau_{1}}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = roman_Δ italic_v divide start_ARG 1 end_ARG start_ARG 1 + italic_τ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT / italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG and β0=τ1+τ22subscript𝛽0subscript𝜏1subscript𝜏22\beta_{0}=\frac{\tau_{1}+\tau_{2}}{2}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = divide start_ARG italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_τ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG start_ARG 2 end_ARG.

VIII.1.1 First phase

First, we fit only for both noise and glitches in each segment separately. We use the in-model and out-of-model moves described in sec. IV.1, to explore the parameter space. The number of glitches in each segment is estimated between 0 and 1 (i.e. the number of leaves in the glitch branch is 0 or 1). The search includes a stopping criterion such that if after 1000 iterations, more than 70%percent7070\%70 % of the samples have no glitch leaves, we conclude that no glitches are present in the segment and proceed to analyze the next one. Otherwise the segment is flagged as containing a possible glitch signal.

Table 5 reports the results of the first phase of the search pipeline, showing in which stretches of data glitches have been found and their SNRs. Note that the SNR reported in the table is estimated through the simultaneous fitting of glitches and noise. Specifically, to calculate the SNR, we compute the mean of the posterior distribution of SNR values for each data stretch during the search phase. This mean is obtained by drawing 3000 random samples from the posterior distribution of the glitch and noise parameters. However, the instrumental noise is not sufficiently constrained; therefore, these SNRs have to be considered as a rough estimate. Indeed, we observed that only a day of data is not enough to properly estimate and constrain all the noise parameters. This is particularly true in stretches of data where the late inspiral part of the MBHB is present (N26-N28). Therefore, we need to properly constrain the noise when processing the full month of data. This is further discussed in the Appendix D.

To perform this initial search, we used 25 walkers and 10 temperatures, and the run for each segment takes about 20 minutes.

N Time [s] % of 0 % of 1 Glitch
Leaves Leaves SNR
0 0.0 - 107995 86 13 0.60
1 72000 - 197995 66 36 1.27
2 162000 - 287995 87 13 0.39
3 252000 - 377995 86 14 0.47
4 342000 - 467995 85 15 0.50
5 432000 - 557995 83 16 0.54
\rowcolorred!30 6 522000 - 647995 10 90 294
7 612000 - 737995 58 42 1.28
8 702000 - 827995 87 13 0.42
9 792000 - 917995 87 13 0.43
10 882000 - 1007995 56 44 1.42
11 972000 - 1097995 79 21 1.46
12 1062000 - 1187995 85 15 0.51
13 1152000 - 1277995 84 15 0.52
14 1242000 - 1367995 86 14 0.45
15 1332000 - 1457995 87 12 0.40
16 1422000 - 1547995 87 13 0.45
17 1512000 - 1637995 80 20 0.65
18 1602000 - 1727995 84 16 0.65
19 1692000 - 1817995 86 14 0.48
20 1782000 - 1907995 86 14 0.47
21 1872000 - 1997995 85 15 3.34
22 1962000 - 2087995 85 15 0.44
\rowcolorred!30 23 2052000 - 2177995 3 97 269.14
\rowcolorred!30 24 2142000 - 2267995 6 93 265.68
25 2232000 - 2357995 81 18 0.77
\rowcolorcyan!3026 2322000 - 2447995 8 92 3.74
\rowcolorcyan!3027 2412000 - 2537995 3 97 5.85
\rowcolorred!30 28 2502000 - 2627995 1 98 188.99
\rowcoloryellow!50 29 2592000 - 2675890 1 99 520.26
Table 5: Table showing the number of leaves in each model and the glitch SNR for each stretch of data analyzed. N𝑁Nitalic_N labels the segment considered. The “% of 0 or 1 leaf” columns indicate the fraction of samples with 0 or 1 leaf, respectively. The orange lines indicate stretches of data where glitches were identified, the cyan lines mark regions where the inspiral phase of the MBHB becomes significant, and the yellow line highlights the stretch where the merger of the MBHB is present.

We observe five stretches with high SNR signals. Of these, four have been identified as glitches (orange line) and one as a MBHB (yellow line). All these stretches have at least 90%percent9090\%90 % of samples with only one leaf. Additionally, two other stretches (cyan line) have more than 92%percent9292\%92 % of samples with one leaf, although their estimated signal SNRs are much lower. This occurs because the inspiral phase of the MBHB waveform starts to contribute a non-negligible SNR, triggering the algorithm with evidence of a low-SNR signal. In the remaining stretches, the SNR values result from random noise fluctuation.

In the first phase of the search, the glitch model serves as a trigger to detect the presence of a signal in each stretch, whether it corresponds to a glitch or an MBHB signal. This approach is effective because the MBHB in “Light-Spritz” has a high SNR, leading to behavior similar to matched filtering, where the overlap between our template bank and the GW signal increases with the SNR (the characteristic frequency of the glitch and the instantaneous GW frequency of the MBHB). However, in cases with lower SNR MBHBs, the search may not be sensitive enough to detect such signals. To resolve ambiguities between the presence of a glitch or an MBHB signal in each stretch, the second phase of the search evaluates the Bayes factor comparing the glitch and MBHB hypotheses.

VIII.1.2 Second phase

The first phase of our search aims to identify the presence of a signal in each stretch of data. In the second phase, we determine whether the signal is better described as a glitch or a MBHB. To this end, we perform a dedicated MBHB search to assess the presence of a GW signal in stretches where a signal was detected or where the estimated SNR is sufficiently high (N6, N23, N24, N26, N27, N28, and N29). We then compute the Bayes factor comparing the MBHB and glitch hypotheses888Note that this analysis could alternatively be performed using the RJ-MCMC framework, which would allow for automatic computation of Bayes factors between glitches and MBHBs by leveraging the Erebor Global Fit infrastructure Katz et al. (2024). However, incorporating this additional complexity is not necessary for the current analysis.. For this, we employ both the stepping-stone method Maturana-Russel et al. (2019) and thermodynamic integration Karnesis et al. (2023). This information is subsequently used to initialize the walkers for the parameter estimation phase. We then compare the Bayes factors between the glitch and MBHB hypotheses for each of these stretches. To this end, we use 30 temperatures and run the sampler with a fixed-dimensional parameter space. We fit only for the signal, assuming the noise to be at a fiducial level, following the strategy outlined in Katz et al. (2024). This approach is robust given the high SNR of both glitches and MBHBs, and no ambiguous Bayes factor values were observed that would suggest the simultaneous presence of both components. In potential edge cases, the existing methodology in the Erebor Global Fit framework Katz et al. (2024) offers a viable and complementary strategy for MBHB detection.

We show in Table 6 three representative examples of the log Bayes factor for stretches N6, N24, and N29, where a signal (either a glitch or an MBHB) was identified. For brevity, we report only the results obtained using the stepping-stone method. In the Appendix F are reported the results obtained using the thermodynamic integration for completeness. The Bayes factors across all considered stretches were found to be highly decisive, strongly favouring one hypothesis over the other, either the presence of a glitch or that of an MBHB. For this reason, we present only a subset of representative results here, as the conclusions were clear-cut in all cases. Specifically, for stretches N6 and N24, the presence of a glitch is strongly favored, while for stretch N29, the MBHB hypothesis is preferred.

N6 N24 N29
logMBHB/glitchsubscriptMBHBglitch\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT 6.1×104similar-toabsent6.1superscript104\sim-6.1\times 10^{4}∼ - 6.1 × 10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT 3.9×104similar-toabsent3.9superscript104\sim-3.9\times 10^{4}∼ - 3.9 × 10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT 1.5×107similar-toabsent1.5superscript107\sim 1.5\times 10^{7}∼ 1.5 × 10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT
dlogMBHB/glitch𝑑subscriptMBHBglitchd\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT 0.61similar-toabsent0.61\sim 0.61∼ 0.61 0.22similar-toabsent0.22\sim 0.22∼ 0.22 0.44similar-toabsent0.44\sim 0.44∼ 0.44
Table 6: Log Bayes factors comparing the MBHB and glitch models for datasets N6, N24, and N29 using the stepping-stone method. dlogMBHB/glitch𝑑subscriptMBHBglitchd\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT gives an estimate on the error on logMBHB/glitchsubscriptMBHBglitch\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT.

During the search phase, along with computing the Bayes factor we also obtain a rough estimate of the parameters of the MBHB. Doing parameter estimation for the MBHB using only a day of data highlights several key considerations regarding frequency domain analysis and the use of the discrete Fourier transform. One primary concern is spectral leakage, particularly evident when the pre-merger/inspiral phase of the signal is analyzed, which necessitates the use of windowing techniques to mitigate its effects Davies et al. (2024). Indeed, when analyzing frequency-domain waveforms over short time segments, particular caution is required to ensure accuracy. However, in the context of this research, we did not window the data as it was not strictly necessary since we were not interested in characterizing the pre-merger part of the signals999As illustrated in Appendix E, filtering the data favours no correlation between adjacent frequency bins.. These issues reflect a fundamental challenge in performing low-latency analyses of MBHB signals Cornish (2022), but once again we do not need to be so careful in the search phase, as the final characterization of the signals uses the full data set.

To conclude on a more technical side, to be able to perform the analysis of the MBHB with BBHx but restricting it to a single day, we need to account for the time window considered when generating the MBHB template with BBHx. This is achieved by shifting the phase with respect to the starting time of our window, and by defining a start and an end time of the observation to cut the waveform, such that any frequency components of the signal that are emitted outside of this range are excluded.

VIII.2 Parameter estimation phase

For the final parameter estimation phase we consider the full month of data. We include an MBHB template in our likelihood and consider the same prior as the one reported in Table 3, but restrict the prior on the time of coalescence between tref24s<tref<tref+24ssubscript𝑡𝑟𝑒𝑓24𝑠subscript𝑡𝑟𝑒𝑓subscript𝑡𝑟𝑒𝑓24𝑠t_{ref}-24s<t_{ref}<t_{ref}+24sitalic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT - 24 italic_s < italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT < italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT + 24 italic_s according to the MBHB posteriors found during the search phase. We also consider a slightly restricted prior range for the glitch parameters based on the findings from the search phase as shown in Table 7. In particular, we restrict the times of injection and exclude combinations of β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and A0subscript𝐴0A_{0}italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT that do not reflect the glitch distribution found in LPF. If such glitches were present, they would have already been identified during the search phase given the high SNR of those glitches.

We then set the minimum number of glitches for the reversible jump to match the number of high SNR glitches identified in the search phase, and we set the maximum number of glitches to be double the minimum to allow for glitches that could have been missed during the search.

We consider the same in-model and out-of-model moves used during the search phase for the glitches. The in-model moves get updated every 20 iterations up to a total of 50000 iterations. Instead, for the in-model Gaussian move proposal of the MBHB we estimate the covariance matrix using samples from the MBHB parameter posteriors of the search phase. We also incorporate the covariance matrices computed from the three estimated glitches -during the search phase- to define an additional reversible jump move proposal (see section IV.1.2) which proposes the birth or death of a glitch based on those identified during the search. To be on the safe side, we still allow 40%percent4040\%40 % of proposed glitch parameters to be drawn from the full prior range, ensuring that we do not miss any potential glitches.

Finally, we initialise the MCMC walkers using samples found during the search phase, and we estimate the parameters of the noise, signal, and glitches simultaneously following a Gibbs sampling approach Karnesis et al. (2023).

The posteriors for the three glitches are shown in Figs. 19,20,21, for the MBHB are shown in Figs. 22 and for the noise parameters in Fig. 23. The MBHB found has an SNR of 1522 while the glitches have SNRs of around 300. For all the posteriors, we only plot the last 500 samples of each walker, as we consider the previous samples, where we perform recursive updates of the in-model moves, as a burn-in period.

Refer to caption
Figure 19: Posteriors of the MBHB in the “Light-Spritz” data set.
Refer to caption
Figure 20: Posteriors of the glitch injected around 166 hours in the “Light-Spritz” data set.
Refer to caption
Figure 21: Posteriors of the glitch injected around 718 hours in the “Light-Spritz” data set.
Refer to caption
Figure 22: Posteriors of the glitch injected around 598 hours in the “Light-Spritz” data set.
Refer to caption
Figure 23: Posteriors of noise parameters estimated in the “Light-Spritz” data set.
Parameter Lower Upper
Bound Bound
τ0subscript𝜏0\tau_{0}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] 432000 2627995
ln(A0/[ms2])𝑙𝑛subscript𝐴0delimited-[]𝑚superscript𝑠2ln(A_{0}/[ms^{-2}])italic_l italic_n ( italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT / [ italic_m italic_s start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ] ) -32 -20
β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT [s] 1 1e4
N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT [m/HzmHz\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / square-root start_ARG roman_Hz end_ARG] 7×10127E-127\text{\times}{10}^{-12}start_ARG 7 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 8×10128E-128\text{\times}{10}^{-12}start_ARG 8 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG
N2subscript𝑁2N_{2}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT [m/HzmHz\mathrm{m}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / square-root start_ARG roman_Hz end_ARG] 2.5×10122.5E-122.5\text{\times}{10}^{-12}start_ARG 2.5 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG 3.5×10123.5E-123.5\text{\times}{10}^{-12}start_ARG 3.5 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 12 end_ARG end_ARG
NTMsubscript𝑁𝑇𝑀N_{TM}italic_N start_POSTSUBSCRIPT italic_T italic_M end_POSTSUBSCRIPT [m/s/2Hz\mathrm{m}\mathrm{/}\mathrm{s}\mathrm{{}^{2}}\mathrm{/}\sqrt{\mathrm{Hz}}roman_m / roman_s start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT / square-root start_ARG roman_Hz end_ARG] 2.2×10152.2E-152.2\text{\times}{10}^{-15}start_ARG 2.2 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 15 end_ARG end_ARG 2.7×10152.7E-152.7\text{\times}{10}^{-15}start_ARG 2.7 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG - 15 end_ARG end_ARG
Table 7: Table of priors for the glitch and noise parameters. The true values are given in Table 4.

IX Conclusion and future perspective

In this work, we have demonstrated a potential Global Fit approach for LISA to estimate MBHB parameters in the presence of noise artefacts by implementing a search pipeline to find and fit for glitches along with the noise properties of the instrument and the parameters of an MBHB, using realistic simulated LISA data. Our methodology provides a general framework for detecting and modelling glitches, ensuring that such instrumental artefacts do not compromise LISA’s scientific output.

We have developed and validated our pipeline in two steps. First, we generated synthetic data where multiple glitches of varying SNRs, including those closely spaced in time, were injected to test the effectiveness of our reversible jump approach in handling diverse glitch morphologies. Our injected glitches were randomly drawn from the LPF glitch distribution to provide a realistic test scenario. We established an approximate threshold for distinguishing glitches from instrumental noise and investigated the impact of low, medium and high SNR glitches on the parameter estimation of a fiducial MBHB with SNR of 1152. We found that unfitted glitches with SNR greater-than-or-equivalent-to\gtrsim 60 introduce non-negligible biases in MBHB parameter recovery, especially for low-SNR MBHB signals. For instance, for an unfitted glitch with SNR 21 we observed a slight shift in the parameter posterior relative to the injected values, but this shift was small compared to the statistical uncertainties, whereas for a glitch with SNR 72, the true values were excluded at high confidence from the posterior support for several parameters, such as the total mass or the sky location. For the extreme case of an SNR 1544 glitch, biases were very large and the posterior was completely disjoined to the case were no glitch is present. Moreover, we noticed that the relative SNR between the glitch and the MBHB signal, as well as the glitch’s temporal proximity to the merger, are critical factors in determining the extent of bias. However, these are not the sole contributors. For instance, when comparing the posterior distributions of MBHB systems with SNRs 1152 and 320, the glitch number 41 in the LPF catalogue appears to cause a more pronounced bias in the recovery of parameters for the low-SNR system hinting that also the relationship between the glitch frequency and the instananeous frequency of the MBHB plays a role. When fitting for the glitch(-es) (single glitch as well as multiple glitches), we corrected for these biases and estimated the parameters of the MBHB correctly, with almost the same accuracy when no glitch was present. This study stresses the importance of accounting for glitches to fully exploit the tremendous scientific potential of MBHB observations with LISA—whether for fundamental physics Berti et al. (2006, 2016); Chamberlain and Yunes (2017); Barausse et al. (2020b); Maggio et al. (2020); Bhagwat et al. (2022); Corman et al. (2022); Toubiana et al. (2024b); Pitte et al. (2024), astrophysics Gair et al. (2011); Sesana et al. (2011); Klein et al. (2016); Dayal et al. (2019); Barausse et al. (2020a); Toubiana et al. (2021); Chen et al. (2022); Fang and Yang (2023); Spadaro et al. (2025); Langen et al. (2024); Toubiana et al. (2024a), or cosmology Caprini and Tamanini (2016); Tamanini et al. (2016); Mangiagli et al. (2023).

Second, we applied our search algorithm to a simplified version of the “Spritz” dataset, which was obtained by removing the Galactic binaries and gaps from the original dataset. Our results show that we can successfully implement a search and a parameter estimation pipeline to detect and model glitches present in the data while recovering both the instrumental noise and the MBHB parameters without bias. Additionally, we provided a detailed noise model for this dataset after identifying inconsistencies between the official LDC “Spritz” documentation LDC (2022) and the actual procedure used to generate data for this mock data set. This information will be valuable for future researchers analyzing the same dataset.

A key aspect that warrants further investigation is the accuracy and efficiency of the GMM clustering approach used to build proposal distributions for glitches in the parameter estimation phase, particularly in determining the number of components. This is particularly important in view of including the search and parameter estimation for glitches, developed in this work, in the framework of the Erebor Global Fit Katz et al. (2024), in which multiple MBHBs and other types of GW source are present. In the current implementation, the number of clusters is fixed to match the maximum number of leaves (components) in the model, without attempting to infer the true number of underlying glitch populations. While this approach offers flexibility, and has demonstrated to work very well in our framework, it may introduce inefficiencies when the number of GMM components significantly exceeds the actual number of distinct glitches. More specifically, in scenarios where the GMM overestimates the number of glitch clusters, the model may fragment individual glitch posteriors into sub-posteriors, which can hinder efficient sampling. As a result, MCMC chains may struggle to traverse the full posterior landscape, especially across different glitch modes, even if local acceptance rates remain high. We have begun exploring alternative strategies to dynamically estimate the appropriate number of clusters, potentially improving the exchange of samples across posterior modes and enhancing the overall sampling efficiency. This might play a more significant role when more GW sources or more complicated noise components, with respect to the “Light-Spritz” data set, are considered.

Nonetheless, the glitch search, determination, and parameter estimation components developed in this work will form the foundation of the glitch handling module within Erebor. From Erebor, we will incorporate the already developed modules for MBHB searches and parameter estimation as well as the search and parameter estimation for Galactic binaries, noise and foreground fitting. Over time, we expect to refine both the glitch and MBHB modules based on the insights gained here, particularly in trans-dimensional sampling strategies. We will also update the noise fitting by leveraging the more complete noise models reported in this work.

To conclude, future work will also extend this approach to account for instrumental non-stationarities, such as data gaps and the presence of the Galactic foreground so that data with the more realistic properties of the “Spritz” data set can be handled. While our study serves as a guideline for glitch mitigation in LISA, it does not encompass all possible MBHB populations, and our reference glitch distribution is based on LPF data. However, the shapelet-based approach should readily adapt to different glitch morphologies, with respect to the signal exponential ones addressed in this paper, present in the data in particular by leveraging higher order shapelets, but its efficacy needs to be properly quantified. However, since LISA will use similar technology to LPF, at least for the GRS, the acceleration glitches are unlikely to differ substantially, so the results obtained here should represent the expected performance on actual LISA data.

All code used in this work is open-source and can be accessed at https://github.com/martinaAEI/artifacts.

Acknowledgements.
M.M. gratefully acknowledges the support of the German Space Agency, DLR. The work is supported by the Federal Ministry for Economic Affairs and Climate Action based on a resolution of the German Bundestag (Project Ref. No. FKZ 50 OQ 2301). The authors thank the following people (alphabetically ordered) for useful discussion in improving the work: Quentin Baghi, Jean-Baptiste Bayle, Ollie Burke, Eleonora Castelli, Raffi Enficiaud, Hector Hestelles, Niklas Houba, Nikos Karnesis, Lorenzo Pompili, Lorenzo Sala, Elise Sangër, Alessandro Santini, Lorenzo Speri, Alice Spadaro, Mauro Pieroni and Sebastian Völkel.

Appendix A Glitch model

A.1 Derivation of shapelet model in time and frequency domains

Shapelets can be defined in the time domain as follows:

ψn(t)=2tnetnLn11(2tn)Θ(t),subscript𝜓𝑛𝑡2𝑡𝑛superscript𝑒𝑡𝑛subscriptsuperscript𝐿1𝑛12𝑡𝑛Θ𝑡\psi_{n}(t)=2\frac{t}{n}e^{-\frac{t}{n}}L^{1}_{n-1}\bigg{(}2\frac{t}{n}\bigg{)% }\Theta(t),italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_t ) = 2 divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG italic_e start_POSTSUPERSCRIPT - divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG end_POSTSUPERSCRIPT italic_L start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n - 1 end_POSTSUBSCRIPT ( 2 divide start_ARG italic_t end_ARG start_ARG italic_n end_ARG ) roman_Θ ( italic_t ) , (39)

with Lnαsuperscriptsubscript𝐿𝑛𝛼L_{n}^{\alpha}italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT denoting a generalized Laguerre polynomial. A glitch perturbation is then modeled by a linear combination of shapelets:

g(t)=i=0PAiψni(tτiβi),𝑔𝑡superscriptsubscript𝑖0𝑃subscript𝐴𝑖subscript𝜓subscript𝑛𝑖𝑡subscript𝜏𝑖subscript𝛽𝑖g(t)=\sum_{i=0}^{P}A_{i}\psi_{n_{i}}\bigg{(}\frac{t-\tau_{i}}{\beta_{i}}\bigg{% )},italic_g ( italic_t ) = ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) , (40)

with P+1𝑃1P+1italic_P + 1 being the number of shapelets. Substituting Eq. (39) into Eq. (40) we have:

g(t)𝑔𝑡\displaystyle g(t)italic_g ( italic_t ) =i=0P2Ai1nitτiβie1nitτiβiabsentsuperscriptsubscript𝑖0𝑃2subscript𝐴𝑖1subscript𝑛𝑖𝑡subscript𝜏𝑖subscript𝛽𝑖superscript𝑒1subscript𝑛𝑖𝑡subscript𝜏𝑖subscript𝛽𝑖\displaystyle=\sum_{i=0}^{P}2A_{i}\frac{1}{n_{i}}\frac{t-\tau_{i}}{\beta_{i}}e% ^{-\frac{1}{n_{i}}\frac{t-\tau_{i}}{\beta_{i}}}= ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT 2 italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG italic_e start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG end_POSTSUPERSCRIPT
×Lni11(21nitτiβi)Θ(tτiβi),absentsubscriptsuperscript𝐿1subscript𝑛𝑖121subscript𝑛𝑖𝑡subscript𝜏𝑖subscript𝛽𝑖Θ𝑡subscript𝜏𝑖subscript𝛽𝑖\displaystyle\quad\times L^{1}_{n_{i}-1}\bigg{(}2\frac{1}{n_{i}}\frac{t-\tau_{% i}}{\beta_{i}}\bigg{)}\Theta\bigg{(}\frac{t-\tau_{i}}{\beta_{i}}\bigg{)},× italic_L start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - 1 end_POSTSUBSCRIPT ( 2 divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) roman_Θ ( divide start_ARG italic_t - italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) , (41)

note that Eq. (A.1) refers to a glitch in acceleration but we need to derive the correspondent fractional frequency shift, which means integrating g(t)𝑔𝑡g(t)italic_g ( italic_t ) and dividing by the velocity of light c𝑐citalic_c.

To perform the integral, we can consider the generating function:

n=0vnLnα(x)=1(1v)α+1exv1v.superscriptsubscript𝑛0superscript𝑣𝑛superscriptsubscript𝐿𝑛𝛼𝑥1superscript1𝑣𝛼1superscript𝑒𝑥𝑣1𝑣\sum_{n=0}^{\infty}v^{n}L_{n}^{\alpha}(x)=\frac{1}{(1-v)^{\alpha+1}}e^{-\frac{% xv}{1-v}}.∑ start_POSTSUBSCRIPT italic_n = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT italic_v start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ( italic_x ) = divide start_ARG 1 end_ARG start_ARG ( 1 - italic_v ) start_POSTSUPERSCRIPT italic_α + 1 end_POSTSUPERSCRIPT end_ARG italic_e start_POSTSUPERSCRIPT - divide start_ARG italic_x italic_v end_ARG start_ARG 1 - italic_v end_ARG end_POSTSUPERSCRIPT . (42)

To align it with Eq. (39), we adjust the index of the series as follows:

n=1vnLn1α(x)=v(1v)α+1exv1v.superscriptsubscript𝑛1superscript𝑣𝑛superscriptsubscript𝐿𝑛1𝛼𝑥𝑣superscript1𝑣𝛼1superscript𝑒𝑥𝑣1𝑣\sum_{n=1}^{\infty}v^{n}L_{n-1}^{\alpha}(x)=\frac{v}{(1-v)^{\alpha+1}}e^{\frac% {-xv}{1-v}}.∑ start_POSTSUBSCRIPT italic_n = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT italic_v start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_L start_POSTSUBSCRIPT italic_n - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ( italic_x ) = divide start_ARG italic_v end_ARG start_ARG ( 1 - italic_v ) start_POSTSUPERSCRIPT italic_α + 1 end_POSTSUPERSCRIPT end_ARG italic_e start_POSTSUPERSCRIPT divide start_ARG - italic_x italic_v end_ARG start_ARG 1 - italic_v end_ARG end_POSTSUPERSCRIPT . (43)

and then define

G(t,τ,n,β,A,v)𝐺𝑡𝜏𝑛𝛽𝐴𝑣\displaystyle G(t,\tau,n,\beta,A,v)italic_G ( italic_t , italic_τ , italic_n , italic_β , italic_A , italic_v )
=2Ac0t(tτnβ)exp[tτnβ]absent2𝐴𝑐superscriptsubscript0𝑡superscript𝑡𝜏𝑛𝛽superscript𝑡𝜏𝑛𝛽\displaystyle\hskip 14.22636pt=\frac{2A}{c}\int_{0}^{t}\left(\frac{t^{\prime}-% \tau}{n\beta}\right)\exp\left[{-\frac{t^{\prime}-\tau}{n\beta}}\right]= divide start_ARG 2 italic_A end_ARG start_ARG italic_c end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ( divide start_ARG italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_τ end_ARG start_ARG italic_n italic_β end_ARG ) roman_exp [ - divide start_ARG italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_τ end_ARG start_ARG italic_n italic_β end_ARG ]
×v(1v)α+1exp[2tτnβv(1v)]absent𝑣superscript1𝑣𝛼12superscript𝑡𝜏𝑛𝛽𝑣1𝑣\displaystyle\hskip 28.45274pt\times\frac{v}{(1-v)^{\alpha+1}}\exp\left[{-2% \frac{t^{\prime}-\tau}{n\beta}\frac{v}{(1-v)}}\right]× divide start_ARG italic_v end_ARG start_ARG ( 1 - italic_v ) start_POSTSUPERSCRIPT italic_α + 1 end_POSTSUPERSCRIPT end_ARG roman_exp [ - 2 divide start_ARG italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_τ end_ARG start_ARG italic_n italic_β end_ARG divide start_ARG italic_v end_ARG start_ARG ( 1 - italic_v ) end_ARG ]
×Θ(tτβ)dt.absentΘsuperscript𝑡𝜏𝛽dsuperscript𝑡\displaystyle\hskip 56.9055pt\times\Theta\left(\frac{t^{\prime}-\tau}{\beta}% \right)\,{\rm d}t^{\prime}.× roman_Θ ( divide start_ARG italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_τ end_ARG start_ARG italic_β end_ARG ) roman_d italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT . (44)

This expression enables us to obtain different shapelets orders by rewriting the relationship between the left and right components of Eq. (44) as follows:

Δνgν0(t,τ,n,β,A)0tg(t)c𝑑tΔsubscript𝜈𝑔subscript𝜈0𝑡𝜏𝑛𝛽𝐴superscriptsubscript0𝑡𝑔superscript𝑡𝑐differential-dsuperscript𝑡\displaystyle\frac{\Delta\nu_{g}}{\nu_{0}}(t,\vec{\tau},\vec{n},\vec{\beta},% \vec{A})\equiv\int_{0}^{t}\frac{g(t^{\prime})}{c}\,dt^{\prime}divide start_ARG roman_Δ italic_ν start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_t , over→ start_ARG italic_τ end_ARG , over→ start_ARG italic_n end_ARG , over→ start_ARG italic_β end_ARG , over→ start_ARG italic_A end_ARG ) ≡ ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT divide start_ARG italic_g ( italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_ARG start_ARG italic_c end_ARG italic_d italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT
=i=0P1ni![dnidvniG(t,τi,ni,βi,Ai,v)]|v=0.\displaystyle\hskip 28.45274pt=\sum_{i=0}^{P}\frac{1}{n_{i}!}\left[\frac{{\rm d% }^{n_{i}}}{{\rm d}v^{n_{i}}}G(t,\tau_{i},n_{i},\beta_{i},A_{i},v)\right]_{|v=0}.= ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ! end_ARG [ divide start_ARG roman_d start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG roman_d italic_v start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG italic_G ( italic_t , italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_v ) ] start_POSTSUBSCRIPT | italic_v = 0 end_POSTSUBSCRIPT . (45)

Eq. (44) can be explicitely written as:

G(t,τ,n,β,A,v)=2vAc(1v)α+1nβ𝐺𝑡𝜏𝑛𝛽𝐴𝑣2𝑣𝐴𝑐superscript1𝑣𝛼1𝑛𝛽\displaystyle G(t,\tau,n,\beta,A,v)=\frac{2vA}{c(1-v)^{\alpha+1}}n\betaitalic_G ( italic_t , italic_τ , italic_n , italic_β , italic_A , italic_v ) = divide start_ARG 2 italic_v italic_A end_ARG start_ARG italic_c ( 1 - italic_v ) start_POSTSUPERSCRIPT italic_α + 1 end_POSTSUPERSCRIPT end_ARG italic_n italic_β
×τnβtτnβueue2uv1vΘ(nu)du\displaystyle\hskip 56.9055pt\times\int_{-\frac{\tau}{n\beta}}^{\frac{t-\tau}{% n\beta}}ue^{-u}e^{\frac{-2uv}{1-v}}\Theta(nu)\,{\rm d}u× ∫ start_POSTSUBSCRIPT - divide start_ARG italic_τ end_ARG start_ARG italic_n italic_β end_ARG end_POSTSUBSCRIPT start_POSTSUPERSCRIPT divide start_ARG italic_t - italic_τ end_ARG start_ARG italic_n italic_β end_ARG end_POSTSUPERSCRIPT italic_u italic_e start_POSTSUPERSCRIPT - italic_u end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT divide start_ARG - 2 italic_u italic_v end_ARG start_ARG 1 - italic_v end_ARG end_POSTSUPERSCRIPT roman_Θ ( italic_n italic_u ) roman_d italic_u
=2vnAβ(1v)αc(1+v)2×Θ(tτβ)absent2𝑣𝑛𝐴𝛽superscript1𝑣𝛼𝑐superscript1𝑣2Θ𝑡𝜏𝛽\displaystyle\hskip 28.45274pt=-\frac{2vnA\beta(1-v)^{-\alpha}}{c(1+v)^{2}}% \times\Theta\bigg{(}\frac{t-\tau}{\beta}\bigg{)}= - divide start_ARG 2 italic_v italic_n italic_A italic_β ( 1 - italic_v ) start_POSTSUPERSCRIPT - italic_α end_POSTSUPERSCRIPT end_ARG start_ARG italic_c ( 1 + italic_v ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG × roman_Θ ( divide start_ARG italic_t - italic_τ end_ARG start_ARG italic_β end_ARG )
×(1+v+e(1+v)(tτ)n(1+v)β(t+tv+nβnvβ(1+v)τ)nβ).absent1𝑣superscript𝑒1𝑣𝑡𝜏𝑛1𝑣𝛽𝑡𝑡𝑣𝑛𝛽𝑛𝑣𝛽1𝑣𝜏𝑛𝛽\displaystyle\times\bigg{(}-1+v+\frac{e^{\frac{(1+v)(t-\tau)}{n(-1+v)\beta}}(t% +tv+n\beta-nv\beta-(1+v)\tau)}{n\beta}\bigg{)}.× ( - 1 + italic_v + divide start_ARG italic_e start_POSTSUPERSCRIPT divide start_ARG ( 1 + italic_v ) ( italic_t - italic_τ ) end_ARG start_ARG italic_n ( - 1 + italic_v ) italic_β end_ARG end_POSTSUPERSCRIPT ( italic_t + italic_t italic_v + italic_n italic_β - italic_n italic_v italic_β - ( 1 + italic_v ) italic_τ ) end_ARG start_ARG italic_n italic_β end_ARG ) . (46)

In this way, the generating function is used to perform the integral. Equation 46 needs to be computed for each value of Ai,τisubscript𝐴𝑖subscript𝜏𝑖A_{i},\tau_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and βisubscript𝛽𝑖\beta_{i}italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT that is needed such that the sum over the shapelets can be performed.

We can now derive the Fourier transform of Δvgν0(t,τ,n,β,A)Δsubscript𝑣𝑔subscript𝜈0𝑡𝜏𝑛𝛽𝐴\frac{\Delta v_{g}}{\nu_{0}}(t,\vec{\tau},\vec{n},\vec{\beta},\vec{A})divide start_ARG roman_Δ italic_v start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_t , over→ start_ARG italic_τ end_ARG , over→ start_ARG italic_n end_ARG , over→ start_ARG italic_β end_ARG , over→ start_ARG italic_A end_ARG ). This reads:

Δν~gν0(f,τ,n,β,A)=Δsubscript~𝜈𝑔subscript𝜈0𝑓𝜏𝑛𝛽𝐴absent\displaystyle\frac{\Delta\tilde{\nu}_{g}}{\nu_{0}}(f,\vec{\tau},\vec{n},\vec{% \beta},\vec{A})=divide start_ARG roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_f , over→ start_ARG italic_τ end_ARG , over→ start_ARG italic_n end_ARG , over→ start_ARG italic_β end_ARG , over→ start_ARG italic_A end_ARG ) = (47)
i=0P1ni![dnidvniG~(f,τi,ni,βi,Ai,v)]|v=0.\displaystyle\hskip 28.45274pt\sum_{i=0}^{P}\frac{1}{n_{i}!}\left[\frac{{\rm d% }^{n_{i}}}{{\rm d}v^{n_{i}}}\tilde{G}(f,\tau_{i},n_{i},\beta_{i},A_{i},v)% \right]_{|v=0}.∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ! end_ARG [ divide start_ARG roman_d start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG roman_d italic_v start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG over~ start_ARG italic_G end_ARG ( italic_f , italic_τ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_v ) ] start_POSTSUBSCRIPT | italic_v = 0 end_POSTSUBSCRIPT . (48)

Considering Eq. 46, we obtain:

G~(f,τ,n,\displaystyle\tilde{G}(f,\tau,n,over~ start_ARG italic_G end_ARG ( italic_f , italic_τ , italic_n , β,A,v)=\displaystyle\beta,A,v)=italic_β , italic_A , italic_v ) =
iAn(1v)1αvβe2ifπτcfπ(i(1+v)+2fnπ(1+v)β)2,𝑖𝐴𝑛superscript1𝑣1𝛼𝑣𝛽superscript𝑒2𝑖𝑓𝜋𝜏𝑐𝑓𝜋superscript𝑖1𝑣2𝑓𝑛𝜋1𝑣𝛽2\displaystyle\frac{iAn(1-v)^{1-\alpha}v\beta e^{-2if\pi\tau}}{cf\pi(i(1+v)+2fn% \pi(-1+v)\beta)^{2}},divide start_ARG italic_i italic_A italic_n ( 1 - italic_v ) start_POSTSUPERSCRIPT 1 - italic_α end_POSTSUPERSCRIPT italic_v italic_β italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_f italic_π italic_τ end_POSTSUPERSCRIPT end_ARG start_ARG italic_c italic_f italic_π ( italic_i ( 1 + italic_v ) + 2 italic_f italic_n italic_π ( - 1 + italic_v ) italic_β ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (49)

and therefore,

Δν~gν0(f,τ,n,β,A)=Δsubscript~𝜈𝑔subscript𝜈0𝑓𝜏𝑛𝛽𝐴absent\displaystyle\frac{\Delta\tilde{\nu}_{g}}{\nu_{0}}(f,\vec{\tau},\vec{n},\vec{% \beta},\vec{A})=\noindentdivide start_ARG roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_f , over→ start_ARG italic_τ end_ARG , over→ start_ARG italic_n end_ARG , over→ start_ARG italic_β end_ARG , over→ start_ARG italic_A end_ARG ) = (50)
i=0P1ni![dnidvni(iAn(1v)1αvβe2ifπτcfπ(i(1+v)+2fnπ(1+v)β)2)]|v=0.\displaystyle\sum_{i=0}^{P}\frac{1}{n_{i}!}\left[\frac{{\rm d}^{n_{i}}}{{\rm d% }v^{n_{i}}}\bigg{(}\frac{iAn(1-v)^{1-\alpha}v\beta e^{-2if\pi\tau}}{cf\pi(i(1+% v)+2fn\pi(-1+v)\beta)^{2}}\bigg{)}\right]_{|v=0}.∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ! end_ARG [ divide start_ARG roman_d start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG roman_d italic_v start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG ( divide start_ARG italic_i italic_A italic_n ( 1 - italic_v ) start_POSTSUPERSCRIPT 1 - italic_α end_POSTSUPERSCRIPT italic_v italic_β italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_f italic_π italic_τ end_POSTSUPERSCRIPT end_ARG start_ARG italic_c italic_f italic_π ( italic_i ( 1 + italic_v ) + 2 italic_f italic_n italic_π ( - 1 + italic_v ) italic_β ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ) ] start_POSTSUBSCRIPT | italic_v = 0 end_POSTSUBSCRIPT . (51)

The zero order shapelet (i=0𝑖0i=0italic_i = 0) in the frequency domain is

Δν~gν0(f,τ0,β0,A0)=iA0β0e2ifπτ0cfπ(i2fπβ0)2.Δsubscript~𝜈𝑔subscript𝜈0𝑓subscript𝜏0subscript𝛽0subscript𝐴0𝑖subscript𝐴0subscript𝛽0superscript𝑒2𝑖𝑓𝜋subscript𝜏0𝑐𝑓𝜋superscript𝑖2𝑓𝜋subscript𝛽02\frac{\Delta\tilde{\nu}_{g}}{\nu_{0}}(f,\tau_{0},\beta_{0},A_{0})=\frac{iA_{0}% \beta_{0}e^{2if\pi\tau_{0}}}{cf\pi(i-2f\pi\beta_{0})^{2}}.divide start_ARG roman_Δ over~ start_ARG italic_ν end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ( italic_f , italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = divide start_ARG italic_i italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT 2 italic_i italic_f italic_π italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_c italic_f italic_π ( italic_i - 2 italic_f italic_π italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG . (52)

A.2 Glitch model in time and frequency domain and respective TDI transfer functions

The effect of a typical LPF glitch is evaluated starting from a model for its differential acceleration, Δg(t)Δ𝑔𝑡\Delta g(t)roman_Δ italic_g ( italic_t ), Sala (2023); Muratore (2021):

hg(t)=Δvτ2tetτΘ(t),t=tt0h_{g}(t)=\frac{\Delta v}{\tau^{2}}t{{}^{\prime}}e^{-\frac{t{{}^{\prime}}}{\tau% }}\Theta(t^{\prime}),\quad\quad t{{}^{\prime}}=t-t_{0}italic_h start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t ) = divide start_ARG roman_Δ italic_v end_ARG start_ARG italic_τ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG italic_t start_FLOATSUPERSCRIPT ′ end_FLOATSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - divide start_ARG italic_t start_FLOATSUPERSCRIPT ′ end_FLOATSUPERSCRIPT end_ARG start_ARG italic_τ end_ARG end_POSTSUPERSCRIPT roman_Θ ( italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , italic_t start_FLOATSUPERSCRIPT ′ end_FLOATSUPERSCRIPT = italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT (53)

where t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is the occurrence time, Θ(t)Θ𝑡\Theta(t)roman_Θ ( italic_t ) is the Heaviside step function, ΔvΔ𝑣\Delta vroman_Δ italic_v is the total transferred impulse per unit mass, and the τ𝜏\tauitalic_τ parameter determines the duration of the event. We can also express the fractional frequency shift from the glitch as:

Δνg(t)ν0=Δvc(1et+Tτ(tT+τ)τ)Θ(tt0).Δsubscript𝜈𝑔𝑡subscript𝜈0Δ𝑣𝑐1superscript𝑒𝑡𝑇𝜏𝑡𝑇𝜏𝜏Θ𝑡subscript𝑡0\frac{\Delta\nu_{g}(t)}{\nu_{0}}=\frac{\Delta v}{c}\Bigg{(}1-\frac{e^{\frac{-t% +T}{\tau}}(t-T+\tau)}{\tau}\Bigg{)}\Theta(t-t_{0}).divide start_ARG roman_Δ italic_ν start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t ) end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG = divide start_ARG roman_Δ italic_v end_ARG start_ARG italic_c end_ARG ( 1 - divide start_ARG italic_e start_POSTSUPERSCRIPT divide start_ARG - italic_t + italic_T end_ARG start_ARG italic_τ end_ARG end_POSTSUPERSCRIPT ( italic_t - italic_T + italic_τ ) end_ARG start_ARG italic_τ end_ARG ) roman_Θ ( italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) . (54)

From a frequency-domain perspective, ΔvΔ𝑣\Delta vroman_Δ italic_v is also the zero-frequency limit of the Fourier transform of the hg(t)subscript𝑔𝑡h_{g}(t)italic_h start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t ) template. This is relevant, since it implies that it has a strong low-frequency component. The Fourier transform of the template is:

Δνg(f)ν0=Δvei2πft0ci2πf(i+2πfτ)2,Δsubscript𝜈𝑔𝑓subscript𝜈0Δ𝑣superscript𝑒𝑖2𝜋𝑓subscript𝑡0𝑐𝑖2𝜋𝑓superscript𝑖2𝜋𝑓𝜏2\frac{\Delta\nu_{g}(f)}{\nu_{0}}=-\frac{\Delta ve^{-i2\pi ft_{0}}}{ci2\pi f% \left(-i+2\pi f\tau\right)^{2}},divide start_ARG roman_Δ italic_ν start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_f ) end_ARG start_ARG italic_ν start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG = - divide start_ARG roman_Δ italic_v italic_e start_POSTSUPERSCRIPT - italic_i 2 italic_π italic_f italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_c italic_i 2 italic_π italic_f ( - italic_i + 2 italic_π italic_f italic_τ ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (55)

where we need to divide by cω𝑐𝜔c\omegaitalic_c italic_ω to express it as fractional frequency shift. The expression is similar to the shapelet when n0=1subscript𝑛01n_{0}=1italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1 (Eq. 10) with Δv=2A0β0Δ𝑣2subscript𝐴0subscript𝛽0\Delta v=2A_{0}\beta_{0}roman_Δ italic_v = 2 italic_A start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and t0=τ0subscript𝑡0subscript𝜏0t_{0}=\tau_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and τ=β0𝜏subscript𝛽0\tau=\beta_{0}italic_τ = italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

Considering now Eq. III.1 in section III.3, we can model the TDI response to a glitch as a product between the glitch expression in the frequency domain (Eq. 55) and the transfer function 𝒯X,Y,Zg(f)superscriptsubscript𝒯𝑋𝑌𝑍𝑔𝑓\mathcal{T}_{X,Y,Z}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X , italic_Y , italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) which takes into account TDI (see section III.3). We provide below the expressions for the first-generation TDI transfer functions, noting Dij=ei(2πf)Lsubscript𝐷𝑖𝑗superscript𝑒𝑖2𝜋𝑓𝐿\mathcal{F}{D_{ij}}=e^{-i(2\pi f)L}caligraphic_F italic_D start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_e start_POSTSUPERSCRIPT - italic_i ( 2 italic_π italic_f ) italic_L end_POSTSUPERSCRIPT:

In case of a glitch happening on TM21𝑇subscript𝑀21TM_{21}italic_T italic_M start_POSTSUBSCRIPT 21 end_POSTSUBSCRIPT we have:

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =2e3iL(2πf)(1+e2iL(2πf))absent2superscript𝑒3𝑖𝐿2𝜋𝑓1superscript𝑒2𝑖𝐿2𝜋𝑓\displaystyle=-2e^{-3iL(2\pi f)}(-1+e^{2iL(2\pi f)})= - 2 italic_e start_POSTSUPERSCRIPT - 3 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ( - 1 + italic_e start_POSTSUPERSCRIPT 2 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ) (56a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1e4iL(2πf)absent1superscript𝑒4𝑖𝐿2𝜋𝑓\displaystyle=1-e^{-4iL(2\pi f)}= 1 - italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT (56b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0absent0\displaystyle=0= 0 (56c)

In case the same glitch is happening on TM13𝑇subscript𝑀13TM_{13}italic_T italic_M start_POSTSUBSCRIPT 13 end_POSTSUBSCRIPT

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1e4iL(2πf)absent1superscript𝑒4𝑖𝐿2𝜋𝑓\displaystyle=1-e^{-4iL(2\pi f)}= 1 - italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT (57a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0absent0\displaystyle=0= 0 (57b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =2e3iL(2πf)(1+e2iL(2πf))absent2superscript𝑒3𝑖𝐿2𝜋𝑓1superscript𝑒2𝑖𝐿2𝜋𝑓\displaystyle=-2e^{-3iL(2\pi f)}(-1+e^{2iL(2\pi f)})= - 2 italic_e start_POSTSUPERSCRIPT - 3 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ( - 1 + italic_e start_POSTSUPERSCRIPT 2 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ) (57c)

or on TM31𝑇subscript𝑀31TM_{31}italic_T italic_M start_POSTSUBSCRIPT 31 end_POSTSUBSCRIPT

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =4ie2iL(2πf)sin(L(2πf))absent4𝑖superscript𝑒2𝑖𝐿2𝜋𝑓𝐿2𝜋𝑓\displaystyle=4ie^{-2iL(2\pi f)}\sin(L(2\pi f))= 4 italic_i italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT roman_sin ( italic_L ( 2 italic_π italic_f ) ) (58a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0absent0\displaystyle=0= 0 (58b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1+e4iL(2πf)absent1superscript𝑒4𝑖𝐿2𝜋𝑓\displaystyle=-1+e^{-4iL(2\pi f)}= - 1 + italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT (58c)

or on TM23𝑇subscript𝑀23TM_{23}italic_T italic_M start_POSTSUBSCRIPT 23 end_POSTSUBSCRIPT

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0absent0\displaystyle=0= 0 (59a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1+e4iL(2πf)absent1superscript𝑒4𝑖𝐿2𝜋𝑓\displaystyle=-1+e^{-4iL(2\pi f)}= - 1 + italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT (59b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =4ie2iL(2πf)sin(L(2πf))absent4𝑖superscript𝑒2𝑖𝐿2𝜋𝑓𝐿2𝜋𝑓\displaystyle=4ie^{-2iL(2\pi f)}\sin(L(2\pi f))= 4 italic_i italic_e start_POSTSUPERSCRIPT - 2 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT roman_sin ( italic_L ( 2 italic_π italic_f ) ) (59c)

and finally in case of TM23𝑇subscript𝑀23TM_{23}italic_T italic_M start_POSTSUBSCRIPT 23 end_POSTSUBSCRIPT:

𝒯X1g(f)superscriptsubscript𝒯subscript𝑋1𝑔𝑓\displaystyle\mathcal{T}_{X_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =0absent0\displaystyle=0= 0 (60a)
𝒯Y1g(f)superscriptsubscript𝒯subscript𝑌1𝑔𝑓\displaystyle\mathcal{T}_{Y_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =2e3iL(2πf)(1+e2iL(2πf))absent2superscript𝑒3𝑖𝐿2𝜋𝑓1superscript𝑒2𝑖𝐿2𝜋𝑓\displaystyle=-2e^{-3iL(2\pi f)}(-1+e^{2iL(2\pi f)})= - 2 italic_e start_POSTSUPERSCRIPT - 3 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ( - 1 + italic_e start_POSTSUPERSCRIPT 2 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT ) (60b)
𝒯Z1g(f)superscriptsubscript𝒯subscript𝑍1𝑔𝑓\displaystyle\mathcal{T}_{Z_{1}}^{g}(f)caligraphic_T start_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT ( italic_f ) =1e4iL(2πf)absent1superscript𝑒4𝑖𝐿2𝜋𝑓\displaystyle=1-e^{-4iL(2\pi f)}= 1 - italic_e start_POSTSUPERSCRIPT - 4 italic_i italic_L ( 2 italic_π italic_f ) end_POSTSUPERSCRIPT (60c)

For the second generation TDI a similar computation can be done using Eq. 3 in section III.3.

Appendix B Posteriors of SNR 1152 MBHB with high SNR glitch of 1544 at different injection times with respect to its merger time

The impact of a glitch in the recovery of a MBHB is maximised when the glitch occurs closer to the merger; this, in practice, means that the same type of glitch would not have as great an impact if it happened at a different time. To show this, we also report below the impact on parameter estimation of the fiducial MBHB with SNR 1152 and redshift 4 – considered also in the analysis of the “Light-Spritz” data set – with the 1544 SNR glitch injected at different times with respect to maximum overlap computed in section VI: five hours before the merger and five minutes after the merger (Fig. 24). In all cases, noisy data have been used, which introduce some random statistical fluctuations that are visible in the posteriors.

Looking at Fig. 24, where the glitch has been injected five hours before the merger time, we can see that not fitting for the glitch causes less bias with respect to the glitch occurring five minutes after the merger. We expect that glitches happening earlier (or later with respect to the merger time) would have an even smaller effect. If the glitch occurs closer to or at the merger, not fitting for it clearly biases the parameters of the MBHB, such as the masses, spins or sky positions. This behaviour stems from the fact that the merger is the loudest part of the signal, and drives the parameter estimation of the MBHB, so glitches that corrupt that part of the signal have a stronger effect.

Refer to caption
Figure 24: Posteriors of MBHB 1152 SNR along with posterior of the same MBHB with a 1544 SNR glitch (##\## 248) injected five minutes after the merger time and injected five hours before the merger time

Appendix C Posteriors of SNR 320 MBHB with glitches N41𝑁41N41italic_N 41 and N3𝑁3N3italic_N 3 injected at their maximum overlap

In Fig. 10 are reported the posteriors of an SNR 320 MBHB when the two glitches corresponding to glitch #41#41\#41# 41 and #3#3\#3# 3 of the LPF catalogue are injected. The parameters of the MBHB are reported in Table 8. The glitches were injected at their maximum overlap, that is 0.19[h]0.19delimited-[]0.19[h]0.19 [ italic_h ] after the merger for glitch #41#41\#41# 41, and 0.11[h]0.11delimited-[]0.11[h]0.11 [ italic_h ] after the merger for glitch #3#3\#3# 3, as can be seen from the Figure 25. It is possible to observe that sky parameters have multimodality. Indeed, there are up to eight modes in the sky that are exactly degenerate for a fixed LISA configuration and non-evolving sources. However, this degeneracy is broken by LISA’s motion and changes in frequency. High-mass and short-duration sources, like this case, are more likely to exhibit such degeneracies. The degeneracy primarily involves sky location and polarization (βdsubscript𝛽𝑑\beta_{d}italic_β start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT, ψ𝜓\psiitalic_ψ, λdsubscript𝜆𝑑\lambda_{d}italic_λ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT and cos(ι)𝜄\cos(\iota)roman_cos ( italic_ι )Marsat et al. (2021); Katz (2022), as can also be seen in the posteriors plot showing the extrinsic parameters (right plot Fig. 26). The left plot in Fig. 26 shows the posteriors for the intrinsic parameters.

Parameter True Value
ln(Mt/M)𝑙𝑛subscript𝑀𝑡subscript𝑀direct-productln(M_{t}/M_{\odot})italic_l italic_n ( italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT / italic_M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT ) 17.91
q𝑞qitalic_q 0.33
a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT 0.3
a2subscript𝑎2a_{2}italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT 0.4
dL[Gpc]subscript𝑑𝐿delimited-[]𝐺𝑝𝑐d_{L}[Gpc]italic_d start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT [ italic_G italic_p italic_c ] 47.6
φrefsubscript𝜑𝑟𝑒𝑓\varphi_{ref}italic_φ start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] 1.2
cosι𝜄\cos\iotaroman_cos italic_ι 0.29
λ𝜆\lambdaitalic_λ [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] 2.0
sinβ𝛽\sin\betaroman_sin italic_β 0.82
ψ𝜓\psiitalic_ψ [rad]delimited-[]𝑟𝑎𝑑[rad][ italic_r italic_a italic_d ] 1.6
tref[h]subscript𝑡𝑟𝑒𝑓delimited-[]t_{ref}[h]italic_t start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT [ italic_h ] 30
Table 8: Massive Black Hole Binary true values. Values of the masses are given in the detector frame.
Refer to caption
Figure 25: Overlap between the MBHB and glitch signals as a function of the glitch injection time relative to the merger, where the decay time and amplitude of the glitches are kept fixed. The red and blue solid curves are for the two glitches with SNRs of 21 and 72, respectively. The GW signal is fixed to that of the parameters in Table 8. Dashed vertical lines with matching colours denote the times that maximise the overlap.
Refer to caption
Refer to caption
Figure 26: Posteriors of a subsets of intrinsic parameters (left plot) and a subsets of extrinsic parameter (right plot) for the parameters of an MBHB with SNR 320 obtained from data containing only the MBHB, compared to those obtained from data containing a glitch of SNR 21 and of SNR 72, when the latter are not fitted.

Appendix D Discussion on posteriors for noise estimated during the search phase

Fig. 27 shows the posterior distributions for the three noise parameters (N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, N2subscript𝑁2N_{2}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, and NTMsubscript𝑁TMN_{\mathrm{TM}}italic_N start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT), estimated during the search for segments N0, N2, N5, N8, N10, N13, N15, N20, and N26, along with the true values reported in section VII.2. The TM acceleration noise is not well characterized, showing a bias away from the true value. This bias increases for data chunks that include the late inspiral phase of the MBHB (segments N26–N28), as this occurs in the frequency range around 103superscript10310^{-3}10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT Hz where TM noise dominates, as also visible in Fig. 28.
The N2subscript𝑁2N_{2}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT component spans the entire prior range, indicating poor constraint. In contrast, N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is better estimated, as it dominates at higher frequencies. Even with just about a day of data, its value can be meaningfully constrained. On the other hand, the TMI noise remains poorly characterized, as its contribution lies in a frequency range where it is overshadowed by the stronger OMS noise, as discussed in section VII.2. Otherwise, in other words, the OMS noise that enters in N2subscript𝑁2N_{2}italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is much smaller than the OMS noise that contributes to N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT. Longer data segments are needed to improve the inference of TMI noise, as demonstrated in section VI, where the fit is performed on the full “Light-Spritz” dataset. Nonetheless, this initial estimate is useful as an indicator of a reasonable prior range to be used during the parameter estimation phase, where the full data set is analysed.

Refer to caption
Figure 27: Estimation of amplitude for the components of the test mass acceleration noise, test mass interferometer, inter spacecraft interferometer and reference interferometer
Refer to caption
Figure 28: Power spectral density of the TDI A𝐴Aitalic_A and E𝐸Eitalic_E estimated during the search phase for the stretch N26 where the inspiral-phase of the MBHB is present.

Appendix E Orthogonality of FFT of “Spritz”-like data.

In section VII.1, we propose to low-pass filter the “Light-Spritz” data before taking the FFT to better approximate a diagonal likelihood. To verify this approach, we produce a toy-model with simplified “Spritz”-like noise in the time domain and investigate the covariance matrix of the FFT. We use the same sampling rate as in “Spritz”, fs=0.2 Hzsubscript𝑓𝑠times0.2hertzf_{s}=$0.2\text{\,}\mathrm{Hz}$italic_f start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT = start_ARG 0.2 end_ARG start_ARG times end_ARG start_ARG roman_Hz end_ARG, and produce 500 realizations of a noise data stream of 4000 samples. The noise in this toy-model is the sum of the two components,

ntoy=nTM+nOMS,subscript𝑛toysubscript𝑛TMsubscript𝑛OMSn_{\mathrm{toy}}=n_{\mathrm{TM}}+n_{\mathrm{OMS}},italic_n start_POSTSUBSCRIPT roman_toy end_POSTSUBSCRIPT = italic_n start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT + italic_n start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT , (61)

where the noise PSDs of nTMsubscript𝑛TMn_{\mathrm{TM}}italic_n start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT and nOMSsubscript𝑛OMSn_{\mathrm{OMS}}italic_n start_POSTSUBSCRIPT roman_OMS end_POSTSUBSCRIPT are given by CaccffdSTMsuperscriptsubscript𝐶accffdsubscript𝑆TMC_{\mathrm{acc}}^{\mathrm{ffd}}S_{\mathrm{TM}}italic_C start_POSTSUBSCRIPT roman_acc end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT roman_TM end_POSTSUBSCRIPT and CdispffdSomsISIsuperscriptsubscript𝐶dispffdsuperscriptsubscript𝑆omsISIC_{\mathrm{disp}}^{\mathrm{ffd}}S_{\mathrm{oms}}^{\mathrm{ISI}}italic_C start_POSTSUBSCRIPT roman_disp end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ffd end_POSTSUPERSCRIPT italic_S start_POSTSUBSCRIPT roman_oms end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_ISI end_POSTSUPERSCRIPT, as described in section VII.2.

We approximate these noise shapes by adding white noise time series, shaped using repeated applications of cumsum and gradient operations. We then inject this noise into the variable η12subscript𝜂12\eta_{12}italic_η start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT of the 2nd generation TDI Michelson combination (see Eq. (3)), evaluated with equals arms and computed using the open source software pytdi Staab et al. (2023). Expressed in terms of delay operators, this corresponds to

ntoyTDI=(1D2)(1D4)ntoy.superscriptsubscript𝑛toyTDI1superscript𝐷21superscript𝐷4subscript𝑛toyn_{\mathrm{toy}}^{\mathrm{TDI}}=(1-D^{2})(1-D^{4})n_{\mathrm{toy}}.italic_n start_POSTSUBSCRIPT roman_toy end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TDI end_POSTSUPERSCRIPT = ( 1 - italic_D start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ( 1 - italic_D start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT ) italic_n start_POSTSUBSCRIPT roman_toy end_POSTSUBSCRIPT . (62)

We then consider three scenarios:

  1. 1.

    Directly compute the FFT without any pre-processing

  2. 2.

    Compute the FFT after applying a window function (hann window)

  3. 3.

    Compute the FFT after applying a low-pass filter, as described in section VII.1.

After computing the FFT series, we estimate the covariance matrix by computing

Cij=E[FFTiFFTj],subscript𝐶𝑖𝑗𝐸delimited-[]subscriptFFT𝑖superscriptsubscriptFFT𝑗C_{ij}=E[\mathrm{FFT}_{i}\mathrm{FFT}_{j}^{*}],italic_C start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_E [ roman_FFT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_FFT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ] , (63)

where FFTisubscriptFFT𝑖\mathrm{FFT}_{i}roman_FFT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the i’th frequency bin of the FFT of ntoyTDIsuperscriptsubscript𝑛toyTDIn_{\mathrm{toy}}^{\mathrm{TDI}}italic_n start_POSTSUBSCRIPT roman_toy end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_TDI end_POSTSUPERSCRIPT, considering only positive frequencies. The expectation value is approximated by averaging over our 500 realizations.

Refer to caption
Figure 29: Diagonal of FFT covariance matrix, scaled to ASD.

The square root of the diagonal elements of this matrix are displayed in Fig. 29, rescaled by 2fsw[i]22subscript𝑓𝑠𝑤superscriptdelimited-[]𝑖2\sqrt{\frac{2}{f_{s}\sum{w[i]^{2}}}}square-root start_ARG divide start_ARG 2 end_ARG start_ARG italic_f start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ∑ italic_w [ italic_i ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG to approximate an ASD Heinzel et al. (2002). Here, w[i]𝑤delimited-[]𝑖w[i]italic_w [ italic_i ] are the window function coefficients applied to the i’th sample.

The direct FFT is strongly affected by spectral leakage and deviates from the expected ASD at low frequencies, whereas the windowed FFT and filtered FFT approximate their respecitve model well. Note that the low-pass filter is accounted for in the modelled noise shape for the latter.

To study correlations between frequency bins with the three methods, we normalize Cijsubscript𝐶𝑖𝑗C_{ij}italic_C start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT by the values on the diagonal and plot

C~ij=|Cij/CiiCjj|.subscript~𝐶𝑖𝑗subscript𝐶𝑖𝑗subscript𝐶𝑖𝑖subscript𝐶𝑗𝑗\tilde{C}_{ij}=|C_{ij}/\sqrt{C_{ii}C_{jj}}|.over~ start_ARG italic_C end_ARG start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = | italic_C start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT / square-root start_ARG italic_C start_POSTSUBSCRIPT italic_i italic_i end_POSTSUBSCRIPT italic_C start_POSTSUBSCRIPT italic_j italic_j end_POSTSUBSCRIPT end_ARG | . (64)

The results are displayed in Fig. 30. For all three scenarios, we display the matrix for the full frequency range on the left as well as a zoomed in version of just the first 100 frequency bins on the right. The top row of Fig. 30 is the direct FFT. It shows strong long-range correlations around the frequencies corresponding to the zeros of the TDI transfer function as well as at DC, but otherwise a narrow diagonal. The middle row is the windowed FFT. The window strongly suppresses the long-range correlations, but introduces strong correlations in a narrow band along the diagonal. Finally, the filtered FFT shows even stronger correlations at high frequencies, but is perfectly diagonal in the band up to 20 mHztimes20millihertz20\text{\,}\mathrm{mHz}start_ARG 20 end_ARG start_ARG times end_ARG start_ARG roman_mHz end_ARG used for our analysis. This validates the use of a diagonal likelihood model in Fig. VII.1.

Refer to caption
Refer to caption
Refer to caption
Figure 30: Numerically estimated covariance matrix of FFT. Top: no pre-processing, middle: window function, bottom: low-pass filter

Appendix F Bayes factor computation

In Fig. 31, we report three examples of the stretches N6, N24 and N29 as an illustration of the methodology to compute the Bayes factor with thermodynamic integration. We use 70 temperatures to perform the thermodynamic integration method, where β=1kT𝛽1𝑘𝑇\beta=\frac{1}{kT}italic_β = divide start_ARG 1 end_ARG start_ARG italic_k italic_T end_ARG and T𝑇Titalic_T represents the temperatures. As expected Karnesis et al. (2023), the stepping-stone algorithm produces more accurate estimates of the marginal likelihood than methods that use thermodynamic integration Xie et al. (2011); this is why in the computation shown here, the errors on the Bayes factor are higher. The y-axis shows the average log-likelihood at each temperature for the MBHB (glitch) model, and the x-axis shows the inverse of the temperature. This curve is integrated to calculate the log evidence, logZ𝑍\log Zroman_log italic_Z. The difference between the log evidences of the MBHB model versus the glitch model gives the log Bayes factor, logMBHB/glitchsubscriptMBHBglitch\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 31: Top row: logMBHB/glitch5.57×104subscriptMBHBglitch-5.57E4\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}\approx$-5.57\text{\times}{10}^% {4}$roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT ≈ start_ARG - 5.57 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG 4 end_ARG end_ARG with dlogMBHB/glitch=11843.7𝑑subscriptMBHBglitch11843.7d\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}=11843.7italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT = 11843.7 for stretch N6. Middle row: logMBHB/glitch2.93×104subscriptMBHBglitch-2.93E4\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}\approx$-2.93\text{\times}{10}^% {4}$roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT ≈ start_ARG - 2.93 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG 4 end_ARG end_ARG with dlogMBHB/glitch=55796.5𝑑subscriptMBHBglitch55796.5d\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}=55796.5italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT = 55796.5 for stretch N24. Bottom row: logMBHB/glitch1.56×107subscriptMBHBglitch1.56E7\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}\approx$1.56\text{\times}{10}^{% 7}$roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT ≈ start_ARG 1.56 end_ARG start_ARG times end_ARG start_ARG power start_ARG 10 end_ARG start_ARG 7 end_ARG end_ARG with dlogMBHB/glitch=28048.2𝑑subscriptMBHBglitch28048.2d\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}=28048.2italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT = 28048.2 for stretch N29. As before, dlogMBHB/glitch𝑑subscriptMBHBglitchd\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}italic_d roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT gives an estimate on the error on logMBHB/glitchsubscriptMBHBglitch\log\mathcal{B}_{\mathrm{MBHB}/\mathrm{glitch}}roman_log caligraphic_B start_POSTSUBSCRIPT roman_MBHB / roman_glitch end_POSTSUBSCRIPT.

References