Bayesian learning for accurate and robust biomolecular force fields

Vojtech Kostal Brennon L. Shanks Pavel Jungwirth [email protected] Hector Martinez-Seara [email protected] Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 166 10 Prague 6, Czech Republic

(November 7, 2025)

Abstract

Molecular dynamics is a valuable tool to probe biological processes at the atomistic level — a resolution often elusive to experiments. However, the credibility of molecular models is limited by the accuracy of the underlying force field, which is often parametrized relying on ad hoc assumptions. To address this gap, we present a Bayesian framework for learning physically grounded parameters directly from ab initio molecular dynamics data. By representing both model parameters and data probabilistically, the framework yields interpretable, statistically rigorous models in which uncertainty and transferability emerge naturally from the learning process. This approach provides a transparent, data-driven foundation for developing predictive molecular models and enhances confidence in computational descriptions of biophysical systems. We demonstrate the method using 18 biologically relevant molecular fragments that capture key motifs in proteins, nucleic acids, and lipids, and, as a proof of concept, apply it to calcium binding to troponin – a central event in cardiac regulation.

Introduction

Molecular dynamics (MD) simulations represent a powerful tool for probing complex biological and other soft matter systems at atomistic resolution [1]. At this resolution scale, empirical fixed-charge force fields remain the most practical choice to access relevant time and length scales [2]. Their efficiency comes at the cost of simplifying non-bonded interactions to pairwise Coulomb terms and dispersion-repulsion contributions, typically modeled by the Lennard-Jones potential [3]. While these simplified potential forms enable large-scale simulations, they inherently neglect the explicit electronic structure. To exploit the full power of this approach for accurate modeling of condensed-phase behavior therefore requires careful force field parametrization.

Direct parameterization of an entire biological system based solely on experimental data is generally unfeasible, as the number of adjustable parameters far exceeds the number of available observables, leading to an under-determined optimization problem [4]. A widely adopted strategy to overcome this limitation is to decompose the system into smaller, chemically meaningful fragments which are parameterized independently and then recombined. Established biomolecular force fields are typically developed using a combination of experimental measurements and calculated quantum chemical reference data [5, 6, 7].

Among the various parameters of a force field, atomic partial charges play a particularly critical role. These charges constitute a coarse-grained representation of the underlying electronic structure and are not themselves experimental observables. Indeed, there exists a many-to-one mapping between possible partial charge distributions and a given measurable property, making their determination inherently non-unique. Most established force field development protocols start from gas-phase or partially hydrated quantum chemical calculations combined with population analyses or charge fitting to reproduce the electrostatic potential to generate initial charge assignments [3, 8]. These values can subsequently be refined to improve agreement with condensed-phase properties. Various strategies have been proposed to account for environmental influences, including effects of electronic polarization [9, 10], such as minimal explicit hydration [11, 12], continuum solvent models [13] or empirical scaling of charges [14, 15]. However, these ad hoc approaches remain limited in their ability to consistently mimic solvation effects.

Current state-of-the-art force field parametrization methods yield as a rule a single “best” parameter set, leaving little room for modifications without risking uncontrolled degradation of performance. Experience has taught us that some parameters are more sensitive than others, but this understanding has so far relied largely on trial and error, without systematic evaluation of the ranges within which parameters can be adjusted without compromising accuracy. Certain flexibility in parameter values can be valuable, for instance when introducing a non-standard residue that must be integrated with an existing structure, potentially creating a net charge that requires redistribution, or when fine-tuning of parameters is required to reproduce an experimental observable such as a ligand binding affinity. However, making such adjustments reliably is challenging, as they may inadvertently affect important properties of the system—such as hydration structures or hydrogen-bond networks—that are critical for biomolecular stability. In this light, a method that provides not only the optimal parameter set but also confidence intervals within which the force field can be safely modified would be highly desirable and beneficial to the community.

To depart from the traditional paradigm, we designed a Bayesian framework for force field parameterization that provides rigorous predictions of both optimal parameters and confidence intervals. We anchor the force field parameterization to ab initio molecular dynamics (AIMD) in explicit solvent, which allows us to naturally capture the effect of the environment without any ad hoc corrections for the gas-to-liquid transition. Moreover, we leave maximum flexibility by not binding the method to any specific model for water and ions, making it fully general and ready to be applied to any underlying model of choice.

Results

Overview of the Method

Here, we present a general and robust workflow for optimizing partial charge distributions in molecules within the framework of fixed-charge models. Unlike for neat solvents or simple ion solutions, experimental data directly applicable for parametrization are rarely available for biomolecular species, making it difficult to anchor parameter optimization to the closest ground truth data and its associated deviations. To circumvent this limitation, we focus on condensed-phase reference data obtained from AIMD simulations based on density functional theory, which have become more accessible due to improvements in computational methods and infrastructure. These simulations inherently capture many-body interactions, including electronic polarization, thus providing a physically realistic structural benchmark at an acceptable computational cost. By linking parameter optimization to high-level simulations of the target environment, we ensure that the resulting force field parameters accurately reproduce the condensed-phase behavior of the molecules within the accuracy limits imposed by the force field’s functional form and its quantum mechanical reference.

The core of our approach is a generalizable method for learning partial charge distributions of molecular fragments, readily extensible to other force field parameters through an automated data acquisition and accelerated Bayesian inference framework (Fig. 1). Our method learns classical force field parameters from AIMD simulations of solvated molecular fragments, thereby incorporating solvent-mediated polarization effects that are pervasive in condensed phase systems. In this approach, force field molecular dynamics (FFMD) simulations with trial parameters were used to generate quantities of interest (QoIs), such as radial distribution functions and hydrogen bond order (Fig. 11), which were then emulated by local Gaussian process (LGP) surrogate models (Fig. 12). The LGP surrogate predictions enabled efficient evaluation of the likelihood of candidate parameter sets against reference AIMD data, and when combined with prior information, the approach was used to learn the posterior distribution with Markov chain Monte Carlo, as schematically depicted in Figure 1. By iteratively sampling from this posterior, the method refined the parameter estimates until convergence, yielding charge distributions that balance accuracy and uncertainty inherent to the Bayesian formulation.

Refer to caption — Figure 1: Overview of the Bayesian inference workflow. The workflow begins with data acquisition (a): a set of $N$ randomized partial charge vectors $\bm{\theta}$ collected in matrix $\mathbf{X}$ are used to run FFMD simulations and the corresponding QoI outputs are stored in matrix $\mathbf{Y}$ . AIMD simulations provide reference trajectories and QoIs are extracted into vector $\bm{y}$ . The training output matrix $\mathbf{Y}$ is partitioned column-wise into sub-matrices $\mathbf{Y}^{(k)}$ corresponding to the $k$ -th QoI. In surrogate modeling (b), a separate LGP is trained for each QoI. Kernel hyperparameters are optimized to the FFMD dataset $\{\mathbf{X},\mathbf{Y^{(k)}}\}$ using leave-one-out marginal likelihood. The resulting inverse kernel matrix is precomputed to construct the final LGP surrogate. In parameter optimization (c), Bayes’ theorem combines a prior probability with the likelihood of observing the AIMD reference features $\bm{y}$ given predicted FFMD QoIs from the LGPs. Posterior sampling is performed using Markov Chain Monte Carlo (MCMC). In each iteration, new parameters $\bm{\theta}^{*}$ and QoI-specific nuisance parameters $\bm{n}^{*}$ are proposed, FFMD QoIs $\bm{y}(\bm{\theta}^{*})$ are predicted from the set of surrogates, and the likelihood is evaluated. Multiplying the likelihood by the prior yields the posterior value up to a normalization constant. The MCMC loop continues until the posterior distribution converges, yielding the optimized force field parameter distribution.

We emphasize that the computational tractability of our method hinges on the LGP regressor [16]. This surrogate predicts the structural QoIs from a given set of trial charges at a fraction of the time of an FFMD simulation, allowing for efficient evaluation of new candidates during Markov chain Monte Carlo sampling without the need to repeatedly run costly FFMD simulations. LGPs were favored over black-box machine learning tools, such as neural networks, since they can incorporate interpretable, physics-informed priors that effectively model structural properties in bulk liquids [17] and aqueous ions [18].

We demonstrate the strength and generality of our methodology by optimizing charge distributions for a set of biologically relevant molecular species ranging from neutral, cationic, to anionic motifs, including carboxylates, phosphates, sulfates, carbonyls, and positively charged nitrogen centers characteristic of protein sidechains, lipid headgroups, nucleic acids, and glycans. As a proof-of-principle, we then apply the acetate parameterization to model calcium binding to troponin — a key event regulating the heartbeat that is challenging to simulate due to the large number of charged groups and the calcium ion involved. Together, these results establish a systematic strategy for deriving fragment-based force field parameterizations that bridge electronic-structure accuracy and classical simulation efficiency, with broad applications in biomolecular modeling.

Parameterization of Partial Charge Distributions

We applied the Bayesian inference framework to refine partial charge distributions for 18 biologically important motifs, illustrated in Figure 2a. Starting from the CHARMM36 baseline, we learned the partial charges while leaving bonded and van der Waals parameters unchanged – a decision that affects the resulting optimized charges. Here, we applied the electronic continuum correction (ECC) as a baseline framework for charges that includes electronic polarization in a mean field manner [19, 20, 21, 15, 22]. We enforced a global constraint on the total molecular charge, scaled by a factor of 0.8. This scaling factor was adopted from our recent work, together with compatible models of water [23] and ions [18].

For each molecule, the training set comprised several thousand charge distributions sampled from a physically motivated truncated normal prior. Charge bounds were chosen based on ranges typical of established force fields and extended to allow broader, yet physically meaningful, variation. QoIs were extracted for each charge distribution from three independent simulation setups: (i) the aqueous solute, (ii) the solute with a counterion in direct contact, and (iii) the solute with a solvent-shared counterion. After training an LGP surrogate to map partial charges to the QoIs, the Markov chain Monte Carlo engine was used to sample the posterior distribution of partial charges. The accuracy of the QoIs relative to the AIMD reference was assessed using the normalized mean absolute error (NMAE) from ten posterior samples with different partial charge distributions.

Figure 2b summarizes the overall performance of the method. The optimized charges show consistent agreement with AIMD across all metrics. Hydration structure errors, characterized by radial distribution functions (RDFs), remain below 5% for most species, demonstrating robust reproduction of solvation structure. Hydrogen bond counts typically deviate by less than 10–20%, with larger variability arising from the rigid water model, which to a certain extent limits simultaneous reproduction of RDFs and hydrogen-bond statistics. Ion-pair distance distributions generally show deviations under 20%, with somewhat larger errors for some anionic species. The residual errors largely reflect the trade-off inherent in simultaneously reproducing all the QoIs. For comparison, Supplementary Figure S3 shows the performance obtained with parameters optimized individually for each QoI.

Relative improvements compared to the original CHARMM36-nbfix force field are quantified in the subpanels of Figure 2b by green and red bars. Systematic improvements are observed across nearly all species and QoIs, particularly for charged systems — most notably anions — where optimized charges restore more balanced electrostatics. Even neutral molecules benefit modestly. While only the mean trends are shown for clarity, individual posterior samples often achieve substantially greater improvements than reflected in the averages. A full distribution of $\Delta$ values, including variability and outliers, is provided in Supplementary Figure S5.

As the parametrization was performed for system sizes and concentrations accessible to AIMD, direct transferability to higher solute concentrations is not automatically guaranteed because explicit solute-solute interactions are missing in the training and reference data. To assess this potential effect, we validated our charge distributions against known aqueous solution densities using posterior charge samples, as shown in Figure 2c. For charged species, counterions (sodium, potassium or chloride) were selected based on available experimental data, and the corresponding concentration ranges extended up to their experimental solubilities. Across nearly all solutions, simulated densities deviated from experiment by less than 1%, even at solute concentrations approaching a 50% mass ratio. This agreement — despite no explicit training on solute–solute interactions — provides strong evidence for the robustness and transferability of the optimized charges.

Chemical Intuition - Learning from the Posterior

Building on the validation of optimized charges against AIMD and experimental data, we next examine the chemical insight provided by the Bayesian framework. Figures 3a-c present the posterior means and associated 95% confidence intervals for all atoms across the studied molecules. These intervals, known as 1D marginals of the partial charge joint probability distributions, naturally serve as a quantified form of chemical intuition by delineating ranges of charge values that preserve consistency with the reference AIMD data. In this way, the posterior charge distributions serve as a robust design space for further tuning and sensitivity analysis.

For example, atomic partial charges with narrow confidence intervals such as is the case for oxygen atoms (red-colored in Figure 3a) are tightly defined by the structural QoI. Thus, they play a critical role in reliably representing the reference data. In contrast, broader confidence intervals (e.g., for carbon or phosphorus atoms) indicate that these atoms have a less critical influence on the QoIs and can accommodate more variability without degrading agreement with the reference. This qualitative observation agrees with chemical intuition: atoms directly exposed to the environment are more important in reproducing solvation structure than those buried deeper within the molecule and shielded by other atoms.

Figure 3d offers a graph-based perspective of the optimized charges. The graph layout was generated using community detection and force-directed placement, providing both topological and chemical organization. Atoms are clustered based on chemical similarity into functional groups, with nodes representing atomtypes (element plus bonding environment) and edges indicating covalent connectivity. Node colors encode the most probable partial charge, spanning approximately $-0.8e$ to $+0.9e$ , a narrower range than in the CHARMM36-nbfix force field ( $-1.0e$ to $+1.5e$ ), in part due to the ECC scaling factor.

Despite optimizing charges independently for each molecule, clear chemical trends emerge. Strongly negative atom types correspond to oxygen or nitrogen atoms. Terminal oxygens in carboxylates and phosphates cluster between $-0.65e$ and $-0.55e$ , while sulfates are slightly less negative ( $\sim-0.4e$ ), reflecting charge delocalization over three equivalent atoms. Hydrogenphosphate oxygens reach about $-0.67e$ . Carbonyl oxygens span $-0.7e$ to $-0.4e$ , consistent with their terminal, non-resonance-stabilized positions, and hydroxyl oxygens are typically around $-0.7e$ , except in hydrogenphosphate ( $-0.4e$ ), likely due to nearby electronegative sites. Ether oxygens (bound to two heavy atoms) are less polarized ( $-0.3e$ to $-0.1e$ ). Nitrogens display a broad spread ( $-0.9e$ to $-0.4e$ ), except for quaternary nitrogen in tetramethylammonium, which can also hold positive values, reflecting strong electron withdrawing effects of methyl groups. Hydrogens bound to electronegative atoms (O or N) are significantly positive, while aliphatic hydrogens are weakly positive, with a few exceptions showing small negative values. Carbon atomtypes show the widest charge variation. Methyl carbons are weakly negative but can reach up to $-0.4e$ , methylene carbons are typically positive ( $+0.2e$ ), and carbonyl carbons are strongly positive, consistent with electron-withdrawing effects from the resonant oxygens. These trends closely align with chemical intuition, supporting the physical plausibility of the optimized charges.

Consistency across independent molecule-specific optimizations demonstrates the robustness of the Bayesian framework, which not only fits reference data but also captures generalizable chemical principles with rigorous uncertainty quantification. This makes the approach highly promising for transferable parameterization of molecular fragments in condensed-phase simulations, as demonstrated in the next section. Indeed, knowledge of the plausible confidence intervals can be used to incorporate small fragments into larger molecules or more complex environments, for instance, by indicating which atoms can absorb any excess of charge while minimizing perturbations to the QoIs. Finally, parameters with higher uncertainty can be identified and more aggressively refined by selecting new or additional reference data. This strategy not only guides future parameterization efforts but also provides critical insight into how the force field can be further improved.

Proof-of-Principle: Calcium Binding to Cardiac Troponin C

We leverage the fragment-level chemical intuition provided by the Bayesian posterior to demonstrate how our method can be successfully applied in a biologically relevant context. To this end, we evaluated the binding free energy of Ca²⁺ to the regulatory domain of human cardiac troponin C (N-cTnC) (Figure 4a), which is the primary Ca²⁺ sensor that regulates cardiac contraction.

To simulate the entire protein, a complete and self-consistent set of force field parameters was required. For N-cTnC, this corresponds to an inherently high-dimensional parameter learning problem, which we addressed using a fragment-based Bayesian approach. Keeping the protein backbone parameters fixed, all charged side chains — carboxymethyl, guanidinium, and ethylammonium — were parameterized within this framework to ensure internal consistency, thereby reducing the dimensionality of the parameter space.

Partial charges of the carboxymethyl side chains, the principle charged motif of the Asp and Glu residues in the EF-hand loop of N-cTnC (Figure 4a–c), were modeled using samples drawn from the posterior distribution of acetate (Figure 4d–e). The positively charged side chain of arginine was parameterized using guanidinium, while the lysine side chain was derived from ethylammonium. For a schematic breakdown of the amino acid side chains, see Supplementary Figure S6. A uniform charge scaling factor of 0.8, consistent with the ECC framework [18], was applied to all charged residues and incorporated directly during parameterization. To maintain overall protein electroneutrality, a small compensating charge (up to $\pm 0.1e$ ) was added to or subtracted from the atom with the largest posterior marginal uncertainty, typically a buried carbon atom.

Distinct sets of partial charges, including the maximum a posteriori (MAP), were sampled from each fragment’s posterior distribution, yielding multiple N-cTnC force field parametrizations (Figure 4e and Supplementary Table S1). Binding free energies of Ca²⁺ to N-cTnC obtained with the MAP and ten different partial charge sets are shown in Figure 4d as bullets. We see a correlation between $\Delta$ G ${}_{\text{bind}}$ and partial charge of the carboxyl oxygen, which is the leading direct binding partner of Ca²⁺ in the EF-hand loop. The predicted values are in good agreement with the experimental value of $-28.6$ kJ/mol [24], deviating by less than 10 kJ/mol, with certain parameter sets achieving near-quantitative accuracy of $\sim$ 1 kJ/mol.

Some samples, however, exhibit minor over- or underbinding. Specifically, the MAP parameterization underbinds by $\sim 6$ kJ/mol. This indicates that the partial charge distribution that best represents the acetate ion in bulk water is not necessarily optimal in this biological context. This observation is not surprising, as the acetate moiety is not identical to the carboxymethyl pattern found in amino acid sidechains. More broadly, this exemplifies a central limitation of fixed-charge force field optimizers; namely, their inability to adapt seamlessly to new chemical contexts. This observation thus underscores the importance of the Bayesian approach: by capturing parameter uncertainties, we obtain a more complete and nuanced view of the model’s predictive capabilities.

For comparison, we also evaluated the binding free energies using several established force fields — CHARMM36/CHARMM36-nbfix [25], and ProsECCo75 [26] — shown as triangles in Figure X. CHARMM36 yields a substantially overestimated binding free energy of $-105.7$ kJ mol^-1, primarily due to its neglect of electronic polarization. ProsECCo75, which is a CHARMM36 derivative that approximates polarization effects in a mean-field manner through the ECC scheme, instead shows pronounced underbinding of $-13.5$ kJ/mol. The CHARMM36-nbfix variant yields a value of $-31.7$ kJ/mol which is close to the experiment. However, this agreement arises from compensation of errors, namely from empirical, pair-specific adjustments to van der Waals radii to counteract overly strong ion–ion electrostatic interactions. Interestingly, we observe here a correlation between accuracy and dipole moment of the CH₂COO^- motif: sets with dipole moments of 2.7-4.4 D around the most probable oxygen charge region ( $\sim$ 0.57) best reproduce the experimental binding free energy, as illustrated in Figure 4g. These dipole moments are markedly smaller than CHARMM36/CHARMM36-nbfix (8.9 D) and prosECCo75 (6.6 D).

I Discussion

Empirical force field design for biomolecules is often as much art as science. The high-dimensional parameter landscapes, sensitivity to loss-function definitions, and dependence on available reference data can lead to markedly different models, often without clear justification or enough information to declare a clear winner. This lack of transparency has been increasingly recognized in the biomolecular modeling community [27] and has prompted calls for rigorous assessments of both uncertainty and bias [5].

Bayesian inference offers a principled route to address these concerns, providing a gold-standard framework for quantifying uncertainty given limited information [28, 29]. Its primary drawback is computational: the curse of dimensionality and the cost of repeated evaluations often render Bayesian methods impractical for complex, high-dimensional tasks. Consequently, the adoption of Bayesian inference in force field development has been limited, with prohibitive computational costs often outweighing the clear benefits of rigor and interpretability.

In the spirit of transparency and reproducibility, we combined machine learning accelerated Bayesian inference with GPU computing to create an efficient, transferable, and physically grounded tool for learning fixed-charge force field parameters directly from condensed-phase AIMD reference data. This approach enables parameter inference at scales and system complexities beyond what was previously feasible. As an example presented here, we extended Bayesian learning of classical force fields from bulk liquids-argon [30], neon [17], and water [31]-to 18 biologically relevant molecular fragments in water, increasing parameter dimensionality from 3–5 in prior studies up to 10. While the focus of this paper is on the optimization of charges in an explicit-solvation environment, the same methodology is also directly applicable to other force field parameters. As a matter of fact, exploratory work optimizing Lennard-Jones non-bonded parameters within the same framework points to additional improvements in accuracy in aqueous solutions.

An essential aspect of our parametrization strategy is its transferability from small molecular fragments in simple environments to more complex and biologically relevant systems. We first demonstrated this by reproducing the experimental densities of aqueous electrolyte solutions with charges derived from minimal fragments, achieving errors below 1% in most cases. Next, we extended the same parametrization to the charged residues of the regulatory domain of cardiac troponin C and calculated the free energy of Ca²⁺ binding to its EF-hand loop. The close agreement with experimental binding data provides strong evidence that parameters optimized on isolated fragments remain predictive when applied to a protein environment that differs substantially from the training set. Moreover, by sampling directly from the posterior distribution, we were able to assess the robustness of alternative parameter sets without further refinements. Our approach does not yield a single fixed solution but rather a family of statistically consistent models, leaving users the flexibility to incorporate additional criteria into their choice of parameters. Together, these results underscore the robustness and transferability of the approach across both condensed-phase solution properties and ion–protein interactions.

The generality of our framework makes it applicable to a wide range of molecular modeling challenges. Because any observable accessible from simulation can be incorporated into the likelihood, the same methodology can unify diverse data types, from ab initio calculations to experimental measurements such as thermodynamic properties and neutron/X-ray scattering patterns. Our method therefore opens the possibility of systematically integrating experimental data with simulation-based models while preserving rigorous uncertainty quantification. In this sense, the approach provides a bridge between physics-based modeling and modern data-driven methodologies, offering a common Bayesian language for reconciling calculations and experiment. Importantly, the Bayesian framework limits bias from hand-tuned parameters and makes the impact of modeling choices explicit, transparent, and measurable. Our method thus offers a principled framework for rigorously evaluating modeling assumptions in relation to target observables.

While our approach substantially broadens the applicability of Bayesian inference to force fields, it is not without limitations. The curse of dimensionality poses challenges for scaling to very large biomolecular systems, which may involve hundreds or thousands of parameters, even when leveraging GPU acceleration and machine learning surrogate models. As a result, practical applications to large-scale force field reparameterization will likely require hybrid strategies that integrate our fragment-based training with targeted refinements for system-specific properties. A further limitation is that the posterior distribution remains conditional on the modeling choices used to construct the likelihood, priors, and surrogate models. If these ingredients are mis-specified, for example due to approximations in surrogate accuracy or oversimplified physical assumptions, the resulting uncertainty estimates may underestimate or overestimate the true parametric uncertainty. In addition, the quality of inference is inherently tied to the availability and representativeness of the reference data. Sparse, noisy, or biased observables can lead to skewed posterior distributions, particularly as the dimensionality of the parameter space increases.

Despite these limitations, the significant advances demonstrated with our method suggest a new paradigm for force field development: one in which parameter optimization is no longer an art, but rather a transparent and quantitative science. By leveraging GPU acceleration and surrogate modeling, we make parameter inference not only rigorous but also practical at scales relevant to biomolecular research. The release of our methodology as an open and extensible software package democratizes access to uncertainty-aware parameterization, lowering the barrier for broad adoption and addressing the communities calls for transparency and reproducibility. We envision that this framework will help establish uncertainty quantification as a standard practice in molecular simulations, fundamentally reshaping how the community develops, validates, and applies biomolecular force fields.

II Methods

This section outlines the computational framework developed for refining partial atomic charges in classical force fields using structural observables obtained from AIMD simulations. We begin by introducing the key principles of Bayesian inference that underpin the inference process (Fig 1). The subsequent sections describe the full optimization workflow, including a detailed description of the surrogate modeling approach based on LGPs, which enables efficient approximation of classical MD outcomes. Eventually, we provide details of both the DFT- and FFMD simulations used throughout this study.

II.1 Inference Workflow

The workflow for optimization of partial charges comprises of four steps depicted in Figure 1: (a) data acquisition, (b) surrogate model training, (c) Bayesian inference, and (d) optional validation (not shown in the figure). The goal is to align structural QoI from FFMD with reference AIMD. While extension to van der Waals parameters is straightforward and implemented in the code, this study focuses on charge refinement.

Data acquisition

Initial systems (i.e., molecules in aqueous solution) are prepared using user-supplied topologies and equilibrated with FFMD. The equilibrated snapshots seed AIMD simulations. In parallel, a diverse FFMD training dataset is generated by sampling the force field parameters $\bm{\theta}$ via Latin Hypercube Sampling (LHS) implemented in SciPy [32]. The parameter samples are stored in matrix $\mathbf{X}=\left[\bm{\theta}_{1},\bm{\theta}_{2},\dots,\bm{\theta}_{N}\right]^{\top}\in\mathbb{R}^{N\times D}$ , where $N$ is the number of samples and $D=\dim(\bm{\theta})$ . The sampling is limited to chemically sound regions of the parameter space. To maintain the total molecular charge, one atom type per molecule is designated as implicit, with its charge calculated by subtracting the sum of all sampled charges from the target total molecular charge.

QoIs are computed from three simulations at both FFMD and AIMD level: solute in water, solute with a restrained nearby ion, and solute with a restrained distant ion. Structural descriptors (QoI) computed include:

1.

Partial radial distribution functions (RDF)s: Between solute atoms and water oxygens.
2.

Hydrogen bond statistics: Based on O/N/S acceptor atoms using geometric criteria of the donor-acceptor distance $<3.5$ Å and donor–hydrogen–acceptor angle $>150^{\circ}$ .
3.

Ion–solute distance probability density: For restrained ion placements for trajectories containing extra ion.

The FFMD QoIs are then stored in a block-concatenated matrix $\mathbf{Y}=\left[\mathbf{Y}^{(1)}\mathbf{Y}^{(2)}\dots\mathbf{Y}^{(M)}\right]$ , where $\mathbf{Y}^{(k)}\in\mathbb{R}^{N\times\eta_{k}}$ contains $k$ -th QoI evaluated at $\eta_{k}$ grid points. Similarly, the AIMD QoIs are stored in the vector $\bm{y}$ . FFMD matrices $\mathbf{X}$ and $\mathbf{Y}$ are used as inputs and outputs in the subsequent surrogate modeling.

Surrogate modeling

A central challenge in our Bayesian inference framework is the need to evaluate the likelihood function for each parameter set proposed by the Markov chain Monte Carlo sampler. This evaluation, in principle, requires generating and analyzing a full FFMD trajectory for every candidate set of partial charges $\bm{\theta}$ -a computationally prohibitive task given the tens to hundreds of thousands of such evaluations required for posterior convergence.

Local Gaussian Processes: To overcome this bottleneck, we introduce a surrogate model that maps $\bm{\theta}$ directly to structural QoIs derived from FFMD simulations. This surrogate model replaces the expensive FFMD simulations during inference while retaining predictive accuracy. We construct this surrogate using an LGP regressor [16], which is an efficient approximation to a Gaussian process (GP) that decomposes high-dimensional input space into a subset of independent one-dimensional GPs. Each QoI (e.g., RDFs or scalar descriptors) is predicted component-wise, with each dimension modeled by a separate GP. For a structural quantity such as an RDF, this approximation amounts to learning a set of RDF values $\{g(r_{i})|i\in 1,\dots,\eta_{k}\}$ at radial inducing points $r_{1},\dots,r_{\eta_{k}}$ , $g(r_{i})\sim GP_{i}$ , as opposed to modeling the full RDF as a single GP, $g(\bm{r})\sim GP$ .

For a trial input $\bm{\theta}$ , the zero mean LGP expectation value of the $k$ -th QoI is given by a column vector of the LGP expectation value at each inducing point $i\in 1,\dots,\eta_{k}$ :

\widehat{\bm{y}}_{k}(\bm{\theta})\approx\text{LGP}_{k}(\bm{\theta})=\mathbf{K}_{\bm{\theta},\mathbf{X}}^{(k)}\left[\mathbf{K}_{\mathbf{X},\mathbf{X}}^{(k)}+\sigma_{k}^{2}\mathbf{I}\right]^{-1}\mathbf{Y}^{(k)}

(1)

where $\sigma_{k}^{2}$ is the noise variance hyperparameter, and $\mathbf{I}$ is the identity matrix. This yields a vector prediction $\widehat{\bm{y}}_{k}(\bm{\theta})\in\mathbb{R}^{\eta_{k}}$ .

To solve Eq. (1), we must specify a LGP prior to specify the kernel (or Gram) matrix $\mathbf{K}$ in the form of a kernel function, $K$ . This function defines the underlying properties of samples drawn from the LGP and can be tailored to encode generic function properties such as smoothness, continuity, periodicity, and others [33]. Here, we choose the squared-exponential (radial basis function) kernel to construct the elements of $\mathbf{K}$ between arbitrary sample indices $i$ and $j$ :

K_{ij}^{(k)}=\alpha_{k}^{2}\exp\left[-\frac{1}{2}\sum_{d=1}^{D}\left(\frac{\theta_{d,i}-\theta_{d,j}}{\ell_{k,d}}\right)^{2}\right],

(2)

where $\alpha^{2}$ is the kernel variance which controls the vertical scale of the prediction, and $l_{d}$ are length scales that control how quickly the correlation between function values decay with distance. The set of variables $\bm{\zeta_{k}}=\{\sigma_{k},\alpha_{k},\ell_{k,1},\dots,\ell_{k,D}\}$ are known as kernel hyperparameters. The squared-exponential kernel ensures that the surrogate predictions are smooth and differentiable with respect to the force field parameters, a property expected of ensemble structural features.

To improve surrogate accuracy for RDFs, we first subtract a physically motivated sigmoid baseline from the RDF training data of the form:

s(x)=\frac{1}{1+e^{-a(x-x_{0})}},

(3)

with $x_{0}=3$ Å and $a=5$ Å^-1. This preprocessing step ensures the surrogate only learns deviations from the expected asymptotic RDF behavior and is equivalent to using Equation 3 as the LGP prior mean [34].

Hyperparameter Learning: The kernel hyperparameters for the $k$ -th QoI, $\bm{\zeta_{k}}$ , must be learned from the training data to ensure the accuracy and transferability of the LGP model. Here we construct a Bayesian hyperposterior from independent log-normal priors $\sim\mathcal{N}(-2,2)$ over each parameter and a leave-one-out (LOO) log marginal likelihood (cf. Equation 8 in Reference [35]):

	$\displaystyle\log p_{\mathrm{LOO}}(\mathbf{Y}^{(k)}\|\bm{\zeta}_{k})=$	$\displaystyle-\frac{1}{2N^{\prime}\sigma_{k}^{2}}\sum_{i=1}^{N^{\prime}}\frac{z^{2}(i)}{c_{ii}}$		(4)
		$\displaystyle+\frac{\eta_{k}}{2N^{\prime}}\sum_{i=1}^{N}\log c_{ii}-\frac{\eta_{k}}{2}\log(2\pi),$		(5)

where $\mathbf{z}=\mathbf{C}^{-1}\mathbf{Y}^{(k)}$ , $\mathbf{C}=\mathbf{K}^{(k)}_{\mathbf{X},\mathbf{X}}+\sigma_{k}^{2}\mathbf{I}$ , and $c_{ii}$ denotes the $i$ -th diagonal element of $\mathbf{C}^{-1}$ . Here, $N^{\prime}$ is the number of training samples determined from an 80-20 train-test split of the full FFMD training set. Minimization of the negative log hyperposterior to find the maximum a posteriori is performed with stochastic gradient descent within the automatic differentiation pipeline in PyTorch [36]. Performance and benchmarks of the LGP surrogates applied in this study are provided in Supplementary Section S2.

In some applications, it is important to propagate uncertainty in the surrogate model predictions by marginalizing over the hyperparameter distributions of the hyperposterior. If required, a Laplace approximation of the hyperposterior can be constructed as a multivariate normal distribution $p(\bm{\zeta}|\mathbf{y})\sim\mathcal{N}(\bm{\zeta}_{\text{MAP}},\Sigma_{\text{hyper}})$ where $\Sigma_{\text{hyper}}$ is the inverse of the negative Hessian (which captures the curvature of the log-posterior around its mode). Sampling from this Laplace approximation provides an efficient means to marginalize over hyperparameter uncertainty when needed. For our purposes, however, such marginalization was found to be unnecessary. Results from a comparative test between a single LGP prediction and a 100 member committee average showed a negligible difference in the resulting force field parameter posteriors (Supplementary Figure S4).

Bayesian Inference of Force Field Parameters

Bayesian inference is a rigorous probabilistic approach to updating prior beliefs based on observed data. Its true power lies in the integration of physics-based expert priors to constrain model complexity, which naturally prevents overfitting while preserving principled uncertainty propagation through the parameter inference process.

According to Bayes’ theorem, the posterior distribution of the model parameters $\bm{\theta}$ and nuisance parameters $\bm{n}$ ) given observed data $\bm{y}$ can be expressed as:

p(\bm{\theta},\bm{n}|\bm{y})=\frac{p(\bm{y}|\bm{\theta},\bm{n})p(\bm{\theta},\bm{n})}{p(\bm{y})},

(6)

where $p(\bm{\theta},\bm{n})$ is the joint prior reflecting our knowledge of $\bm{\theta}$ and $\bm{n}$ before observing the data, $p(\bm{y}|\bm{\theta},\bm{n})$ is the likelihood of the data given the parameters, and the denominator $p(\bm{y})$ is the marginal likelihood (or model evidence), which serves as a normalization constant. We assume that the model parameters and nuisance parameters are independent in the prior so that $p(\bm{\theta},\bm{n})=p(\bm{\theta})p(\bm{n})$ .

Prior: Each force field parameter $\bm{\theta}$ was assigned a weakly informative normal prior centered at the midpoint of the parameter range specified in data acquisition, with standard deviation set to $1/5$ of the range width. This prior construction discourages sampling near boundaries where the surrogate model may be less reliable due to extrapolation error from edge effects. One nuisance parameter per QoI is introduced to represent the combined effect of LGP uncertainty and observational variance in the likelihood model. These are modeled as log-normally distributed (to avoid sampling negative variances) with $\log(\sigma)\sim\mathcal{N}(-2,2)$ . We tested sensitivity to prior specification by comparing the normal priors to uniform (top-hat) priors spanning the same domain; posterior means and predictive distributions were essentially unchanged (Supplementary Figure S2), indicating data-dominant posteriors for identified parameters.

Likelihood: We select a likelihood that assumes (i) QoIs are conditionally independent given the model parameters and (ii) residuals between predictions $\hat{\mathbf{y}}^{(k)}(\bm{\theta})$ and observations $\mathbf{y}^{(k)}$ are independent, homoskedastic, and Gaussian over the independent variables of each QoI:

p(\bm{y}|\bm{\theta},\bm{n})=\prod_{k=1}^{K}\frac{1}{\left(2\pi n_{k}^{2}\right)^{n^{\mathrm{obs}}_{k}/2}}\exp\left(-\frac{1}{2n_{k}^{2}}\left\|\hat{\mathbf{y}}_{k}(\bm{\theta})-\mathbf{y}_{k}\right\|^{2}\right)

(7)

where, for a given QoI $k$ , there are $n^{\mathrm{obs}}$ independent observations and a nuisance parameter $n_{k}$ . In our implementation, we set the number of observations as one for each of the measured RDFs, hydrogen bond counts, and restrained distance distributions. Assuming conditional independence in the likelihood greatly reduces the computational cost but neglects potential multivariate dependencies - for example, correlations between RDFs and hydrogen bond network metrics. Likewise, the homoskedastic noise assumption reduces the number of inferred parameters but cannot capture known limiting behaviors of certain QoIs, such as the vanishing variance in RDFs at short range. More rigorous likelihoods that incorporate learned posterior covariances from reference data (cf. Reference 34) could be employed to relax these assumptions.

Posterior: We estimated the posterior distribution over $\bm{\theta}$ and $\bm{n}$ via Markov chain Monte Carlo sampling. The number of walkers was set to $5\times$ the number of sampled dimensions (including nuisance parameters). Initial walkers are drawn to satisfy bounds of the individual parameters as well as the total molecular charge constraint. Any $\bm{\theta}$ violating parameter bounds or charge conservation constraint was assigned $p(\bm{\theta})=0$ and effectively rejected. Walker moves were performed with the StretchMove algorithm with a maximum of 100,000 iterations. Convergence was monitored using the integrated autocorrelation time $\tau$ , with a minimum chain length of $100\tau$ and relative tolerance of $0.01$ as implemented in the emcee package [37, 38]. The resulting chain was post processed by discarding the first $2\tau_{max}$ as a warm-up (or burn-in) and thinning the chain by an integer factor of $0.5\tau_{min}$ , where $\tau_{max}$ and $\tau_{min}$ are the maximum and minimum integrated autocorrelation time over all inferred parameters, respectively. The resulting posterior distribution delineates a clear and interpretable parameter landscape that enables estimation of optimal force field parameters and their associated uncertainties for a given system.

Validation: After the posterior distribution converged, ten charge parameter vectors $\bm{\theta}$ were sampled from a multivariate normal (Laplace) approximation of the posterior. These samples were used to perform new FFMD simulations, following the same protocol as for the original training data acquisition. The resulting QoI were then compared to reference AIMD data to evaluate the success of the learning procedure.

Agreement was quantified by the normalized mean absolute error (NMAE):

\mathrm{NMAE}_{k}=\frac{\|\bm{y}_{k}-\widehat{\bm{y}}_{k}\|_{1}}{\|\bm{y}_{k}\|_{1}+\|\widehat{\bm{y}}_{k}\|_{1}},

(8)

where $\|\cdot\|_{1}:=\sum_{i=1}^{\eta_{k}}|\cdot_{i}|$ denotes the L¹ norm. By construction, $\mathrm{NMAE}=0$ indicates perfect agreement and $\mathrm{NMAE}=1$ indicates complete disagreement. For clarity, we report NMAE as a percentage.

II.2 Ab initio molecular dynamics

Ab initio molecular dynamics (AIMD) simulations were conducted using the CP2K 9.1 [39] software package, using the QUICKSTEP module for density functional theory (DFT) calculations. The electronic ground-state energies and forces were described under the Born–Oppenheimer approximation using the revPBE [40, 41] exchange-correlation functional, incorporating the D3 dispersion correction [42], with exclusions applied to Ca²⁺-containing pairs [43]. Kohn-Sham orbitals were represented using Gaussian TZV2P basis sets [44] in conjunction with Godecker-Hutter-Tetter (GTH) pseudopotentials [45] for core electrons, and the electronic density was expanded in plane waves with a 400 Ry cutoff. The simulations were performed in the NVT ensemble at 300 K. An initial 5 ps equilibration phase was carried out using a Langevin thermostat with a friction coefficient of 0.02. Subsequently, the production runs employed a stochastic velocity rescaling thermostat (SVR) [46] with a time constant of 1 ps. The equations of motion were integrated using the velocity Verlet algorithm with a time step of 0.5 fs. An overview of the composition and simulation details for species is provided in Supplementary Table S3.

II.3 Force field molecular dynamics

All the FFMD simulations were carried out in Gromacs 2024.3 [47].

training set simulations: The molecular force fields were taken as from CHARMM-GUI [48]. Each species was initially solvated in a cubic box of roughly 1.6 nm containing 128 TIP4P/2005 water molecules [49], optionally with counterions added. The system was first energy-minimized using the steepest descent algorithm. Subsequently, a 10 ns NpT equilibration was performed (discarding the first 1 ns) to determine the average box size. Pressure was maintained at 1 bar using the C-rescale [50] barostat with a time constant of 1 ps and a compressibility of $4.5\cdot 10^{-5}$ bar^-1. Temperature was maintained at 300 K using SVR thermostat with a 1 ps time constant, followed by a 1 ns NVT equilibration using the same thermostat. Hydrogen bond constraints were applied via the LINCS [51] algorithm. Nonbonded interactions were computed using a cutoff scheme with a 0.7 nm cutoff and long-range van der Waals and Coulombic interactions were treated using particle mesh Ewald (PME) [52] method with a potential-shift modifier. Such equilibrated structures were then used as initial conditions for both the training set generation and the reference AIMD simulations.

density simulations: For bulk solution density calculations, cubic boxes (4.5nm per side) containing 2000 water molecules and the appropriate number of solutes were constructed to match target concentrations. Each system underwent steepest-descent minimization, 200ps of NpT equilibration, and a 1.2ns production run in the NpT ensemble at 300K and 1bar. The SVR thermostat (1ps) and C-rescale barostat (5 ps, $4.5\times 10^{-5}$ bar^-1) were used for temperature and pressure control, respectively. Nonbonded cutoffs were set to 1.2nm, with all other parameters identical to those in the training set simulations.

Experimental densities were obtained from Reference 53 for aqueous sodium acetate, ethanol, and ammonium chloride; from Reference 54 for potassium hydrogenphosphate, potassium dihydrogenphosphate, and potassium formate; from Reference 55 for guanidinium chloride; and from Reference 56 for N-acetamide. To compare simulated and experimental data at matching concentrations, experimental density data were fitted with a second-order polynomial.

Troponin simulations: Initial coordinates for N-cTnC were taken from the crystal structure (PDB ID: 1AP4 [57], neutralized with K⁺ ions, and solvated in a 5.25nm cubic box containing 150mM KCl using CHARMM-GUI [48]. Each system underwent energy minimization, followed by 5ns NpT equilibration.

For simulations involving charge sets developed here, water and ions compatible with the ECC were employed [23, 18]. Nonbonded interactions were computed the same way as in the training set simulations with a 1.2 nm cutoff. In the case of CHARMM36, CHARMM36-nbfix and ProsECCo75, long-range electrostatic interactions were computed using the PME method while vdW interactions were truncated and smoothened by a force-switch modifier.

All the simulations under the NpT ensemble were performed at 310 K maintained by the SVR thermostat with 1 ps time constant and pressure was controlled by C-rescale barostat at 1 bar using 5 ps time constant and isotropic compressibility of $4.5\times 10^{-5}$ bar^-1 bar^-1.

II.4 Alchemical binding free energy calculation

The binding free energy of a Ca²⁺ ion to the EF-hand loop of troponin was calculated using the alchemical double decoupling technique [58, 59], as illustrated in the thermodynamic cycle in Figure 5. The binding site was defined by the distances between Ca²⁺ and its coordinating atoms for each force field (see Supplementary Figure S7). The equilibrium geometry of the site was obtained from a 100ns NpT simulation. Harmonic flat-bottom restraints were applied between Ca²⁺ and its coordinating atoms to prevent ion dissociation during the decoupling process. Because the binding site remained stable throughout the simulation, these restraints were assumed to contribute no free energy in the fully coupled state (stage I), as they did not perturb the natural fluctuations of the bound complex. The corresponding equilibrium geometries and restraint parameters are listed in Supplementary Table S2. Two independent alchemical transformations were carried out. In the first, the ligand was gradually decoupled from its environment within the binding site of the protein (stage II); in the second, the ligand was decoupled from bulk solvent (stage V). Each transformation comprised 21 $\lambda$ -windows. During decoupling, electrostatic interactions were turned off first over 11 windows, followed by van der Waals interactions over 10 windows, with $\lambda$ incremented in steps of $0.1$ . Each window was simulated for 100 ns during stage I and 10 ns during stage V.

Free energy differences between adjacent windows were computed using the Bennett acceptance ratio (BAR) as implemented in Gromacs using gmx bar command while discarding the first 1 ns for equilibration. Both bulk and site decoupling calculations were corrected for the neutralizing background charge introduced by the PME when the net system charge changed during Coulombic decoupling (Equation 17 in Referece 60):

\Delta G^{\mathrm{pbc}}=\frac{|\zeta|(q_{\mathrm{out}}^{2}-q_{\mathrm{in}}^{2})}{2\varepsilon L},

(9)

where $\zeta=-2.837$ is a constant, $\varepsilon=80$ is the is the dielectric constant of the solvent, $L$ is the length of the simulation box, and $q_{\mathrm{in}}$ , $q_{\mathrm{out}}$ are the net charges of the initial and final states, respectively.

To prevent the ion to escape from the binding pocket during decoupling, harmonic flat-bottom distance restraints were applied between the ligand and coordinating atoms. It was assumed that the introduction of restraints contributes no free energy in the fully coupled state (stage I), as the binding site is stable and the restraint does not perturb natural ligand fluctuations. The free energy associated with releasing the restraints in the decoupled state (stage III) was evaluated analytically for a single flat-bottom potential and by thermodynamic integration for the rest. The restraint correction accounts for the reduction of configurational volume accessible to the ligand:

\Delta G_{\mathrm{restraint}}=-RT\ln\left(\frac{V}{V^{0}}\right),

(10)

where $V^{0}$ is the standard volume occupied by one molecule at concentration of $1~\mathrm{mol}/\mathrm{dm}^{3}$ and $V$ is the effective binding site volume defined by the restraint potential $U(r)$ :

V_{0}=\frac{1\mathrm{dm}^{3}}{N_{A}}=\frac{10^{24}\mathrm{nm}^{3}}{6.022\cdot 10^{23}}=1.661\mathrm{nm}^{3}

(11)

V=\int_{0}^{\infty}\mathrm{d}r\ 4\pi r^{2}e^{-\beta U}

(12)

For a single flat-bottom restraint the integral in Equation 12 is analytical; for additional restraints the free energy from Equation 10 was estimated using thermodynamic integration by gradually reducing the restraint force constant according to $k_{\lambda}=k_{0}+\lambda^{\alpha}(k_{1}-k_{0})$ , with $\alpha=4$ chosen to ensure smooth convergence near $k_{\lambda}=0$ . Each $\lambda$ -window was simulated for 22 ns, discarding 2 ns due to equilibration. All restraints were implemented and removed using the Colvars module [61, 62]. A second assumption is made that the free energy cost of partitioning the system into two independent simulation boxes is zero once the ligand is fully decoupled and no restraints remain (stage IV in Figure 5).

The overall standard binding free energy was then obtained as:

\Delta G_{\mathrm{bind}}=\Delta G_{\mathrm{bulk}}+\Delta G_{\mathrm{site}}+\Delta G_{\mathrm{restraint}}

(13)

Data Availability

Open source code will be uploaded pending revisions, and may be available from the authors upon reasonable request.

Acknowledgements

P.J. acknowledges support from an ERC Advanced Grant (grant agreement no. 101095957). V.K. acknowledges support from the Charles University where he is enrolled as a PhD. student and from the IMPRS-QDC Dresden. We thank Elise Duboué-Dijon and Jérôme Hénin for help with the alchemical decoupling setup.

References

Dror et al. [2012] R. O. Dror, R. M. Dirks, J. P. Grossman, H. Xu, and D. E. Shaw, Annual Review of Biophysics 41, 429 (2012).
Dauber-Osguthorpe and Hagler [2019] P. Dauber-Osguthorpe and A. T. Hagler, Journal of Computer-Aided Molecular Design 33, 133 (2019).
Riniker [2018] S. Riniker, Journal of Chemical Information and Modeling 58, 565 (2018).
Polêto and Lemkul [2022] M. D. Polêto and J. A. Lemkul, Communications Chemistry 5, 38 (2022).
van der Spoel [2021] D. van der Spoel, Current Opinion in Structural Biology Theory and Simulation/Computational Methods ⚫ Macromolecular Assemblies, 67, 18 (2021).
Nerenberg and Head-Gordon [2018] P. S. Nerenberg and T. Head-Gordon, Current Opinion in Structural Biology Theory and Simulation $\bullet$ Macromolecular Assemblies, 49, 129 (2018).
Chipot [2024] C. Chipot, The Journal of Physical Chemistry B 128, 12023 (2024).
Bayly et al. [1993] C. I. Bayly, P. Cieplak, W. Cornell, and P. A. Kollman, The Journal of Physical Chemistry 97, 10269 (1993).
Leontyev and Stuchebrukhov [2014] I. V. Leontyev and A. A. Stuchebrukhov, The Journal of Chemical Physics 141, 014103 (2014).
Jorge [2024] M. Jorge, The Journal of Chemical Physics 161, 180901 (2024).
Vanommeslaeghe et al. [2010] K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov, and A. D. Mackerell Jr., Journal of Computational Chemistry 31, 671 (2010).
Zhu et al. [2012] X. Zhu, P. E. Lopes, and A. D. MacKerell, Wiley interdisciplinary reviews. Computational molecular science 2, 167 (2012).
Duan et al. [2003] Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, G. Xiong, W. Zhang, R. Yang, P. Cieplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman, Journal of Computational Chemistry 24, 1999 (2003).
Cerutti et al. [2013] D. S. Cerutti, J. E. Rice, W. C. Swope, and D. A. Case, The Journal of Physical Chemistry B 117, 2328 (2013).
Duboué-Dijon et al. [2020] E. Duboué-Dijon, M. Javanainen, P. Delcroix, P. Jungwirth, and H. Martinez-Seara, The Journal of Chemical Physics 153, 050901 (2020).
Shanks et al. [2024a] B. L. Shanks, H. W. Sullivan, A. R. Shazed, and M. P. Hoepfner, Journal of Chemical Theory and Computation 20, 3798 (2024a).
Shanks et al. [2024b] B. L. Shanks, H. W. Sullivan, and M. P. Hoepfner, The Journal of Physical Chemistry Letters 15, 12608 (2024b).
Fan et al. [2025] S. Fan, P. E. Mason, V. C. Chamorro, B. L. Shanks, H. Martinez-Seara, and P. Jungwirth, Journal of Chemical Theory and Computation 21, 9023 (2025).
Leontyev and Stuchebrukhov [2010] I. V. Leontyev and A. A. Stuchebrukhov, Journal of Chemical Theory and Computation 6, 1498 (2010).
Leontyev and Stuchebrukhov [2011] I. Leontyev and A. Stuchebrukhov, Physical Chemistry Chemical Physics 13, 2613 (2011).
Kirby and Jungwirth [2019] B. J. Kirby and P. Jungwirth, The Journal of Physical Chemistry Letters 10, 7531 (2019).
Kostal et al. [2023a] V. Kostal, P. Jungwirth, and H. Martinez-Seara, The Journal of Physical Chemistry Letters 14, 8691 (2023a).
Cruces Chamorro et al. [2024] V. Cruces Chamorro, P. Jungwirth, and H. Martinez-Seara, The Journal of Physical Chemistry Letters 15, 2922 (2024).
Rayani et al. [2021] K. Rayani, J. Seffernick, A. Y. Li, J. P. Davis, A. M. Spuches, F. Van Petegem, R. J. Solaro, S. Lindert, and G. F. Tibbits, Journal of Biological Chemistry 296, 100350 (2021).
Huang et al. [2017] J. Huang, S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, H. Grubmüller, and A. D. MacKerell, Nature Methods 14, 71 (2017).
Nencini et al. [2024] R. Nencini, C. Tempra, D. Biriukov, M. Riopedre-Fernandez, V. Cruces Chamorro, J. Polák, P. E. Mason, D. Ondo, J. Heyda, O. H. S. Ollila, P. Jungwirth, M. Javanainen, and H. Martinez-Seara, Journal of Chemical Theory and Computation 20, 7546 (2024).
Amaro et al. [2025] R. E. Amaro, J. Åqvist, I. Bahar, F. Battistini, A. Bellaiche, D. Beltran, P. C. Biggin, M. Bonomi, G. R. Bowman, R. A. Bryce, G. Bussi, P. Carloni, D. A. Case, A. Cavalli, C.-E. A. Chang, T. E. Cheatham, M. S. Cheung, C. Chipot, L. T. Chong, P. Choudhary, G. A. Cisneros, C. Clementi, R. Collepardo-Guevara, P. Coveney, R. Covino, T. D. Crawford, M. Dal Peraro, B. L. de Groot, L. Delemotte, M. De Vivo, J. W. Essex, F. Fraternali, J. Gao, J. L. Gelpí, F. L. Gervasio, F. D. González-Nilo, H. Grubmüller, M. G. Guenza, H. V. Guzman, S. Harris, T. Head-Gordon, R. Hernandez, A. Hospital, N. Huang, X. Huang, G. Hummer, J. Iglesias-Fernández, J. H. Jensen, S. Jha, W. Jiao, W. L. Jorgensen, S. C. L. Kamerlin, S. Khalid, C. Laughton, M. Levitt, V. Limongelli, E. Lindahl, K. Lindorff-Larsen, S. Loverde, M. Lundborg, Y. L. Luo, F. J. Luque, C. I. Lynch, A. D. MacKerell, A. Magistrato, S. J. Marrink, H. Martin, J. A. McCammon, K. Merz, V. Moliner, A. J. Mulholland, S. Murad, A. N. Naganathan, S. Nangia, F. Noe, A. Noy, J. Oláh, M. L. O’Mara, M. J. Ondrechen, J. N. Onuchic, A. Onufriev, S. Osuna, G. Palermo, A. R. Panchenko, S. Pantano, C. Parish, M. Parrinello, A. Perez, T. Perez-Acle, J. R. Perilla, B. M. Pettitt, A. Pietropaolo, J.-P. Piquemal, A. B. Poma, M. Praprotnik, M. J. Ramos, P. Ren, N. Reuter, A. Roitberg, E. Rosta, C. Rovira, B. Roux, U. Rothlisberger, K. Y. Sanbonmatsu, T. Schlick, A. K. Shaytan, C. Simmerling, J. C. Smith, Y. Sugita, K. Świderek, M. Taiji, P. Tao, D. P. Tieleman, I. G. Tikhonova, J. Tirado-Rives, I. Tuñón, M. W. van der Kamp, D. van der Spoel, S. Velankar, G. A. Voth, R. Wade, A. Warshel, V. V. Welborn, S. D. Wetmore, T. J. Wheeler, C. F. Wong, L.-W. Yang, M. Zacharias, and M. Orozco, Nature Methods 22, 641 (2025).
Gelman et al. [1995] A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, Bayesian Data Analysis, 3rd ed. (Chapman and Hall/CRC, 1995).
van de Schoot et al. [2021] R. van de Schoot, S. Depaoli, R. King, B. Kramer, K. Märtens, M. G. Tadesse, M. Vannucci, A. Gelman, D. Veen, J. Willemsen, and C. Yau, Nature Reviews Methods Primers 1, 1 (2021).
Angelikopoulos et al. [2012] P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos, The Journal of Chemical Physics 137, 144103 (2012).
Dutta et al. [2018] R. Dutta, Z. F. Brotzakis, and A. Mira, The Journal of Chemical Physics 149, 154110 (2018).
Virtanen et al. [2020] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt, M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, İ. Polat, Y. Feng, E. W. Moore, J. VanderPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henriksen, E. A. Quintero, C. R. Harris, A. M. Archibald, A. H. Ribeiro, F. Pedregosa, and P. van Mulbregt, Nature Methods 17, 261 (2020).
Rasmussen and Williams [2008] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, 3rd ed., Adaptive Computation and Machine Learning (MIT Press, Cambridge, Mass., 2008).
Sullivan et al. [2025] H. W. Sullivan, B. L. Shanks, M. Cervenka, and M. P. Hoepfner, Physics-Informed Gaussian Process Inference of Liquid Structure from Scattering Data (2025), arXiv:2507.07948 [physics] .
Sundararajan and Keerthi [2001] S. Sundararajan and S. S. Keerthi, Neural Computation 13, 1103 (2001).
Paszke et al. [2019] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Köpf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, in Proceedings of the 33rd International Conference on Neural Information Processing Systems, 721 (Curran Associates Inc., Red Hook, NY, USA, 2019) pp. 8026–8037.
Foreman-Mackey et al. [2013] D. Foreman-Mackey, D. W. Hogg, D. Lang, and J. Goodman, Publications of the Astronomical Society of the Pacific 125, 306 (2013).
Goodman and Weare [2010] J. Goodman and J. Weare, Communications in Applied Mathematics and Computational Science 5, 65 (2010).
Kühne et al. [2020] T. D. Kühne, M. Iannuzzi, M. Del Ben, V. V. Rybkin, P. Seewald, F. Stein, T. Laino, R. Z. Khaliullin, O. Schütt, F. Schiffmann, D. Golze, J. Wilhelm, S. Chulkov, M. H. Bani-Hashemian, V. Weber, U. Borštnik, M. Taillefumier, A. S. Jakobovits, A. Lazzaro, H. Pabst, T. Müller, R. Schade, M. Guidon, S. Andermatt, N. Holmberg, G. K. Schenter, A. Hehn, A. Bussy, F. Belleflamme, G. Tabacchi, A. Glöß, M. Lass, I. Bethune, C. J. Mundy, C. Plessl, M. Watkins, J. VandeVondele, M. Krack, and J. Hutter, The Journal of Chemical Physics 152, 194103 (2020).
Perdew et al. [1996] J. P. Perdew, K. Burke, and M. Ernzerhof, Physical Review Letters 77, 3865 (1996).
Zhang and Yang [1998] Y. Zhang and W. Yang, Physical Review Letters 80, 890 (1998).
Grimme et al. [2010] S. Grimme, J. Antony, S. Ehrlich, and H. Krieg, The Journal of Chemical Physics 132, 154104 (2010).
Kostal et al. [2023b] V. Kostal, P. E. Mason, H. Martinez-Seara, and P. Jungwirth, The Journal of Physical Chemistry Letters 14, 4403 (2023b).
VandeVondele and Hutter [2007] J. VandeVondele and J. Hutter, The Journal of Chemical Physics 127, 114105 (2007).
Goedecker et al. [1996] S. Goedecker, M. Teter, and J. Hutter, Physical Review B 54, 1703 (1996).
Bussi et al. [2007] G. Bussi, D. Donadio, and M. Parrinello, The Journal of Chemical Physics 126, 014101 (2007).
Abraham et al. [2015] M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, and E. Lindahl, SoftwareX 1–2, 19 (2015).
Lee et al. [2016] J. Lee, X. Cheng, J. M. Swails, M. S. Yeom, P. K. Eastman, J. A. Lemkul, S. Wei, J. Buckner, J. C. Jeong, Y. Qi, S. Jo, V. S. Pande, D. A. Case, C. L. I. Brooks, A. D. J. MacKerell, J. B. Klauda, and W. Im, Journal of Chemical Theory and Computation 12, 405 (2016).
Abascal and Vega [2005] J. L. F. Abascal and C. Vega, The Journal of Chemical Physics 123, 234505 (2005).
Bernetti and Bussi [2020] M. Bernetti and G. Bussi, The Journal of Chemical Physics 153, 114107 (2020).
Hess et al. [1997] B. Hess, H. Bekker, H. J. C. Berendsen, and J. G. E. M. Fraaije, Journal of Computational Chemistry 18, 1463 (1997).
Essmann et al. [1995] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, and L. G. Pedersen, The Journal of Chemical Physics 103, 8577 (1995).
Haynes [2016] W. Haynes, in CRC Handbook of Chemistry and Physics (CRC Press, Boca Raton, 2016) 97th ed., pp. 5–118–133.
Söhnel and Novotný [1985] O. Söhnel and P. Novotný, Densities of Aqueous Solutions of Inorganic Substances (Elsevier, 1985).
Kumar [2001] A. Kumar, Journal of Solution Chemistry 30, 281 (2001).
Assarsson and Eirich [2002] P. Assarsson and F. R. Eirich, Properties of amides in aqueous solution. I. Viscosity and density changes of amide-water systems. An analysis of volume deficiencies of mixtures based on molecular size differences (mixing of hard spheres) (2002).
Spyracopoulos et al. [1997] L. Spyracopoulos, M. X. Li, S. K. Sia, S. M. Gagné, M. Chandra, R. J. Solaro, and B. D. Sykes, Biochemistry 36, 12138 (1997).
Salari et al. [2018] R. Salari, T. Joseph, R. Lohia, J. Hénin, and G. Brannigan, Journal of Chemical Theory and Computation 14, 6560 (2018).
Duboué-Dijon and Hénin [2021] E. Duboué-Dijon and J. Hénin, The Journal of Chemical Physics 154, 204101 (2021).
Simonson and Roux [2016] T. Simonson and B. Roux, Molecular Simulation 42, 1090 (2016).
Fiorin et al. [2013] G. Fiorin, M. L. Klein, and J. Hénin, Molecular Physics 111, 3345 (2013).
Fiorin et al. [2024] G. Fiorin, F. Marinelli, L. R. Forrest, H. Chen, C. Chipot, A. Kohlmeyer, H. Santuz, and J. Hénin, The Journal of Physical Chemistry B 128, 11108 (2024).