Coarse-grained molecular dynamics enables access to long length and time scales but often fails to reproduce atomistic kinetics when memory effects and slow collective motions are important. We introduce Probabilistic Forecasting for Coarse-Graining (PFCG), a machine-learning framework that learns stochastic coarse-grained equations of motion directly from atomistic trajectories by formulating coarse-grained simulation as a probabilistic time-series forecasting problem with both Markovian and non-Markovian contributions. PFCG incorporates non-Markovian effects through a finite trajectory history, without requiring explicit memory kernels or learned effective potentials. We apply PFCG to miniproteins and polyalanine peptides and evaluate both configurational and dynamical fidelity using free energy surfaces, autocorrelation functions, and transition time scales from Markov state models. Across all systems, non-Markovian PFCG models significantly improve dynamical agreement with atomistic simulation relative to Markovian baselines while maintaining excellent agreement with stationary distributions. Additional tests of transferability show that PFCG remains robust under sparse sampling with few transition events but, in its current formulation, does not extrapolate to metastable states absent from the training set. These results highlight the importance of inductive biases at the level of equations of motion and establish PFCG as a complementary approach to existing machine-learning-based coarse-graining methods for modeling biomolecular processes.
Probabilistic forecasting for coarse-grained molecular dynamics
Abstract
