A Theoretical Laboratory for Mesoscale Biophysics
Biology is shaped by processes spanning the quantum to the macroscopic scale, which poses a fundamental challenge for studying biophysical dynamics. We study the time-evolution of biological systems at mesoscales, where the molecular meets the macroscopic.
We use techniques from statistical inference and machine learning to sample, coarse-grain, and interpret complex chemical and biophysical models. We also characterize and analyze numerical methods to guarantee accuracy and robustness.
We study the self-organization of biophysical and chemical species as dictated by intrinsic material properties. Additionally, we are interested in analyzing self-assembly in nonequilibrium conditions as well as exploiting external control to dictate the outcomes of an assembly process.
Notes on research and other topics of interest
Connections to Machine Learning
In my previous post, I introduced the notion of proximal gradient descent and explained the way in which the “geometry” or the metric used in the proximal scheme allows us to define gradient flows on arbitrary metric spaces. This concept is important in the context of statistical mechanics because analysis of the Fokker-Planck equation naturally yields a gradient flow in the Wasserstein metric.
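As a brief sketch of that last statement, the standard formulation can be written as follows. The symbols here are assumptions introduced for illustration and do not appear in the text above: a potential $V$, an inverse temperature $\beta$, a time step $\tau$, and the 2-Wasserstein distance $W_2$. For the Fokker-Planck equation

$$ \partial_t \rho = \nabla \cdot \left( \rho \nabla V \right) + \beta^{-1} \Delta \rho $$

the relevant functional is the free energy

$$ F[\rho] = \int V(x) \rho(x) \, dx + \beta^{-1} \int \rho(x) \log \rho(x) \, dx $$

and the proximal scheme takes the form

$$ \rho^{k+1} = \operatorname*{argmin}_{\rho} \left\{ F[\rho] + \frac{1}{2\tau} W_2^2\left( \rho, \rho^{k} \right) \right\} $$

In the limit $\tau \to 0$, the interpolated iterates $\rho^{k}$ converge to the solution of the Fokker-Planck equation, which is the sense in which that equation is a gradient flow of $F$ in the Wasserstein metric.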
Wasserstein Gradient Flows and the Fokker Planck Equation (Part I)
The connection between partial differential equations arising in chemical physics, like the Fokker-Planck equation discussed below, and the notions of distance in the space of probability measures is a relatively young set of mathematical ideas. While the theory of gradient flows on arbitrary metric spaces can get exceedingly intricate, the fundamental ideas are not unapproachable.
The goal of emulating biology has stimulated many investigations into properties that lead to robust assembly. Many of the works that focus on designing self-assembly have pursued a strategy of tuning the interactions among the various components to stabilize a given target structure. In this talk, I will discuss some work that approaches this problem from a distinct viewpoint, namely, through the lens of nonequilibrium control processes in which external perturbations are tuned to drive the assembly dynamics to states unreachable in equilibrium. I will discuss computational strategies for carrying out an optimization to control the steady state of interacting particle systems using only external fields. In addition, I will introduce a framework to quantify the dissipative costs of maintaining a nonequilibrium steady state that uses observable properties alone.
In probability theory, the notion of weak convergence is often used to describe when two probability distributions are equivalent. This notion requires that well-behaved functions have the same average value under the two probability distributions being compared. In coarse-grained modeling, Noid and Voth developed a thermodynamic equivalence principle that has a similar requirement. Nevertheless, there are many functions of the fine-grained system that we simply cannot evaluate on the coarse-grained degrees of freedom. In this talk, I will describe an approach that combines accelerated sampling of a coarse-grained model with invertible neural networks to invert a coarse-graining map in a statistically precise fashion. I will show that for non-trivial biomolecular systems, we can recover the fine-grained free energy surface from coarse-grained sampling.
In many applications in computational physics and chemistry, we seek to estimate expectation values of observables that yield mechanistic insight about reactions, transitions, and other “rare” events. These problems are often plagued by metastability; slow relaxation between metastable basins leads to slow convergence of estimators of such expectations. In this talk, I will focus on efforts to exploit developments in generative modeling to sample distributions that are challenging to sample with local dynamics (e.g., MCMC or molecular dynamics) due to metastability. I will discuss the problem of sampling when there is not a large, pre-existing data set on which to train. By simultaneously sampling with traditional methods and learning a sampler, we assess the prospects of neural-network-driven sampling to accelerate convergence and to aid exploration of high-dimensional distributions. This is joint work with Marylou Gabrié and Eric Vanden-Eijnden.
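To make the metastability problem concrete, here is a minimal toy sketch. It is not the method presented in the talk. It assumes a one-dimensional double-well potential, an inverse temperature `BETA`, and a fixed two-Gaussian mixture that merely stands in for a learned generative sampler; no training happens here. All names and parameter values are hypothetical. The chain mixes local random-walk Metropolis moves with nonlocal independence proposals and counts how often it hops between wells.

```python
import math
import random

BETA = 15.0      # assumed inverse temperature (barrier height is BETA * 1)
SIGMA_Q = 0.15   # assumed width of the stand-in proposal components


def log_target(x):
    """Unnormalized log density of the double well U(x) = (x^2 - 1)^2."""
    return -BETA * (x * x - 1.0) ** 2


def sample_proposal(rng):
    """Draw from an equal-weight Gaussian mixture centered on the two wells."""
    center = 1.0 if rng.random() < 0.5 else -1.0
    return rng.gauss(center, SIGMA_Q)


def log_proposal(x):
    """Log density of the mixture, computed with a log-sum-exp for stability."""
    a = -0.5 * ((x - 1.0) / SIGMA_Q) ** 2
    b = -0.5 * ((x + 1.0) / SIGMA_Q) ** 2
    m = max(a, b)
    log_mix = m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))
    return log_mix - math.log(SIGMA_Q * math.sqrt(2.0 * math.pi))


def run_chain(n_steps, p_nonlocal, step=0.2, seed=0):
    """Run Metropolis-Hastings from the left well; return the number of well crossings."""
    rng = random.Random(seed)
    x = -1.0
    crossings = 0
    for _ in range(n_steps):
        if rng.random() < p_nonlocal:
            # Nonlocal move: independence proposal, so the proposal density
            # enters the acceptance ratio.
            y = sample_proposal(rng)
            log_ratio = (log_target(y) - log_target(x)) + (log_proposal(x) - log_proposal(y))
        else:
            # Local move: symmetric random walk, so the proposal density cancels.
            y = x + rng.uniform(-step, step)
            log_ratio = log_target(y) - log_target(x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            if (x < 0.0) != (y < 0.0):
                crossings += 1
            x = y
    return crossings


if __name__ == "__main__":
    print("local only:", run_chain(5000, 0.0))
    print("local + nonlocal:", run_chain(5000, 0.1))
```

Both kinds of move leave the target distribution invariant, so mixing them at random still samples the same distribution; the nonlocal moves only change how quickly the chain relaxes between basins. In a learned-sampler setting, the fixed mixture would be replaced by a trained generative model with a tractable density.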