Coupling Growth, Flow, and Optimization in Complex Systems

Cells in a large bioreactor don’t sit still: they circulate, and where they go determines what they experience. A cell that passes through a high-shear zone and then drifts into an oxygen-depleted region accumulates an exposure history that shapes whether it grows, becomes stressed, or dies. Well-mixed models average over those histories and so erase the cell-to-cell variance that predicts which cells grow and which fail.

Figure: Simulated particle trajectories inside a stirred-tank bioreactor.

Predicting population-level growth from those histories requires fusing two qualitatively different inputs: hydrodynamic exposure histories and process-state variables such as inoculum density, dissolved oxygen, and culture pH. Experiments are expensive, most operating regimes are sparsely observed, and the joint input space over both fields is large. A model that produces a confident prediction in a well-sampled regime and an equally confident one in a regime it has never seen is not useful. It is wrong in ways that are invisible until something fails.

We extend the cooperative training framework of Yi & Bessa, which disentangles aleatoric and epistemic uncertainty in single-field regression, to the multi-field setting. The result is a cooperative Bayesian fusion architecture with field-specific encoders for mechanics and biology and a learned fusion map. It is trained so that epistemic uncertainty rises only where joint coverage is sparse, and so that disagreement between fields registers as a distinguishable signal rather than being folded into an undifferentiated variance term. Concretely, the epistemic term that carries this conflict signal is approximated by the posterior variance of the fused predictive mean:

\[u_\text{epi}(x_\text{mech}, x_\text{bio}) \approx \operatorname{Var}_{p(\eta \mid \mathcal{D})}\!\bigl[\mu_\eta(x_\text{mech}, x_\text{bio})\bigr]\]

That conflict signal is the key deliverable the single-field baseline does not provide: when the hydrodynamic and biological encoders give locally divergent signals, the model flags the divergence rather than masking it.
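The posterior variance in the expression for u_epi can be estimated by Monte Carlo: draw several parameter samples from the posterior, evaluate the fused predictive mean under each, and take the variance across samples. The sketch below illustrates that estimate under stated assumptions and is not the authors' implementation. The function name `decompose_uncertainty`, the array layout, and the toy numbers are hypothetical. The aleatoric term is computed as the sample-averaged predicted noise variance, which assumes Gaussian predictive heads.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split predictive uncertainty into epistemic and aleatoric parts.

    means, variances: arrays of shape (S, N), holding the predictive mean
    and predicted noise variance for S posterior samples at N query points.
    (Hypothetical layout, chosen for illustration.)
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Epistemic: variance of the predictive mean across posterior samples.
    u_epi = means.var(axis=0)
    # Aleatoric: predicted noise variance averaged over posterior samples.
    u_ale = variances.mean(axis=0)
    return u_epi, u_ale

# Toy input: 4 posterior samples, 2 query points.
# Samples agree at the first point and disagree at the second.
means = np.array([[1.0,  0.0],
                  [1.0,  2.0],
                  [1.0, -2.0],
                  [1.0,  0.0]])
variances = np.full((4, 2), 0.25)

u_epi, u_ale = decompose_uncertainty(means, variances)
print(u_epi)  # epistemic term per query point
print(u_ale)  # aleatoric term per query point
```

In this toy input the epistemic term is zero where the samples agree and positive where they diverge, while the aleatoric term is identical at both points.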

The first test is a regression problem: given fixed-window summaries of a cell population’s hydrodynamic exposure history and process-state variables, predict the biomass growth-rate deviation relative to a well-supported operating regime. The task is small enough to validate the uncertainty diagnostics carefully. Does the epistemic term rise where joint coverage is sparse? Does it register source conflict rather than hide it? The architecture must pass both diagnostics while retaining the two-field structure, which deterministic fusion approaches preserve only at the cost of overconfident point predictions in sparse regions. The architecture targets the two-field case. Whether cooperative Bayesian fusion remains well calibrated as source fields multiply, and whether Gaussian predictive heads hold or mixture and flow-based alternatives become necessary, remain open questions.
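One way to make the sparse-coverage diagnostic concrete is to compare the epistemic term with a coverage proxy and check that the two rise together. The sketch below is a minimal illustration, not the evaluation protocol of this work. It uses the distance to the k-th nearest training point in the joint input space as the proxy and a Spearman-style rank correlation as the score. The helper names, feature dimensions, and the placeholder `u_epi` (a monotone function of sparsity standing in for real model output) are all assumptions.

```python
import numpy as np

def knn_distance(train_x, query_x, k=1):
    """Distance from each query point to its k-th nearest training point."""
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))  # shape (Q, T)
    return np.sort(dists, axis=1)[:, k - 1]

def rank_correlation(a, b):
    """Spearman rank correlation; assumes no tied values."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(0)
# Hypothetical joint features (mechanics and biology summaries concatenated).
train_x = rng.normal(size=(200, 3))
# Query points drawn more widely, so some fall in sparsely covered regions.
query_x = rng.normal(scale=3.0, size=(50, 3))

sparsity = knn_distance(train_x, query_x, k=5)

# Placeholder standing in for the model's epistemic output at query_x.
# In practice this would come from the trained fusion model.
u_epi = 0.1 * sparsity ** 2

rho = rank_correlation(sparsity, u_epi)
print(f"rank correlation between sparsity and u_epi: {rho:.3f}")
```

A rank correlation near one would indicate that epistemic uncertainty tracks coverage sparsity. A value near zero would suggest the epistemic term is not responding to where data are missing.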

References