Multi-fidelity modeling enables the seamless fusion of information from a collection of heterogeneous sources of variable accuracy and cost (e.g., noisy experimental data, computer simulations, empirical models, etc.). By learning to exploit the cross-correlation structure between these sources, one can construct predictive surrogate models that dramatically reduce the compute time to solution. The impact of multi-fidelity modeling has already been recognized in our project on the shape optimization of super-cavitating hydrofoils.

Engineering Application: Design optimization of an ultrafast marine vehicle for special naval operations.



This problem poses numerous challenges, including the modeling of complex turbulent, multi-phase fluid flows, the solution of high-dimensional optimization problems, and the assessment of risk due to uncertainty in environmental and operational conditions. Here, the introduction of multi-fidelity modeling enables us to combine high-fidelity turbulent multi-phase flow simulations, experimental data, and simplified low-fidelity models (e.g., potential flow simulations), and to efficiently tackle a large-scale optimization task that currently seems daunting for any other approach.


Deep Multi-fidelity Gaussian Processes

A simple way to explain the main idea of this work is to consider the following structure:

\[\begin{bmatrix} f_1(h) \\ f_2(h) \end{bmatrix} \sim \mathcal{GP}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} , \begin{bmatrix} k_{1}(h,h') & \rho k_{1}(h,h') \\ \rho k_{1}(h,h') & \rho^2 k_{1}(h,h') + k_2(h,h') \end{bmatrix} \right),\]

where

\[x \longmapsto h := h(x) \longmapsto \begin{bmatrix} f_1(h(x)) \\ f_2(h(x)) \end{bmatrix}.\]

The high-fidelity system is modeled by \(f_2(h(x))\) and the low-fidelity one by \(f_1(h(x))\). We use \(\mathcal{GP}\) to denote a Gaussian process. This approach can employ any deterministic parametric data transformation \(h(x)\); here, however, we focus on multi-layer neural networks

\[h(x) := (h^L \circ \ldots \circ h^1) (x),\]

where each layer of the network performs the transformation

\[h^\ell(z) = \sigma^\ell(w^\ell z + b^\ell),\]

with \(\sigma^\ell\) being the transfer function, \(w^\ell\) the weights, and \(b^\ell\) the bias of the layer. We use \(\theta_h:= [w^1,b^1,\ldots,w^L,b^L]\) to denote the parameters of the neural network. Moreover, \(\theta_1\) and \(\theta_2\) denote the hyper-parameters of the covariance functions \(k_1\) and \(k_2\), respectively. The parameters of the model are therefore given by

\[\theta := [\rho,\theta_1,\theta_2,\theta_h].\]
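As a concrete illustration, the feature map \(h(x)\) and the covariance functions can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation; the tanh transfer function and squared-exponential kernels are assumptions made for illustration.

```python
import numpy as np

def neural_net(x, weights, biases):
    """Feature map h(x) = (h^L o ... o h^1)(x); tanh transfer functions assumed."""
    z = x
    for w, b in zip(weights, biases):
        z = np.tanh(z @ w + b)  # h^l(z) = sigma^l(w^l z + b^l)
    return z

def rbf(h, hp, var, length):
    """Squared-exponential covariance k(h, h') with hyper-parameters (var, length)."""
    sq = np.sum(h**2, 1)[:, None] + np.sum(hp**2, 1)[None, :] - 2.0 * h @ hp.T
    return var * np.exp(-0.5 * sq / length**2)
```

With \(k_1\) and \(k_2\) both taken as squared-exponential kernels, \(\theta_1\) and \(\theta_2\) each collect a variance and a length-scale, while \(\theta_h\) collects the weights and biases of the network.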

Prediction

The Deep Multi-fidelity Gaussian Process structure can be written equivalently in the compact form of a multivariate Gaussian process

\[\begin{bmatrix} f_1(h) \\ f_2(h) \end{bmatrix} \sim \mathcal{GP}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} , \begin{bmatrix} k_{11}(h,h') & k_{12}(h,h') \\ k_{21}(h,h') & k_{22}(h,h') \end{bmatrix} \right)\]

with \(k_{11} \equiv k_1, k_{12} \equiv k_{21} \equiv \rho k_1\), and \(k_{22} \equiv \rho^2 k_1 + k_2\). This can be used to obtain the predictive distribution

\[p\left(f_2(h(x_*))|x_*,\mathbf{x_1},\mathbf{f}_1,\mathbf{x_2},\mathbf{f}_2\right)\]

of the surrogate model for the high-fidelity system at a new test point \(x_*\). Note that the terms \(k_{12}(h(x),h(x'))\) and \(k_{21}(h(x),h(x'))\) model the correlation between the high-fidelity and the low-fidelity data and are therefore of paramount importance. The key role played by \(\rho\) is already well known in the literature. Along the same lines, one can observe the benefit of learning the transformation function \(h(x)\) jointly from the low-fidelity and high-fidelity data.

We obtain the following joint density:

\[\begin{bmatrix} f_2(h(x_*)) \\ \mathbf{f}_1 \\ \mathbf{f}_2 \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} k_{22}(h_*,h_*) & k_{21}(h_*,\mathbf{h}_1) & k_{22}(h_*,\mathbf{h}_2) \\ k_{12}(\mathbf{h}_1,h_*) & k_{11}(\mathbf{h}_1,\mathbf{h}_1) & k_{12}(\mathbf{h}_1,\mathbf{h}_2) \\ k_{22}(\mathbf{h}_2,h_*) & k_{21}(\mathbf{h}_2,\mathbf{h}_1) & k_{22}(\mathbf{h}_2,\mathbf{h}_2) \end{bmatrix} \right),\]

where \(h_* = h(x_*)\), \(\mathbf{h}_1 = h(\mathbf{x}_1)\), and \(\mathbf{h}_2 = h(\mathbf{x}_2)\). From this, we conclude that

\[p\left(f_2(h(x_*))|x_*,\mathbf{x_1},\mathbf{f}_1,\mathbf{x_2},\mathbf{f}_2\right) = \mathcal{N}\left(K_* K^{-1} \mathbf{f}, k_{22}(h_*,h_*) - K_* K^{-1} K_*^T\right),\]

where

\[\mathbf{f} := \begin{bmatrix} \mathbf{f}_1 \\ \mathbf{f}_2 \end{bmatrix},\] \[K_* := \begin{bmatrix} k_{21}(h_*,\mathbf{h}_1) & k_{22}(h_*,\mathbf{h}_2) \end{bmatrix},\] \[K := \begin{bmatrix} k_{11}(\mathbf{h}_1,\mathbf{h}_1) & k_{12}(\mathbf{h}_1,\mathbf{h}_2) \\ k_{21}(\mathbf{h}_2,\mathbf{h}_1) & k_{22}(\mathbf{h}_2,\mathbf{h}_2) \end{bmatrix}.\]
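The predictive equations above translate directly into code. The following is a hedged sketch of the posterior computation; the squared-exponential kernels and the small jitter term are assumptions for numerical illustration, not part of the original formulation:

```python
import numpy as np

def rbf(h, hp, var, length):
    """Squared-exponential covariance with hyper-parameters (var, length)."""
    sq = np.sum(h**2, 1)[:, None] + np.sum(hp**2, 1)[None, :] - 2.0 * h @ hp.T
    return var * np.exp(-0.5 * sq / length**2)

def predict(h_star, h1, h2, f1, f2, rho, theta1, theta2, jitter=1e-8):
    """Posterior mean and covariance of f2 at h_star given low- and high-fidelity data."""
    k1 = lambda a, b: rbf(a, b, *theta1)
    k2 = lambda a, b: rbf(a, b, *theta2)
    # Training covariance K over [f1; f2], following the block structure in the text
    K = np.block([
        [k1(h1, h1),       rho * k1(h1, h2)],
        [rho * k1(h2, h1), rho**2 * k1(h2, h2) + k2(h2, h2)],
    ]) + jitter * np.eye(len(h1) + len(h2))
    # Cross-covariance K_* = [k21(h_*, h1), k22(h_*, h2)]
    K_star = np.hstack([rho * k1(h_star, h1),
                        rho**2 * k1(h_star, h2) + k2(h_star, h2)])
    f = np.concatenate([f1, f2])
    mean = K_star @ np.linalg.solve(K, f)
    cov = (rho**2 * k1(h_star, h_star) + k2(h_star, h_star)
           - K_star @ np.linalg.solve(K, K_star.T))
    return mean, cov
```

Here `h_star`, `h1`, and `h2` are already the neural-network features \(h(x_*)\), \(h(\mathbf{x}_1)\), and \(h(\mathbf{x}_2)\).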

Training

The negative log marginal likelihood \(\mathcal{L}(\theta) := -\log p\left( \mathbf{f} | \mathbf{x}\right)\) is given by

\[\mathcal{L}(\theta) = \frac{1}{2} \mathbf{f}^T K^{-1}\mathbf{f} + \frac{1}{2} \log \left| K \right| + \frac{n_1 + n_2}{2}\log 2\pi,\]

where \(n_1\) and \(n_2\) denote the number of low- and high-fidelity training points, respectively, and

\[\mathbf{x} := \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}.\]

The negative log marginal likelihood, along with its gradient with respect to \(\theta\), can be minimized to estimate the model parameters \(\theta\).
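For completeness, the negative log marginal likelihood can be evaluated stably through a Cholesky factorization. This is a sketch; the jitter term is a standard numerical safeguard and an assumption, not part of the original formulation:

```python
import numpy as np

def nlml(K, f):
    """0.5 f^T K^{-1} f + 0.5 log|K| + (n/2) log 2pi, with n = n1 + n2."""
    n = len(f)
    L = np.linalg.cholesky(K + 1e-8 * np.eye(n))   # jitter for numerical stability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f))
    return (0.5 * f @ alpha
            + np.sum(np.log(np.diag(L)))           # = 0.5 log|K|
            + 0.5 * n * np.log(2.0 * np.pi))
```

In practice, \(K\) is rebuilt from \(\theta\) at each optimization step and \(\mathcal{L}(\theta)\) is minimized, e.g., with a quasi-Newton method.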


Deep Multi-fidelity Gaussian Processes predictive mean and two standard deviations.



Conclusions

We devised a surrogate model that is capable of capturing general discontinuous correlation structures between the low- and high-fidelity data generating processes. The model’s efficiency in handling discontinuities was demonstrated using benchmark problems. Essentially, the discontinuity is captured by the neural network. The abundance of low-fidelity data allows us to train the network accurately. We therefore need very few observations of the high-fidelity data generating process.


Acknowledgments

This work was supported by the DARPA project on Scalable Framework for Hierarchical Design and Planning under Uncertainty with Application to Marine Vehicles (N66001-15-2-4055).


Citation

@article{raissi2016deep,
  title={Deep Multi-fidelity Gaussian Processes},
  author={Raissi, Maziar and Karniadakis, George},
  journal={arXiv preprint arXiv:1604.07484},
  year={2016}
}