Multilevel uncertainty quantification with Bayesian inversion

Hello all,

I am trying to solve the Bayesian multilevel model updating approach, which was presented by Nagel and Sudret in [1]. I am applying the example of the paper (Section 8) using the marginalised formulation proposed. The example consists in a simply supported beam, in which variability is in the elastic moduli Ei across the beam (i=1,…,n – where n is the number of experiments). The deflection of the beam is calculated as followed:

Displacement = yi = @(x) F.*x/2./(48*E*I).*(3*L^2-4*(x).^2)  

where x is the position along the main axis of the beam (I am considering 3 positions along the beam: 0.25 m, 0.5 m, and 0.75 m). The data of the beam are:

L = 1; h = 0.1; b = 0.1; I = 1/12*b*h^3; F = 30e3; 

The Young’s moduli (E_i) are independently sampled from a lognormal distribution with:
\mu_E = 15e9 Pa and \sigma_E = 3e9 Pa.

The experimental data are perturbed with a noise term, which is sampled from a Gaussian distribution with \mu=0 and \sigma = 0.1 mm. The quantity of interest is the hyperparameter \theta_E = (\mu_E,\sigma_E) ((E_i are defined as nuisance variables in the paper).

The first step in applying the multilevel approach is the definition of a suitable prior distribution \pi(\theta_E), in this case, the marginals are independent, so the Bayesian prior is \pi(\theta_E) = \pi(\mu_E)\pi(\sigma_E), these marginals are given in uniform distributions:

\pi(\mu_E)=U[0,100]GPa and \pi(\sigma_E)=U[0,30]GPa

The distributions f(E_i|\theta_E)=LN(E_i|\mu_E,\sigma_E) and \pi(\theta_E) represent the available structural and parametric prior knowledge, respectively. The posterior has the form: \pi(\theta_E|y_i) where y_i are the experimental data (displacement measurements). It can be directly sampled from the marginals of the posterior \pi(<E_i>,\theta_E|<y_i>). The next step is the definition of the posterior density (like for any Bayesian updating), which is marginalised; moreover, thanks to the marginal prior distribution, the unscaled version of the marginal posterior may be written as:

\pi(\theta_E|y_i) = L(\theta_E;<y_i>)\pi(\theta_E)

The next step is to calculate the marginalised likelihood to sample the marginal posterior. Since the marginalised likelihood is a product of integrals, it is not possible to perform the marginalisation analytically, so the stochastic integration via Monte Carlo method is applied.

I am trying to solve the example with the marginalised formulation instead of the joint one because I would like to apply the marginalised formulation of the problem in the application I am working on. However, the standard deviation of the hyperparameter \theta_E is not correctly identified (the solution jumps close to zero). I am not sure why this happens. Instead, the mean is perfectly obtained. Following the example of the paper, I applied Markov Chain Monte Carlo sampling (Metropolis-Hastings algorithm), and the marginalised likelihood is solved with Monte Carlo integration. These are the option for the solution:

Solver.Type = 'MCMC';
Solver.MCMC.Sampler = 'MH';
Solver.MCMC.Steps = 5e3;
Solver.MCMC.NChains = 5;
Solver.MCMC.Proposal.PriorScale = 1e-3;

Here is the solution I obtained,

%------------------- Posterior Marginals
| Parameter | Mean    | Std     | (0.025-0.97) Quant. | Type  |
| meanE     | 1.5e+10 | 6.2e+08 | (1.4e+10 - 1.6e+10) | Model |
| stdE      | 5e+08   | 3.7e+08 | (2.6e+07 - 1.4e+09) | Model |

%------------------- Point estimate
| Parameter | Mean    | Parameter Type |
| meanE     | 1.5e+10 | Model          |
| stdE      | 5e+08   | Model          |

%------------------- Correlation matrix (model parameters)
|       |  meanE    stdE  |
| meanE |  1        0.13  |
| stdE  |  0.13     1     |

%------------------- Convergence
   Gelman-Rubin MPSRF:                            1.005291e+00

As I said above, the mean of the hyperparameters is perfectly detected, instead, the standard deviation is much lower than expected. Do you know what I am doing wrong?

I am sorry for this very long post; I hope everything is well explained (I attached the codes (the loglikelihood and the main code) – there is also the solution I obtained since the simulation took some time to run).

Let me know if you need more explanation,

Many thanks in advance,


[1] Nagel, Joseph B., and Bruno Sudret. “A unified framework for multilevel uncertainty quantification in Bayesian inverse problems.” Probabilistic Engineering Mechanics 43 (2016): 68-84. Redirecting (775.4 KB)

Hi @Irma,

Thanks for posting here. I had a look at your code and found a few issues. The problem mainly comes from the log-likelihood function, defined as the negative mean-squared error between the prediction and the data. This is incorrect. To derive it, let us denote a measurement point by \boldsymbol{y}=(y_1,y_2,y_3)^{\mathsf{T}} and the computational model by \mathcal{M}. Using the law of total probability, the probability density function of the displacements given the hyperparameters \boldsymbol{\theta} = (\mu_E,\sigma_E)^{\mathsf{T}} is provided by

\begin{aligned} f(\boldsymbol{y} \mid \boldsymbol{\theta}) &= \int f(\boldsymbol{y} \mid E)f(E\mid \boldsymbol{\theta}) \,\mathrm{d}E \\ &=\int \frac{1}{(2\pi)^{3/2}\det(\Sigma)^{1/2}}\exp\left(-\frac{1}{2}\left(\boldsymbol{y} - \mathcal{M}(E)\right)^{\mathsf{T}} \Sigma^{-1}\left(\boldsymbol{y} - \mathcal{M}(E)\right)\right)f(E\mid \boldsymbol{\theta}) \,\mathrm{d}E. \end{aligned}

where \Sigma is the covariance matrix of the measurement. The integral can be evaluated by Monte Carlo simulation, that is,

f(\boldsymbol{y} \mid \boldsymbol{\theta}) =\frac{1}{(2\pi)^{3/2}\det(\Sigma)^{1/2}}\cdot\frac{1}{n}\sum_{i}^{n}\exp\left(-\frac{1}{2}\left(\boldsymbol{y} - \mathcal{M}(E_i)\right)^{\mathsf{T}} \Sigma^{-1}\left(\boldsymbol{y} - \mathcal{M}(E_i)\right)\right),

where E_1,\ldots,E_n are samples from the distribution of f(E\mid \boldsymbol{\theta}). As a result, the log-likelihood can be approximated by

\log\left(f(\boldsymbol{y} \mid \boldsymbol{\theta}) \right) = \log\left(\frac{1}{n}\sum_{i}^{n}\exp\left(-\frac{1}{2}\left(\boldsymbol{y} - \mathcal{M}(E_i)\right)^{\mathsf{T}} \Sigma^{-1}\left(\boldsymbol{y} - \mathcal{M}(E_i)\right)\right)\right) - \frac{3}{2}\log(2\pi) - \frac{1}{2}\log\left(\det(\Sigma)\right).

Note that we can ignore the last two constant terms in practice, as they do not affect the Bayesian framework when MCMC is used.
In your implementation, you exchange the sum and the log operators, which leads to the wrong expression (of mean-squared type) for the log-likelihood.

Moreover, the variance of the additive noise is 10^{-8} for all the components of \boldsymbol{y} in the data generation process, whereas you set them to be 10^{-7}\times(4.17,8.72,2.86)^{\mathsf{T}}. The noise term should be correctly defined so that the MCMC samples can reveal the probabilistic properties of the posterior distribution.

1 Like

Thank you very much for your help :slight_smile: Now my code is properly working.
I have another question about the noise term (discrepancy model). If I am using real experimental data, how may I define the noise term of the dataset? In some papers, the variance of the dataset is used, is it okay? Is there a way to calculate the discrepancy model?

Hi @Irma,

I think the variance of the dataset does not reveal the variance of the noise term. If this quantity is unknown, it might be more appropriate to introduce a prior and use the Bayesian framework to estimate jointly with the other parameters.

1 Like