Need some clarity on samples drawn in Bayesian inversion problem

Shihab_Khan · October 7, 2021, 6:11pm

Hi everyone!

I’m posting a crude structure of the Bayesian Analysis object after post-processing below.

[Image: 20211007_175505000_iOS, screenshot of the Bayesian analysis object structure]

I have three questions:

1. What’s the difference between myBayesianAnalysis.Results.sample, myBayesianAnalysis.Results.PostProc.PostSample, myBayesianAnalysis.Results.PostProc.PostPredSample.post, and myBayesianAnalysis.Results.PostProc.PostPredSample.postpred?
2. I want to use the posterior analysis samples for a reliability analysis. Which parameter can I use to increase the number of samples drawn from the posterior distribution?
3. myBayesianAnalysis.Results.PostProc.PostPredSample doesn’t seem to have samples drawn from the calibrated discrepancy parameter. Any way to generate them as well?

Many thanks in advance!

olaf.klein · October 8, 2021, 9:58am

Dear @Shihab_Khan,

I asked myself similar questions last year. I read the UQLab user manual ‘Bayesian Inference for Model Calibration and Inverse Problems’ and dug into the UQLab code to get some answers. Some related information can be found in the table on page 38, in Sec. 2.4.1.6 at the end of Sec. 2.4 of the user manual.

My answers may not be completely correct if you are using several data groups.

1. a) The content of myBayesianAnalysis.Results.PostProc.PostSample is extracted from myBayesianAnalysis.Results.sample by ‘removing’ the burn-in phase (see Sec. 2.4.1.6 at the end of Sec. 2.4 of the user manual) and the ‘bad chains’ (see Sec. 3.3 of the user manual) in the last implicit or explicit call of uq_postProcessInversion. (The relationship between myBayesianAnalysis.Results.PostProc.PostModel.evaluation and myBayesianAnalysis.Results.ForwardModel.evaluation is the same.)
   b) The contents of myBayesianAnalysis.Results.PostProc.PostPredSample.post are randomly chosen from myBayesianAnalysis.Results.ForwardModel.evaluation during the evaluation of uq_postProcessInversion.
   c) During the extraction in b), the corresponding discrepancy value is either the given discrepancy value or derived from the corresponding sample in myBayesianAnalysis.Results.PostProc.PostSample. The content of myBayesianAnalysis.Results.PostProc.PostPredSample.post is used as the mean value and the standard deviation is computed from the discrepancy value, which defines a random variable. A sample of this random variable is drawn and stored in myBayesianAnalysis.Results.PostProc.PostPredSample.postpred.
2. You can increase the value of posteriorPredictive when calling uq_postProcessInversion (see Sec. 3.3 of the user manual) to extract more data from an already performed inverse computation. But if you need more data values/data sets than are available in myBayesianAnalysis.Results.ForwardModel.evaluation, you have to re-run the inverse computation before extracting the data, i.e., you have to call uq_createAnalysis(BayesOpts) again after increasing/defining the value of BayesOpts.Solver.MCMC.Steps and/or BayesOpts.Solver.MCMC.NChains, depending on your preferred sampler.
3. In view of 1c), I have to point out that my method to get samples of the calibrated discrepancy, combined with a mean of 0, is to compute the difference of myBayesianAnalysis.Results.PostProc.PostPredSample.postpred and myBayesianAnalysis.Results.PostProc.PostPredSample.post.
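
The steps described above can be sketched in a few lines of plain Python. This is not UQLab code and does not reproduce its implementation: the function names, the toy forward model, the fake chains, and the choice to draw indices with replacement are all assumptions made for illustration. It only shows the logic of burn-in removal, random selection of model evaluations, adding Gaussian noise, and recovering a zero-mean discrepancy sample by taking the difference.

```python
import random


def remove_burn_in(chains, burn_in_fraction, bad_chains=()):
    """Drop the first burn_in_fraction of steps from every chain and skip
    the chains whose index is listed in bad_chains."""
    return [
        chain[int(len(chain) * burn_in_fraction):]
        for k, chain in enumerate(chains)
        if k not in bad_chains
    ]


def draw_posterior_predictive(params, evaluations, n_draws, rng):
    """Pick n_draws random (parameter, evaluation) pairs and add noise.

    Each entry of params is (x1, x2, sigma2), with sigma2 the discrepancy
    variance. The two returned lists play the roles of .post and .postpred.
    """
    post, postpred = [], []
    for _ in range(n_draws):
        i = rng.randrange(len(params))  # assumption: drawn with replacement
        sigma = params[i][2] ** 0.5
        post.append(evaluations[i])
        postpred.append(rng.gauss(evaluations[i], sigma))
    return post, postpred


def toy_model(x1, x2):
    # Hypothetical forward model, for illustration only.
    return x1 + x2


rng = random.Random(0)

# Fake MCMC output: 4 chains with 200 steps each, one step = (x1, x2, sigma2).
chains = [
    [
        (rng.gauss(1.0, 0.1), rng.gauss(2.0, 0.1), rng.lognormvariate(-2.0, 0.3))
        for _ in range(200)
    ]
    for _ in range(4)
]

kept = remove_burn_in(chains, 0.5)
post_sample = [step for chain in kept for step in chain]       # role of PostSample
post_model = [toy_model(x1, x2) for x1, x2, _ in post_sample]  # role of PostModel

post, postpred = draw_posterior_predictive(post_sample, post_model, 1000, rng)

# Workaround from item 3: zero-mean discrepancy sample as a difference.
discrepancy = [yp - y for yp, y in zip(postpred, post)]
```

In this sketch, asking for more draws than there are stored evaluations only repeats existing evaluations, which mirrors the point in item 2: genuinely new values require a longer MCMC run.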

Hope that this helps.

Olaf

P.S.: Dear @paulremo, if my description above contains errors, I would be grateful if you could point them out. Thanks in advance.

paulremo · October 8, 2021, 1:56pm

Hi @Shihab_Khan

I can confirm that what @olaf.klein wrote is entirely correct. As he pointed out, there is only a slight difference when using $N_{\mathrm{gr}}$ data groups: in this case, the structures .PriorPredSample and .PostPredSample are structure arrays with $N_{\mathrm{gr}}$ structures corresponding to the predictive distributions.

While reviewing this, I realized that the naming of .PriorPredSample and .PostPredSample fields (i.e., Prior and PriorPred, Post and PostPred) is not ideal. In the next version of the module, these names will therefore be updated. For .PriorPredSample I suggest the field names .Sample and .ModelEvaluations to replace .PriorPred and .Prior, respectively.

The workaround proposed by @olaf.klein to get a discrepancy sample indeed works for the standard additive Gaussian discrepancy scenario. To avoid the need for workarounds in the future, I will include a new field .Discrepancy inside .PostPredSample and .PriorPredSample that contains the discrepancy and will be updated in future versions to also correctly treat non-additive discrepancy cases. The manual will also be updated for the next version to include a proper table discussing the fields of the .PostProc structure in the reference list section.
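
As a brief illustration of why the difference works in the additive Gaussian case (the notation below is assumed for this note and is not taken from the manual): if each posterior predictive draw is built from a model evaluation plus zero-mean Gaussian noise, then subtracting the model evaluation leaves exactly the noise term.

$$
y_i^{\mathrm{postpred}} = \mathcal{M}(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)
\quad\Longrightarrow\quad
\varepsilon_i = y_i^{\mathrm{postpred}} - \mathcal{M}(x_i)
$$

Here $\mathcal{M}(x_i)$ corresponds to an entry of .Post, $y_i^{\mathrm{postpred}}$ to the matching entry of .PostPred, and $\sigma_i^2$ to the discrepancy variance associated with the same posterior sample point. For a non-additive discrepancy this subtraction no longer isolates the discrepancy term.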

I hope this helps and let me know if you have any further questions.

Shihab_Khan · October 10, 2021, 1:26pm

Thank you very much. This really helps a lot.

Just one more thing.

I have a problem in which the model is $\mathcal{M} = X_1 + 10 X_2$, with a simple additive Gaussian discrepancy that has a lognormal prior. So basically, two variables and one discrepancy parameter are calibrated with $N_{\mathrm{gr}} = 1$.

Why do myBayesianAnalysis.Results.PostProc.PostPredSample.Post and myBayesianAnalysis.Results.PostProc.PostPredSample.PostPred contain only samples for 1 variable instead of two or three (if it included the discrepancy)?

Are these the samples for $X_1$ or $\mathcal{M}$? If it is the latter, then what’s the primary difference between myBayesianAnalysis.Results.PostProc.PostPredSample and myBayesianAnalysis.Results.PostProc.PostModel?

Thanks a lot.

Shihab_Khan · October 10, 2021, 1:29pm

Thanks Paul. I can’t ask for anything more. Kudos to the UQLab team!

paulremo · October 11, 2021, 7:25am

Hi @Shihab_Khan

The samples inside .PostPredSample live in the one-dimensional model output space. The sample in .PostPredSample.PostPred (as discussed, this naming is not ideal and will be changed in the next release) is a sample of the posterior predictive distribution defined in Sec. 1.2.6 of the user manual.

Regarding the difference between the two samples you mention:

The sample in .PostPredSample.Post consists of the model evaluations that were used to compute .PostPredSample.PostPred.
The sample in .PostProc.PostModel consists of the model evaluations at the post-processed posterior sample .PostProc.PostSample.

In practice, .PostPredSample.Post contains the model evaluations at a subset of .PostProc.PostSample, i.e., it is a subset of the evaluations in .PostProc.PostModel.
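
To make the dimensions concrete, here is a small plain-Python sketch for the model from the question, $\mathcal{M} = X_1 + 10 X_2$. It is not UQLab code: the variable names only mimic the field names discussed above, the posterior sample is faked with arbitrary numbers, and drawing indices with replacement is an assumption. The point is that the parameter sample has three entries per point (two inputs plus the discrepancy variance), while everything in the predictive structures is a scalar model output.

```python
import random

rng = random.Random(1)


def model(x1, x2):
    # Forward model from the question: M = X1 + 10 * X2 (scalar output).
    return x1 + 10 * x2


# Hypothetical post-processed posterior sample: rows of (X1, X2, sigma2).
post_sample = [
    (rng.gauss(0.5, 0.05), rng.gauss(0.2, 0.01), rng.lognormvariate(-3.0, 0.2))
    for _ in range(500)
]

# Role of .PostProc.PostModel: one scalar evaluation per posterior point.
post_model = [model(x1, x2) for x1, x2, _ in post_sample]

# Role of .PostPredSample.Post: evaluations at randomly chosen posterior points.
idx = rng.choices(range(len(post_sample)), k=100)
post = [post_model[i] for i in idx]

# Role of .PostPredSample.PostPred: the same evaluations plus Gaussian noise.
postpred = [rng.gauss(post_model[i], post_sample[i][2] ** 0.5) for i in idx]

n_params = len(post_sample[0])      # 3: X1, X2 and the discrepancy variance
is_scalar = isinstance(post[0], float)
is_subset = set(post) <= set(post_model)
```

Under these assumptions, `post` and `postpred` hold one number per draw because the model output is one-dimensional, and every value in `post` also appears in `post_model`.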

Hope this helps!