Dear Marc,
to perhaps reduce your confusion, I would like to present my personal interpretation of the results of Bayesian inversion, following (with modifications)
J. Kaipio, E. Somersalo, "Statistical and Computational Inverse Problems", 2004, Sec. 3.1:
- Ignoring that your model (as all models) is not a perfect representation of reality, I assume that there is a fixed but unknown true parameter pair (E_{true}, \lambda_{true}) such that the real-world measurements (= observations) equal the model output for this pair plus some added noise (i.e., model bias is ignored here!).
- Before dealing with the observations, one derives a continuous random variable (E_{prior}, \lambda_{prior}) representing all information/guesses/beliefs/expert opinions one had about the pair (E_{true}, \lambda_{true}) before the observation S_{observation} was made/evaluated. The density of (E_{prior}, \lambda_{prior}) is the prior density. In your considerations, the components of (E_{prior}, \lambda_{prior}) are independent, so the prior density is the product of the prior densities for E and \lambda.
- Now, one considers S_{observation} as an output of the model (with added noise) for a sample of (E_{prior}, \lambda_{prior}). By applying Bayes' theorem one obtains a formula for the conditional probability distribution of (E_{prior}, \lambda_{prior}) given S_{observation}, and by using UQLab one obtains a set of samples from this distribution, as shown in your first post.
- This conditional probability distribution (= posterior distribution) expresses what one can deduce about (E_{true}, \lambda_{true}) by combining the information encoded in the prior densities with the observation S_{observation} of the model output with noise (i.e., discrepancy). Hence, one should use this conditional probability distribution for the subsequent computations, i.e. one should focus on (E_{prior}, \lambda_{prior} | S_{observation}) with its "different properties" rather than on the prior distribution; see the formula sketch after this list.
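For reference, the above can be written down in formulas. Here M denotes the forward model, and the discrepancy is assumed, purely for illustration, to be additive Gaussian noise with standard deviation \sigma (your actual discrepancy model may differ):

```latex
% Bayes' theorem for the parameter pair, with M the forward model and
% sigma the standard deviation of the (assumed Gaussian) discrepancy:
\[
  \pi\bigl(E,\lambda \mid S_{\mathrm{observation}}\bigr)
  \;\propto\;
  \underbrace{p\bigl(S_{\mathrm{observation}} \mid E,\lambda\bigr)}_{\text{likelihood}}
  \cdot
  \underbrace{\pi_E(E)\,\pi_\lambda(\lambda)}_{\text{prior (independent components)}}
\]
\[
  S_{\mathrm{observation}} = M(E,\lambda) + \varepsilon,
  \quad \varepsilon \sim \mathcal{N}(0,\sigma^{2})
  \quad\Longrightarrow\quad
  p\bigl(S_{\mathrm{observation}} \mid E,\lambda\bigr)
  = \frac{1}{\sqrt{2\pi}\,\sigma}
    \exp\!\left(-\frac{\bigl(S_{\mathrm{observation}} - M(E,\lambda)\bigr)^{2}}{2\sigma^{2}}\right)
\]
```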
Concerning the use of an unknown discrepancy: especially if you do not know the value of the discrepancy, I would suggest using an unknown discrepancy, even if the formulation of the Bayesian inverse problem becomes more complicated. If you work with a known discrepancy, you have to provide its value. If this value is too large, the likelihood function may become too flat, so that the MCMC algorithm may produce a sample set whose uncertainty is too large. Hence, one may have to play around with the provided value for the discrepancy, check whether this affects the uncertainties in E and \lambda, and may have to try to find an optimal value for the discrepancy.
By using an unknown discrepancy, one expects (or at least hopes) that in the resulting sets of samples the value of the discrepancy is automatically chosen as an approximation of the optimal value and that the uncertainty in E and \lambda may be optimized. A small sketch of this approach is given below.
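To make the difference between a known and an unknown discrepancy concrete, here is a minimal sketch outside of UQLab, in plain Python: the forward model, the prior bounds, the observation value, and the proposal step sizes are all made-up placeholders, and the sampler is a plain random-walk Metropolis, not the algorithm UQLab uses. The discrepancy standard deviation sigma is treated as an additional unknown parameter with its own prior and is sampled together with E and \lambda:

```python
import numpy as np

# --- Illustrative setup (all values are made up; replace with your model/data) ---
def forward_model(E, lam):
    """Hypothetical scalar forward model M(E, lambda)."""
    return E * np.exp(-lam)

S_observation = 2.0               # hypothetical single observation
rng = np.random.default_rng(0)

# Priors: uniform for E and lambda, log-uniform for the unknown discrepancy sigma
def log_prior(E, lam, sigma):
    if not (1.0 < E < 5.0 and 0.1 < lam < 2.0 and 1e-3 < sigma < 1.0):
        return -np.inf
    return -np.log(sigma)         # log-uniform prior on sigma

# Gaussian likelihood; sigma is *not* fixed but part of the parameter vector
def log_likelihood(E, lam, sigma):
    resid = S_observation - forward_model(E, lam)
    return -0.5 * (resid / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def log_posterior(theta):
    E, lam, sigma = theta
    lp = log_prior(E, lam, sigma)
    return lp + log_likelihood(E, lam, sigma) if np.isfinite(lp) else -np.inf

# --- Plain random-walk Metropolis over (E, lambda, sigma) ---
theta = np.array([3.0, 1.0, 0.1])     # starting point inside the prior bounds
step = np.array([0.2, 0.1, 0.02])     # proposal standard deviations
samples = []
log_p = log_posterior(theta)
for _ in range(20000):
    proposal = theta + step * rng.standard_normal(3)
    log_p_new = log_posterior(proposal)
    if np.log(rng.uniform()) < log_p_new - log_p:   # accept/reject step
        theta, log_p = proposal, log_p_new
    samples.append(theta.copy())

samples = np.array(samples[5000:])    # discard burn-in
print("posterior means (E, lambda, sigma):", samples.mean(axis=0))
```

If you instead wanted to work with a known discrepancy, you would fix sigma inside log_likelihood and remove it from the parameter vector; the posterior uncertainty in E and \lambda then depends directly on the value you chose.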
Greetings
Olaf