Bayesian inversion: how should the discrepancy model be calculated?

Dear Irma Isnardi,

Starting with an answer to the question of how the posterior predictive distribution is created by UQLab: for every sampled pair of values for RL12 and RL23, the model output is computed and a discrepancy sample is added afterwards. This sample uses either the known fixed values for the variances of the different components in DiscrepancyOpts.Parameters or, if you use an unknown discrepancy following Section 2.2.6.2, the sampled discrepancy/variance value that accompanies the sampled values for RL12 and RL23 as the third component of the sample triple.
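As a sketch of this construction (not UQLab's internal code; `PostSample`, `myModel`, and the variances `s1, s2, s3` are placeholders for your own objects, and `uq_evalModel` is the standard UQLab model evaluator), the fixed-discrepancy case looks roughly like this:

```matlab
% Sketch: posterior predictive samples from posterior parameter samples,
% assuming known, fixed discrepancy variances per output component
% (the DiscrepancyOpts.Parameters case). Placeholder names throughout.
sigma2 = [s1, s2, s3];                     % known variances of the 3 components
nPost  = size(PostSample, 1);              % posterior samples of [RL12, RL23]
yPred  = zeros(nPost, 3);
for k = 1:nPost
    yModel = uq_evalModel(myModel, PostSample(k, :));   % 1 x 3 model output
    yPred(k, :) = yModel + sqrt(sigma2) .* randn(1, 3); % add discrepancy sample
end
```

In the unknown-discrepancy case of Section 2.2.6.2, `sigma2` is not fixed but is replaced, in each loop iteration, by the variance sample stored as the third component of the k-th posterior sample triple.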
If you are interested in plots of the density of the model outputs without the added noise, you may use the function in the topic Further plot possibilities for results of Bayesian inversion (Suggestions and code example), at least if you are still using UQLab 1.4. I have not tested my function with UQLab V2.0 yet.

A remark: since you have used three different discrepancy values for the three output components of your model, you should maybe consider a separate prior and posterior density for the discrepancy of each of your components, i.e. instead of following Section 2.2.6.2, you should follow the discussion at the end of Section 2.3.4.2, with the modification that you will need to define a prior with three marginals and not only two.
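A minimal sketch of such a setup, assuming the pattern from the manual (the marginal types, bounds, and names here are placeholders that you must adapt to your data):

```matlab
% Sketch: an unknown Gaussian discrepancy with a separate variance
% prior for each of the three output components. Bounds are placeholders.
SigmaOpts.Marginals(1).Name = 'Sigma2_1';
SigmaOpts.Marginals(1).Type = 'Uniform';
SigmaOpts.Marginals(1).Parameters = [0, 1];
SigmaOpts.Marginals(2).Name = 'Sigma2_2';
SigmaOpts.Marginals(2).Type = 'Uniform';
SigmaOpts.Marginals(2).Parameters = [0, 1];
SigmaOpts.Marginals(3).Name = 'Sigma2_3';
SigmaOpts.Marginals(3).Type = 'Uniform';
SigmaOpts.Marginals(3).Parameters = [0, 1];
SigmaDist = uq_createInput(SigmaOpts);

DiscrepancyOpts.Type  = 'Gaussian';
DiscrepancyOpts.Prior = SigmaDist;
BayesOpts.Discrepancy = DiscrepancyOpts;
```

Please compare this against the example in Section 2.3.4.2 of the Bayesian inversion manual before using it; in particular, check how the discrepancy components are mapped to your output components there.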

If I understand correctly, you have an experiment that allows you to get observations belonging to different values of the model parameters, these values being samples of some controlled parameter uncertainty, and you want to somehow reproduce this uncertainty by using UQLab to apply Bayes' theorem. If this is correct, then I have bad news: this will not work, at least not directly. I tried this myself in the past and had to realize that, even though many books and papers claim that one application of Bayes' theorem allows one to extract information about some hidden parameter density generating the observations, this is not true. One application of Bayes' theorem generates a posterior density containing only the information one has on the single parameter value that created all observations, with the differences between the observations attributed to different measurement errors, and nothing more.

For example, if one considers a linear model, has derived a prior density for the parameter and a value for the discrepancy, and is dealing with observations y_1, y_2, \ldots, y_K, then the posterior density following from Bayes' theorem is the same as the one obtained by considering as observations K repetitions of the mean of y_1, y_2, \ldots, y_K.
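One can check this numerically without UQLab, by evaluating both posteriors on a grid; the model, prior, observations, and variance below are just an example:

```matlab
% Demo: for a linear model with Gaussian likelihood and fixed discrepancy
% variance, the posterior from y_1,...,y_K equals the posterior from
% K repetitions of mean(y). All numbers here are example values.
M      = @(x) 2*x + 1;            % some linear forward model
y      = [3.1; 2.7; 3.4; 2.9];    % observations
K      = numel(y);
sigma2 = 0.25;                    % fixed discrepancy variance
prior  = @(x) normpdf(x, 0, 2);   % some prior density

xg    = linspace(-5, 5, 2001);
logL  = @(data) arrayfun(@(x) -sum((data - M(x)).^2) / (2*sigma2), xg);
post  = @(data) exp(logL(data)) .* prior(xg);
nrm   = @(p) p / trapz(xg, p);

p1 = nrm(post(y));                        % posterior from the K observations
p2 = nrm(post(repmat(mean(y), K, 1)));    % posterior from K copies of the mean
max(abs(p1 - p2))                         % agrees up to floating-point error
```

The reason is that the log-likelihood splits as -\big(K(\mathcal M(x) - \bar y)^2 + \sum_k (y_k - \bar y)^2\big)/(2\sigma^2), and the second term does not depend on x, so it cancels after normalization.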
Considering the general situation, see e.g. the derivation in Sections 1.2.2 and 1.2.3 of the manual, the model output \mathcal M(\mathbf{x}) for one parameter value \mathbf{x} is considered as an approximation of all observations, i.e. it is implicitly assumed that the observations only differ because different values of the measurement error occurred between the different observations of the same quantity. Therefore, in your computations the density computed by UQLab represents the information that one has on the one parameter value \mathbf{x}_{one} such that \mathcal M(\mathbf{x}_{one}) + noise creates all observations; see also my discussion in the post Bayesian inversion : limit the correlation of posterior inputs - #4 by olaf.klein.
If you reduce the value of the discrepancy sufficiently, you will get violin plots in which many of the measured values lie outside of the posterior “bubble”, i.e. decreasing the discrepancy will not increase the variation of the computed samples for the model parameters, as I believed for some time.
My approach to dealing with parameter values that differ between observations is to collect all observations belonging to one parameter value and to formulate the Bayesian problem only for these data. Doing this for all collections, you can perform one separate run of uq_createAnalysis for each of them, storing the different results. At the end, one can merge the sample arrays from the different ANALYSIS objects.
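In outline, this could look as follows; `Data{g}` and `nGroups` are placeholders, and you should check the exact location and shape of the post-processed sample array (here assumed to be `Results.PostProc.PostSample` with dimensions nSteps x M x nChains) in your UQLab version:

```matlab
% Sketch: one separate inversion per group of observations belonging to
% the same parameter value, then merging the posterior sample arrays.
for g = 1:nGroups
    BayesOpts.Data.y = Data{g};                % observations of group g
    myBayes{g} = uq_createAnalysis(BayesOpts); % separate Bayesian analysis
    uq_postProcessInversion(myBayes{g});       % default post-processing
end

% Merge the flattened posterior samples of all groups into one array:
AllSamples = [];
for g = 1:nGroups
    S = myBayes{g}.Results.PostProc.PostSample;              % nSteps x M x nChains
    AllSamples = [AllSamples; ...
                  reshape(permute(S, [1 3 2]), [], size(S, 2))];
end
```

The merged array then reflects the parameter variation across the groups, which a single joint inversion of all data cannot provide, for the reasons discussed above.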

Hope that this still helps
Greetings
Olaf
