Clarification on UQ example on inversion with user defined likelihood function

Shihab_Khan · October 14, 2019, 9:58am

Hi UQWorld!

My question concerns using a user-defined likelihood function for inversion. As I understand from the documentation, there are only three requirements on a user-supplied log-likelihood function (please correct me if I’m wrong):

It must have two arguments: params and y
The size of params should be C\times M, where C is the number of MCMC chains and M is the number of parameters including the discrepancy terms
The output should be the log-likelihood vector of size C\times 1
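
To make the three requirements concrete, here is a minimal sketch of a function with that shape. It is not taken from the UQLab documentation: the name gaussLogL, the plain Gaussian likelihood, and all numbers are assumptions chosen only to show the array sizes involved. It relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Hypothetical stand-in for a real log-likelihood: N independent Gaussian
% observations with mean params(:,1) and variance params(:,2).
% params is C-by-M (here M = 2); the returned vector is C-by-1.
gaussLogL = @(params, y) ...
    -0.5 * numel(y) * log(2*pi*params(:,2)) ...
    - 0.5 * sum((y(:).' - params(:,1)).^2, 2) ./ params(:,2);

params = [1.0 0.5; 2.0 0.5; 1.5 1.0];   % C = 3 chains, M = 2 parameters
y      = [1.2; 0.9; 1.4];               % N = 3 observations

logL = gaussLogL(params, y);
disp(size(logL))                        % 3-by-1: one value per chain
```

Each row of params is treated as one independent parameter vector, so the function never needs to know how many chains are running.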

I have gone through the example provided in the documentation and I have the following questions (please note that I have only a single-parameter model):

How should I specify the discrepancy term? In my case, the true quantity R_i is estimated with a zero-mean Gaussian discrepancy. This estimate, X_m, is reported back as an interval, i.e. y_i \sim a_{m_j} \leq x_{m_i} < a_{m_{j+1}}. Should I specify the discrepancy as

 PriorOpts.Marginals(2).Name = 'sigma2';
 PriorOpts.Marginals(2).Type = 'Gaussian';
 PriorOpts.Marginals(2).Parameters = [0, eps^2];
or 
 PriorOpts.Marginals(2).Name = 'sigma2';
 PriorOpts.Marginals(2).Type = 'Constant';
 PriorOpts.Marginals(2).Parameters = eps^2;

If I define it as Gaussian, I notice negative values in params(:,2), the second column of the log-likelihood function argument. As per my understanding, the likelihood should be evaluated with the Gaussian distribution centered at params(:,1) and its standard deviation fixed to params(:,2). Am I missing something here?
If I specify it as a constant, then UQLab doesn’t supply anything in params(:,2). What’s wrong?

I think I have messed up in understanding the terms somewhere. Am I doing it right?

paulremo · October 14, 2019, 4:16pm

Hi Shihab

The possibility to specify user-defined likelihood functions was added so that complex discrepancy models (e.g. non-Gaussian, parameterized covariance matrix, etc.) could be handled with the Bayesian module in UQLab.

I don’t fully understand your question, because it seems that you are trying to use a standard Gaussian discrepancy model. In this case you do not need a user-defined likelihood function but can use the standard UQLab discrepancy model options. Could you maybe explain your discrepancy model in a bit more detail?

Shihab_Khan · October 15, 2019, 12:57am

Hi Paul

I’m measuring the instantaneous capacity of a structural component, R_t, using visual inspection. The way I’m modelling visual inspection is as follows: the engineer carrying out the visual inspection estimates R_i with a Gaussian discrepancy, i.e. \log X_m = \log R_i + \epsilon, and then reports the estimate back as an interval: y_i \sim a_{m_j} \leq x_{m_i} < a_{m_{j+1}}, where the a_{m_j} are some predefined interval bounds.

The reason I reckoned that I needed a user-defined likelihood function is that the likelihood, in this case, would be the integral of the Gaussian PDF between the limits a_{m_j} and a_{m_{j+1}} rather than an evaluation at a point. Would this be possible using UQLab? Or do I need a custom log-likelihood function?

I suppose there could be other approaches to modelling visual inspection as well. We could have a discrete parameter - discrete measurement case. Then the likelihood would take the form of a table. Wouldn’t I require a custom log-likelihood function in that case too?
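
To illustrate what I mean by a likelihood table, here is a rough sketch. Everything in it is hypothetical: the table entries, the three states and categories, and the name tableLogL are made up for illustration, and I have not checked whether a discrete parameter can actually be sampled this way in UQLab.

```matlab
% Hypothetical likelihood table: entry (j,k) is the probability of reporting
% category j when the true state is k. Each column sums to 1.
likTable = [0.7 0.2 0.1; ...
            0.2 0.6 0.3; ...
            0.1 0.2 0.6];

% states: C-by-1 vector of candidate states (one per chain)
% obs:    N-by-1 vector of reported categories
% Returns a C-by-1 vector of log-likelihoods, assuming independent reports.
tableLogL = @(states, obs) sum(log(likTable(obs(:), states(:))), 1).';

obs    = [1; 2; 2];        % three reported categories
states = [1; 2; 3];        % three candidate states
logL   = tableLogL(states, obs);
disp(logL.')
```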

paulremo · October 15, 2019, 3:20pm

Hi Shihab

Yes, for this type of problem you need to resort to the user-defined likelihood function feature. Generally, you should first try to pose your problem properly before jumping into the implementation of the likelihood function.

Assume your model takes a parameter \mathbf{X}\in\mathbb{R}^M as input and predicts scalar observables Y\in\mathbb{R}. You first have to construct a discrepancy model \pi(y\vert \mathbf{x}) that explains the uncertainty in the observables Y given a certain realization \mathbf{X} = \mathbf{x} of the parameter vector. The discrepancy model implemented in the Bayesian module of UQLab is the most common one, the additive Gaussian discrepancy model

Y = \mathcal{M}(\mathbf{X}) + E, \quad \text{where} \quad E\sim\mathcal{N}(\varepsilon\vert 0,\sigma^2)

This directly yields

\pi(y\vert \mathbf{x}) = \mathcal{N}(y\vert \mathcal{M}(\mathbf{x}),\sigma^2)

In constructing your specific discrepancy model, you need to ask yourself what discrepancy you expect between your data and the model output. It seems like your data is collected as discrete values (intervals between a_{i} and a_{i+1}). You should therefore come up with a discrete discrepancy model. When using an additive Gaussian discrepancy E, one way to do this is to integrate the Gaussian density between the bounds a_{i} and a_{i+1}:

P(y\vert \mathbf{x}) = \int_{a_{i}}^{a_{i+1}}\mathcal{N}(a\vert \mathcal{M}(\mathbf{x}),\sigma^2)\,\mathrm{d}a, \quad \text{where} \quad y\in[a_i,a_{i+1}]

Note that P is a probability and not a density!
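
For a Gaussian density this integral has a closed form as a difference of two normal CDF values, P(y\vert \mathbf{x}) = \Phi((a_{i+1}-\mathcal{M}(\mathbf{x}))/\sigma) - \Phi((a_{i}-\mathcal{M}(\mathbf{x}))/\sigma). A minimal sketch of that computation follows; the names stdNormCdf and intervalProb and the numbers are assumptions for illustration, not part of UQLab.

```matlab
% Standard normal CDF written with erfc, which is part of base MATLAB
% (no Statistics Toolbox needed).
stdNormCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));

% P(aLo <= Y < aHi | x) for Y ~ N(mu, sigma2), where mu = M(x).
intervalProb = @(mu, sigma2, aLo, aHi) ...
    stdNormCdf((aHi - mu) ./ sqrt(sigma2)) - ...
    stdNormCdf((aLo - mu) ./ sqrt(sigma2));

% Illustrative numbers only: model output 1.0, variance 0.25, bin [0.5, 1.5).
p = intervalProb(1.0, 0.25, 0.5, 1.5);
fprintf('P = %.4f\n', p);   % P = 0.6827, the mass within one sigma
```

Since the result is a probability, it always lies between 0 and 1, unlike a density value, which can exceed 1 for small \sigma.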

Under the assumption of independence between individual measurements, the likelihood function can then be constructed as

\mathcal{L}(\mathbf{x};\mathcal{Y}) = \prod_{i=1}^N P(y^{(i)}\vert \mathbf{x}), \quad \text{where} \quad \mathcal{Y}=\{y^{(i)}\}_{i=1,\dots,N}

Note that this likelihood function is discontinuous and, depending on how you want to solve the inverse problem (e.g., with gradient-based procedures), you could run into considerable problems.
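
As a rough sketch of how this could look as a log-likelihood with the params and y arguments, under several assumptions of mine: the data y is stored as an N-by-2 array of interval bounds, the forward model is a hypothetical handle (identity here), column 1 of params is the model parameter, and column 2 is the variance \sigma^2. It uses implicit expansion (MATLAB R2016b or later).

```matlab
stdNormCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));   % standard normal CDF

% Hypothetical forward model, applied element-wise to the C-by-1 parameter.
model = @(x) x;

% params: C-by-2, [model parameter, variance sigma^2]
% y:      N-by-2, each row is [aLo, aHi] of one reported interval
% Returns C-by-1: sum over observations of log P(y_i | x).
% max(..., realmin) guards against log(0) far out in the tails.
intervalLogL = @(params, y) sum(log(max( ...
    stdNormCdf((y(:,2).' - model(params(:,1))) ./ sqrt(params(:,2))) - ...
    stdNormCdf((y(:,1).' - model(params(:,1))) ./ sqrt(params(:,2))), ...
    realmin)), 2);

y      = [0.5 1.5; 1.0 2.0; 0.0 1.0];   % N = 3 reported intervals
params = [1.0 0.25; 3.0 0.25];          % C = 2 chains
logL   = intervalLogL(params, y);
disp(logL.')                            % first chain is far more likely
```

Two caveats: the variance column must stay positive, so its prior needs positive support (a zero-mean Gaussian prior on \sigma^2 produces negative samples and complex square roots), and the CDF difference loses precision for intervals far from the model output. For a discrepancy defined on the log scale, as in \log X_m = \log R_i + \epsilon, the same sketch should apply with log-transformed bounds and model = @(x) log(x).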

Let me know how it goes!

Shihab_Khan · October 17, 2019, 6:37am

Hi Paul

Thanks a lot for your inputs.

I realized I was confusing a number of things here. I also didn’t realize that the discrepancy parameter was being updated as well in the documentation example on the custom likelihood function.

Do you have any suggestions for other problems that would require a custom likelihood function, so that I can reproduce them and validate UQLab and my implementations?

paulremo · October 17, 2019, 7:18am

Hi Shihab

Sure - it would be great if you could share the likelihood function you end up using!

The likelihood function is derived directly from the discrepancy model used, the most common one being the additive Gaussian discrepancy explained in the previous post. Any discrepancy model that cannot be expressed in this form needs to enter UQLab through a user-defined likelihood function. Problems that fall under that category are:

Non-Gaussian additive discrepancy: Y = \mathcal{M}(\mathbf{X}) + E, where E is distributed according to a non-Gaussian distribution.
Multiplicative discrepancy: Y = \mathcal{M}(\mathbf{X})\cdot E
Gaussian additive discrepancy with parameterized covariance matrix: \mathbf{Y} = \mathcal{M}(\mathbf{X}_{\mathcal{M}}) + \mathbf{E}, where \mathbf{E}\sim\mathcal{N}(\mathbf{\varepsilon}\vert \mathbf{0},\Sigma(\mathbf{X}_{\varepsilon}))
…
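
As one possible illustration of the multiplicative case, suppose (this is an assumption, not the only choice) that E is lognormal with \log E\sim\mathcal{N}(0,\sigma^2). Then \log Y\sim\mathcal{N}(\log\mathcal{M}(\mathbf{x}),\sigma^2), and the density of each observation is a lognormal PDF. The forward model and the name multLogL below are hypothetical; the sketch needs positive data and model outputs, and implicit expansion (MATLAB R2016b or later).

```matlab
% Hypothetical forward model with strictly positive output.
model = @(x) x.^2;

% params: C-by-2, [model parameter, variance sigma^2 of log E]
% y:      N-by-1 positive observations
% Returns C-by-1 lognormal log-likelihoods, summed over the observations.
multLogL = @(params, y) sum( ...
    -log(y(:).') ...
    - 0.5 * log(2*pi*params(:,2)) ...
    - (log(y(:).') - log(model(params(:,1)))).^2 ./ (2*params(:,2)), 2);

y      = [4; 4];                 % N = 2 observations
params = [2 0.1; 3 0.1];         % C = 2 chains
logL   = multLogL(params, y);
disp(logL.')                     % chain 1 reproduces the data exactly
```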

It is virtually impossible to give an exhaustive list, which is exactly why the user-defined likelihood feature was included.