# How to do inference with two data groups and single discrepancy?

Hi

I am trying to figure out how to represent my problem in UQLab. My model consists of two random variables and a parameter leading to a single output. It’s given as:

\tau(t) = \tau_0 - Ct,

where \tau \in \mathbb{R}^1, \tau_0\sim \mathcal{N}(\mu = 50, \sigma = 0.61), C\sim \mathcal{LN}(\mu_{\ln C} = -2.73, \sigma_{\ln C} = 0.47), discrepancy \varepsilon \sim \mathcal{N}(0, \sigma_{\varepsilon}), , \sigma_{\varepsilon} \sim \mathcal{LN}(\mu = 1, \sigma = 5), and t is a deterministic parameter.

I have two data groups for t = 10 and 15, respectively. If I create two data groups then I end up having two different discrepancies, which I don’t want in my problem.

For this problem, I specified the model as a single model with two outputs with t = 10 and 15 as follows:

func = @(X) [X(1)-X(2)*10, X(1)-X(2)*15];
ModelOpts.Name = 'Forward model';
ModelOpts.mHandle= func;
myForwardModel = uq_createModel(ModelOpts);


Based on some suggestions in this forum, I converted my data into a single datagroup and specified MOMap as follows:

% group 1 -- measurements at year 10 and 15
Data(1).y = [ut_10(:,2)', ut_15(:,2)'];
Data(1).Name = 'Thickness measurements at year 10';
Data(1).MOMap = [ones(1, l_10+l_15); ones(1, l_10), ones(1,l_15)*2]; % Model Output Map
DiscrepancyPriorOpts.Name = 'Prior of sigma at year 10 and 15';
DiscrepancyPriorOpts.Marginals(1).Name = 'Sigma1_2';
DiscrepancyPriorOpts.Marginals(1).Type = 'Lognormal';
DiscrepancyPriorOpts.Marginals(1).Parameters = [-3.26, sqrt(2)*1.81]; % Product of two logn rvs: sigma * sigma
DiscrepancyPrior10 = uq_createInput(DiscrepancyPriorOpts);
DiscrepancyOpts(1).Type = 'Gaussian';
DiscrepancyOpts(1).Prior = DiscrepancyPrior10;


But in the output of the analysis I think there are some problems. Firstly, I would expect that the model outputs are 2. Secondly, I expect that the total number of observations should be l_10 + l_15 = 1418. The analysis output is given below:

 %----------------------- Inversion output -----------------------%
Number of calibrated model parameters:         2
Number of non-calibrated model parameters:     0

Number of calibrated discrepancy parameters:   1

%------------------- Data and Discrepancy
%  Data-/Discrepancy group 1:
Number of independent observations:            1

Discrepancy:
Type:                                       Gaussian
Discrepancy family:                         Scalar
Discrepancy parameters known:               No

Associated outputs:
Model 1:
Output dimensions:                       1
to
1
2
to
2


What am I doing wrong?

1 Like

Are you referring to the discussion on this topic? This is a work-around that should in principle produce the right results, but is not officially supported by UQLab.

However, I think that it should also work in your case, and to me it looks like everything went well inside the analysis. Do you get unexpected results?

The issues you mention can be easily explained:
The number of model outputs are 2, although I agree that

1
to
1
2
to
2


should be better formatted as

1
to
2


The number of observations in the UQLab sense is only 1, because Data(1).y contains only one row. I suggest you have a look inside uq_inversion_likelihood, to see how UQLab compares your data to the model outputs.

Let me know if you have further questions.

1 Like

Dear @Shihab_Khan,

a minor remark: Since I managed to confuse myself sometimes by updating the contents of a UQLab object without updating the name I suggest to replace

Data(1).Name = 'Thickness measurements at year 10';


by

Data(1).Name = 'Thickness measurements at year 10 and  at year 15';

1 Like

Hi @paulremo ,
I don’t have any reasons to doubt my results except the two issues which I’ve already shared with you. And yes, I am referring to that very topic.

But I do have a question. From the Manual 3.1.1, it seems that the observations are to be provided as a column vector. The same can be observed in the beam example provided in the Manual in Section 2.3. So then why do we convert the data as a row vector in this workaround?

Thank you very much for the input @olaf.klein . I agree with you and have made the required change

The manual is correct but, as I mentioned before, supplying all measurements as a row vector is a workaround . It is all about achieving what you want with the tools currently implemented in UQLab.

Denoting a data group by \mathcal{G}^{(g)}=\{y_g^{(1)},\cdots,y_g^{(N_{\mathcal{G}}^{(g)})}\}, such that in your case with N_{\mathrm{gr}}=2 \mathcal{Y}=\mathcal{G}^{(1)}\cup\mathcal{G}^{(2)}, the likelihood you are trying to construct is

\mathcal{L}(\mathbf{x},\sigma;\mathcal{Y}) = \prod_{g=1}^{N_{\mathrm{gr}}}\prod_{i=1}^{N_{\mathcal{G}}^{(g)}} \pi(y_g^{(i)}\vert\mathbf{x},\sigma),

with the discrepancy standard deviation denoted by \sigma and \pi(\cdot\vert\mathbf{x},\sigma) denoting a Gaussian PDF.

What you pass to UQlab is a slightly different model. You say that there is only one data group with \mathcal{Y}=\{y^{(1)},\cdots,y^{(N_{\mathcal{Y}})}\}, where N_{\mathcal{Y}}=N_{\mathcal{G}}^{(1)} + N_{\mathcal{G}}^{(2)}. The likelihod function UQLab then constructs is

\mathcal{L}(\mathbf{x},\sigma;\mathcal{Y}) = \prod_{i=1}^{N_{\mathcal{Y}}} \pi(y^{(i)}\vert\mathbf{x},\sigma),

which is equivalent to the previous likelihood.

It might be worth a try to implement the likelihood function yourself with the user-defined likelihood function feature to see what happens inside the code. This will also allow you to solve more general problems in the future.

1 Like

Thanks a lot @paulremo for the explanation. This makes total sense. I will implement a custom likelihood and play some things out for a deeper understanding.

Thanks again!