MOMap definition for multiple models with multidimensional outputs (UQ[py]Lab - Bayesian inversion)

Dear UQWorld community,

I am trying to use the Bayesian inversion module of UQ[py]Lab to solve an inverse problem based on synthetic data.

I will describe briefly the error I encounter and then try to be as precise as possible about my test case so you can help me out. :slight_smile:

The error: when trying to run uq.createAnalysis(BayesOpts), I get the following message: ‘MOMap of data group 1 is not consistent with supplied data’.

Description of my test case:

I have 5 forward models, which are PCE metamodels (already validated) reproducing the output of my physically based forward model for 5 output variables of interest (geophysical signals for 5 different acquisition parametrizations).

The 5 PCE metamodels use the same set of input parameters (the parameters of my physically based forward model) and each returns a time series of signals (a numpy array of size Nout = 66).

It is a synthetic test case in the sense that I used my forward model with a ‘true’ set of parameters to produce the data used in the inversion, and I am trying to recover that ‘true’ parameter set with Bayesian inversion.

Here is how I set up the UQ[py]Lab inversion:

Forward model definition:

forwardModels = [
    {
        'Model' : myPCE['Name']
    },
    {
        'Model' : myPCE_E2['Name']
    },
    {
        'Model' : myPCE_E3['Name']
    },
    {
        'Model' : myPCE_E4['Name']
    },
    {
        'Model' : myPCE_E5['Name']
    }
]

Parameter prior distributions:

myPriorDist = uq.createInput(InputOpts)

(with InputOpts describing the prior of each of the 18 model parameters).
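For reference, a minimal sketch of what such an InputOpts structure might look like (the parameter names, marginal types, and bounds below are placeholders, not the actual priors of the problem):

```python
# Hypothetical sketch: 18 priors with placeholder names and bounds
# (the actual marginal types and parameters are problem-specific)
n_params = 18
InputOpts = {
    'Marginals': [
        {'Name': f'X{i + 1}', 'Type': 'Uniform', 'Parameters': [0.0, 1.0]}
        for i in range(n_params)
    ]
}
# myPriorDist = uq.createInput(InputOpts)   # as in the post
```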

Measurement data:

myData = [
     # Data group 1
    {
        'y': [Y_E1_true.tolist()],
        'Name': 'E0(q1)',
        'MOMap': MOmap1.astype(int).tolist()        # Model Output map
    },
     # Data group 2
    {
        'y': [Y_E2_true.tolist()],
        'Name': 'E0(q2)',
        'MOMap': MOmap2.astype(int).tolist()        # Model Output map
    },
     # Data group 3
    {
        'y': [Y_E3_true.tolist()],
        'Name': 'E0(q3)',
        'MOMap': MOmap3.astype(int).tolist()        # Model Output map
    },
     # Data group 4
    {
        'y': [Y_E4_true.tolist()],
        'Name': 'E0(q4)',
        'MOMap': MOmap4.astype(int).tolist()        # Model Output map
    },
     # Data group 5 
    {
        'y': [Y_E5_true.tolist()],
        'Name': 'E0(q5)',
        'MOMap': MOmap5.astype(int).tolist()        # Model Output map
    },
]

where each data array “Y_EX_true” is a numpy array containing the outputs of the forward model for the ‘true’ parametrization and acquisition setup X,

and MOmapX is defined as follows:

nt = Y_E1_true.shape[0]   # number of output points per model (66)

MOmap = np.ones((nt, 2))
MOmap[:, 1] = np.arange(1, nt + 1)   # second column: output index 1..nt

MOmap1 = cp.deepcopy(MOmap)          # first column stays 1 (model 1)

MOmap2 = cp.deepcopy(MOmap)
MOmap2[:, 0] = 2 * MOmap2[:, 0]      # first column becomes 2 (model 2)
...

Which gives:
MOmap1 = [ [1,1], [1,2], …, [1,66] ]
MOmap2 = [ [2,1], [2,2], …, [2,66] ]
…
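For what it's worth, the five MOMap arrays above can also be generated in one go; a sketch assuming Nout = 66 outputs and 5 models:

```python
import numpy as np

Nout = 66      # length of each model's output time series
n_models = 5

# One MOMap per model: column 0 = model index, column 1 = output index
MOmaps = [
    np.column_stack((np.full(Nout, m, dtype=int),
                     np.arange(1, Nout + 1)))
    for m in range(1, n_models + 1)
]
# MOmaps[0] is [[1, 1], [1, 2], ..., [1, 66]], MOmaps[1] is [[2, 1], ..., [2, 66]], etc.
```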

I have checked that the sizes of the lists correspond:
len(Y_EX_true.tolist()) returns 66,
and len(MOmapX.astype(int).tolist()) also returns 66.

For the discrepancy model, I chose an unknown discrepancy: i.i.d. Gaussian with unknown variance, whose prior is uniform on [0, 50]:

SigmaOpts = {
    'Marginals': [
        {
            'Name': 'Sigma2E1',
            'Type': 'Uniform',
            'Parameters': [0,50]      # (nV^2)
        }
    ]
}

SigmaDistE1 = uq.createInput(SigmaOpts)

SigmaOpts['Marginals'][0]['Name'] = 'Sigma2E2'
SigmaDistE2 = uq.createInput(SigmaOpts)

SigmaOpts['Marginals'][0]['Name'] = 'Sigma2E3'
SigmaDistE3 = uq.createInput(SigmaOpts)

SigmaOpts['Marginals'][0]['Name'] = 'Sigma2E4'
SigmaDistE4 = uq.createInput(SigmaOpts)

SigmaOpts['Marginals'][0]['Name'] = 'Sigma2E5'
SigmaDistE5 = uq.createInput(SigmaOpts)
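The five single-marginal variance priors above could also be built in a loop; a sketch (each option dict would then be passed to uq.createInput, as in the post):

```python
# Sketch: build the five variance-prior option dicts in a loop
SigmaOptsList = [
    {
        'Marginals': [
            {'Name': f'Sigma2E{i}', 'Type': 'Uniform', 'Parameters': [0, 50]}
        ]
    }
    for i in range(1, 6)
]
# SigmaDists = [uq.createInput(opts) for opts in SigmaOptsList]
```

This avoids mutating a single shared SigmaOpts dict between calls.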

DiscrepancyOpts = [
    {
        'Type': 'Gaussian',
        'Prior': SigmaDistE1['Name']
    },
    {
        'Type': 'Gaussian',
        'Prior': SigmaDistE2['Name']
    },
    {
        'Type': 'Gaussian',
        'Prior': SigmaDistE3['Name']
    },
    {
        'Type': 'Gaussian',
        'Prior': SigmaDistE4['Name']
    },
    {
        'Type': 'Gaussian',
        'Prior': SigmaDistE5['Name']
    } 
    ]

By the way, does this work? Or should I define a discrepancy model for each output of each model?

Finally, here are the options I set for the inversion:

Solver = {
    'Type': 'MCMC',
    'MCMC': {
        'Sampler': 'AIES',
        'Steps': 300,
        'NChains': 100
    }
}

BayesOpts = {
    "Type": "Inversion",
    "Name": "TC1_1",
    "Prior": myPriorDist['Name'],
    "ForwardModel": forwardModels,
    "Data": myData,
    "Discrepancy": DiscrepancyOpts,
    "Solver": Solver
}

I hope this is detailed enough.
Maybe I got the concept of MOMap wrong… The “Multiple forward models” example only covers scalar outputs (with multiple measurements), whereas here I have several models with multidimensional outputs (despite a single measurement each), and this case is not explicitly treated in the examples.

Thanks in advance for your help in fixing this problem.
I can bring more details on my problem if needed.

Best regards,
Guillaume Gru
PhD student at ITES, Strasbourg University, France

Hi @GuillaumeGru

Thanks a lot for the detailed problem description. Would you be able to share a fully executable example as well? Dummy data or a reduced/sanitized dataset is totally fine if the original data is large or proprietary. Anything that reproduces the error is sufficient.

Best,
Styfen


Hi @styfen.schaer,

Thank you so much for your quick reply!

I have produced a reduced version of the code that reproduces the error described above.
I zipped it and uploaded it with this reply.
Inv_TC1_1_for_dbg.zip (9.7 KB)

You will find the reduced example in the jupyter notebook “Inv_TC1_1_for_dbg.ipynb”

My PCE models are replaced by dummy models returning the constant vectors (1,1,…,1) through (5,5,…,5), which have the same size as the original model output vectors; the original data are replaced by these constant vectors as well.

Furthermore, in the meantime I thought of a possible fix for my problem and tried transposing the MOMap array in order to get a structure that looks like:

[ [1, 1, …, 1],
[1, 2, …, 66] ]

Rather than: [[1,1], [1,2], …, [1,66]]
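In code, the transposition amounts to the following (a minimal sketch, assuming the MOMap for model 1 built as earlier in this thread):

```python
import numpy as np

Nout = 66
# MOMap as originally built: one [model, output] pair per row
MOmap1 = np.column_stack((np.ones(Nout, dtype=int),
                          np.arange(1, Nout + 1)))

# Transposed layout: row 0 = model indices, row 1 = output indices
MOMap_transposed = MOmap1.T.tolist()
# → [[1, 1, ..., 1], [1, 2, ..., 66]]  (2 rows of length 66)
```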

This was inspired by this post (Bayesian inference with multiple forward models, each one with different vector-valued outputs), which is about UQLab (Matlab), but its MOMap seems to be a transposed version of mine (which I had defined based on the UQ[py]Lab manual: “Nested list of Nout,g lists of length 2 and type integers”).

This is implemented in the jupyter Notebook “Inv_TC1_1_for_dbg_transpose.ipynb”.

With this version, I no longer get the error “MOMap of data group 1 is not consistent with supplied data”, and the analysis runs. Could there be an error in the UQ[py]Lab manual?

However, with this version I get another error after some intermediate computations: “Error: Index in position 2 exceeds array bounds. Index must not exceed 1.”

This last error might be due to inconsistencies in the dummy functions and data that I quickly defined for debugging purposes.

I will try running this solution on my original problem and will come back to you if it works.

Thanks for having a look at my problem! :slight_smile:

Best,
Guillaume Gru

Hi @GuillaumeGru,

Thanks a lot for the example, that was very helpful!

I agree that the shape information from MOMap in the manual seems wrong. I’ll discuss this internally, and we’ll fix it if it turns out to be incorrect.

Using your “transposed” code, I can also reproduce the new error. It stems from the fact that your model implementation is not vectorized, while the default configuration assumes the model is vectorized. You can fix this by updating your model code as follows:

def mod_1(X):
    # UQ[py]Lab may pass a single sample as a 1D array, but the
    # output must always be 2D: one row per sample
    if X.ndim == 1:
        return np.ones((1, Nout))
    elif X.ndim == 2:
        return np.ones((X.shape[0], Nout))
    raise ValueError(f"Unexpected shape {X.shape}")

Note that UQLab sometimes evaluates the function with a single sample, in which case X is 1D, but the output still needs to be 2D. That was unexpected to me as well, and it may not be intended behavior.
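An equivalent, slightly more compact version of the fix uses np.atleast_2d to promote a single 1D sample to a one-row batch (just a sketch, with the same dummy constant-output model):

```python
import numpy as np

Nout = 66  # length of the model output vector

def mod_1(X):
    X = np.atleast_2d(X)                # a 1D sample becomes shape (1, M)
    return np.ones((X.shape[0], Nout))  # one output row per input sample
```

With this, a single 1D sample and a 2D batch both produce a 2D output of shape (n_samples, Nout).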

Could you try the fix above and let me know whether your analysis runs now? Unfortunately, there seems to be another issue when using the display functionality after the analysis; I will look into that later.

Thanks again for reporting this and helping with the debugging!

Best,
Styfen

Hi @styfen.schaer,

Thanks for your reply!

Indeed, your modification to the model function does the trick.
On my side, after this small modification, the analysis runs smoothly and there doesn’t seem to be a problem with displaying the results: here is the jupyter notebook.
Inv_TC1_1_for_dbg_transpose_2.zip (5.0 MB)
(The jupyter notebook takes quite a while to load because of the large number of parameters, but it seems to work fine!)

On the other hand, the Bayesian inference has also worked on my actual problem.
I lost the results because the data and graphs overloaded the RAM of my computer.
However, I could retrieve the mean, standard deviation, and quantiles of the parameter estimates, so I am working with those while the Bayesian inversion re-runs (next time I’ll dump the MCMC results to a file and do the post-processing with my own routines in another program :slight_smile: ).
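Dumping the raw MCMC sample to disk for later post-processing could look like the following sketch (posterior_samples here is a hypothetical stand-in for the sample array extracted from the analysis object; the exact attribute path depends on the UQ[py]Lab version):

```python
import numpy as np

# Hypothetical placeholder for the extracted MCMC sample array,
# e.g. of shape (steps, chains, parameters) = (300, 100, 18)
posterior_samples = np.zeros((300, 100, 18))

np.save('mcmc_samples.npy', posterior_samples)  # persist to disk
loaded = np.load('mcmc_samples.npy')            # reload for post-processing
```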

All in all, the problem seemed to be the transposition of the MOMap vector.

Thanks a lot for your help!
Best,
Guillaume Gru
PhD student at ITES, Strasbourg University, France


That’s great to hear, thanks for sharing!