Restarting a Bayesian Inversion

s_rosier · March 1, 2021, 8:32am

I’m looking to generate PDFs for a set of model parameters by comparing with observations. The forward model runs entirely in matlab and each model evaluation takes 2-3 hours. I’ve been putting together a Bayesian inversion using UQlab following the examples and manuals using the MCMC solver. UQlab calls a wrapper that submits 20 jobs in parallel and waits for all of those to finish to get the results I need to compare to data. This seems to work fine but currently I’ve only tested it on my workstation for 2 parameters (I’m hoping to do 6) and for 30 iterations with NChains = 20. I want to scale this up and was originally hoping to send these jobs to my university cluster via the wrapper but this isn’t working (discussion in a seperate post on these forums). As an alternative, I could run the entire analysis on the cluster rather than the individual forward model evaluations but the problem there is that the maximum time I can run a single job on the cluster is 3 days. My question: Is it possible to ‘restart’ the bayesian inversion that was stopped at e.g. iteration 60, and continue from that point in the solution? I’m sure this must be possible but I haven’t seen anything obvious on how this could be done in practice.

paulremo · March 1, 2021, 9:24am

Hi @s_rosier

In short: yes, this is possible by manually specifying the last points returned by the MCMC sampler as the new starting points for a separate MCMC analysis. You would have to write a routine that does this, but this should be straight-forward.

That being said, I would not necessarily opt for MCMC on the full model in your case. If I understand correctly, a single step of an MCMC algorithm with C=20 chains in your case will take approximately 2 days. To say the least: this is not ideal for MCMC because you might have to run the algorithm for a couple of months before obtaining anything useful.

Instead, I would suggest you give surrogate modeling a try first. The idea here is to train a surrogate model of your expensive forward model, and use this cheap replacement model in the MCMC analysis (have a look at uq_Example_Inversion_08_Surrogate). Alternatively, you could also just opt for a simple maximum-a-posteriori analysis (uq_Example_Inversion_07_MAP).

Hope this helps!

s_rosier · March 1, 2021, 10:22am

Hi @paulremo thanks for the quick reply! Perhaps I’m confused - but with 20 chains run in parallel then each iteration takes as long as a single forward model evaluation, e.g. 2-3 hours. This seems like a manageable run time to me but I would not expect the solver to have converged within 3 days. Indeed my plan is to move to surrogate modelling, and I hope my approach makes sense from an uncertainty quantification stand point which I am still trying to learn. The plan would be to do the analysis described above, starting from uniform PDFs for my model parameters to generate informed PDFs using the model evaluated over the ‘observational period’ (20 years). I would then take these PDFs and run the model forward in time for several 100s of years to train a surrogate model. This surrogate could then be used to produce uncertainty bounds for my output of interest. When planning this, I felt that the first step was fast enough that I could avoid a surrogate model, whereas the second step evaluating for several 100s of years is much slower and so will require a surrogate model.
I’m a bit confused exactly how to go about specifying the previous points in the MCMC sampler as new starting points for a second MCMC analysis, could you please explain this in a bit more detail for me? Thanks again!

paulremo · March 1, 2021, 12:30pm

I would say you should give surrogate modeling a try right away. If a single analysis takes more than 1 hour, there is surely no harm in trying out surrogate modeling techniques before potentially wasting resources.

Regarding parallel MCMC sampling, it depends on the MCMC sampler you use: The current default sampler (AIES) does not support parallel model evaluations, so I suggest you switch to MH which does. In order for your forward model myModel to be evaluated in parallel, you need to make sure that uq_evalModel(myModel, X) runs in parallel.

Finally, for reusing the last points produced by an MCMC sampler, just have a look at lastPoints = myBayesianAnalysis.Results.Sample(end,:,:). This matrix stores the points generated at the last MCMC step by all chains. For your next analysis, you then simply feed those points as a seed to the next analysis

 BayesOpts.Solver.MCMC.Seed = lastPoints

When running the next analysis, the MCMC sampler will start from this provided seed which is equivalent to never having stopped the analysis.

s_rosier · March 1, 2021, 12:35pm

Hi Paul,
Thanks so much that’s exactly what I was after! I have already switched to an alternative sampler because I ran into the limitation you described with the AIES sampler. I will also do as you suggest and start playing around with the surrogate modelling techniques.