Bayesian linear regression

Hi UQWorld!

I’m using the Bayesian linear regression example provided by UQLab to calibrate a macroseismic model for the seismic vulnerability assessment of existing buildings at the urban scale.

CONTEXT
The model I use defines the seismic vulnerability index (Iv) of an existing building as the sum of scores (c_i) assigned to 14 structural features, each multiplied by a weight (p_i) that reflects the importance of that parameter for the dynamic behaviour of the construction during a seismic event.

Seismic vulnerability index:
Iv = ∑ p_i · c_i, where c_i is the score of parameter i and p_i is the weight of parameter i.

The model is calibrated by comparing the expected damage scenario, evaluated through a specific formulation that contains the seismic vulnerability index (Iv), with the real damage scenario of a case study hit by the 2016 Italian earthquake.

Real damage ≈ expected damage = 2.5 + 3 · tanh((I + 6.25 · Iv − 12.7) / Q)
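For concreteness, here is a minimal MATLAB sketch of how this forward model could be written for use with UQLab, assuming the 14 weights are the calibration parameters while the scores c_i and the quantities I and Q are fixed for a given building (the function name and the parameter struct are placeholders of mine):

```matlab
function D = expected_damage(X, P)
% X : N-by-14 matrix, each row is one candidate set of weights p_1..p_14
% P : struct with the fixed quantities of one building:
%     P.c (1-by-14 scores c_i), P.I and P.Q (parameters of the formulation)
Iv = X * P.c.';                                   % vulnerability index, N-by-1
D  = 2.5 + 3*tanh((P.I + 6.25*Iv - 12.7)/P.Q);    % expected damage
end
```

Such a vectorized function could then be wrapped into a UQLab model with uq_createModel (e.g. via ModelOpts.mHandle = @(X) expected_damage(X, P)) and used as the forward model of the inversion.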

PURPOSE OF THE CALIBRATION:
The purpose of the calibration is to adjust the 14 weights used to evaluate the seismic vulnerability of each building in a historical center composed of 67 buildings, so that the resulting theoretical damage scenario is similar to the one detected.

I have tried to calibrate my model with different approaches, such as a genetic algorithm, a pure Monte Carlo approach and optimization tools, but Bayesian linear regression is the one that works best.

QUESTION:
Can I force the Metropolis algorithm to search only positive values that fit the problem and that lie within specific ranges?
Example: posterior point estimates with x1 in [1, 2], x2 in [0.5, 1], …, x14 in [1, 2.5].

Sorry for the long explanation, but I wanted to be precise. I am also happy to send the model I have prepared if someone is interested. Any kind of answer or suggestion is welcome :smile:

I will be highly grateful to you, and thank you, UQLab team, for your work.

Best Regards

Federico Romis


Hi Federico, welcome to UQWorld! :slight_smile:

It sounds to me as if this should be possible through the definition of the prior distribution: if you assign a prior distribution that has nonzero probability only for positive values, then your posterior distribution also has this property. The support (= region of nonzero probability) of the posterior distribution is always a subset of the support of the prior distribution (this follows directly from Bayes’ theorem).

So you could use, for example, a uniform distribution with appropriate bounds, or a Beta, Lognormal, Exponential, Gamma or Weibull distribution; see the Appendix of the Input manual for a list of available distributions. Choose whatever best fits your prior knowledge about the weights.
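Just as an illustration, such a bounded prior could be defined in UQLab along these lines (a rough sketch; the bounds below are placeholders that you would replace by your own ranges for each weight):

```matlab
% Prior whose support is restricted to the plausible range of each weight
bounds = repmat([1 2], 14, 1);                 % placeholder: one [min max] row per weight
PriorOpts.Name = 'Prior of the weights';
for ii = 1:14
    PriorOpts.Marginals(ii).Name = sprintf('p%d', ii);
    PriorOpts.Marginals(ii).Type = 'Uniform';  % or 'Beta', 'Lognormal', ...
    PriorOpts.Marginals(ii).Parameters = bounds(ii,:);
end
myPrior = uq_createInput(PriorOpts);

% The prior is then passed to the Bayesian inversion analysis
BayesOpts.Type  = 'Inversion';
BayesOpts.Prior = myPrior;
```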

I hope I understood your question correctly. Let us know whether it works out!


Thank you a lot, Nora, for the quick answer :blush:

I’ll try your suggestion and give you feedback.

Dear Nora,

I tried different distributions, and in my case the Beta distribution works best in terms of the estimated points: they are positive and in a reasonable range.
Unfortunately, in this way the calibration loses a lot of accuracy, and I have to greatly increase the number of steps in the MCMC analysis.

Do you know of a good balance between the number of chains and the number of steps per chain that would increase the accuracy of the analysis? If I increase everything, the computational cost becomes considerable.

Thank you in advance for your time.

Bye,

Federico

Hi Federico,

I am not sure I understand what you mean by this statement:

What kind of prior distributions were you using before, and which are you using now (a Beta with which parameters and bounds)? Are you sure that the prior you chose actually fits the prior knowledge you have about the weights? If you know nothing more than the bounds of the weights, you should use a uniform distribution (or, equivalently, a Beta distribution with r = s = 1). What do you mean by “accurate calibration”? How can the calibration get worse when you actually put in more specific knowledge about the weights?

Regarding your second question:

This probably depends on the details of your problem. Which MCMC algorithm are you using? @paulremo, do you have a suggestion?

Hi Federico,

I also don’t understand your statement about the calibration accuracy. Do you mean that the posterior distribution you compute in your setup does not change with respect to the prior distribution? If this is the case, and provided the MCMC algorithm has actually converged, it is still possible that your data are non-informative or that your prior is chosen too narrowly.

As diagnostic steps I propose you do the following:

  1. Make sure that the prior distribution actually reflects your prior knowledge about the parameters. To do this, you can visualize it with uq_display(myPrior).
  2. Make sure your discrepancy options are set up correctly. Did you specify them yourself or use the default options? In the latter case, the discrepancy variance might have been chosen too large.
  3. Check whether your MCMC algorithm has converged. This is not a straightforward topic, but you can have a look at Section 1.3.5 of the Bayesian module manual here to get started. Definitely consult the trace plots of your MCMC chains by plotting them after running the MCMC algorithm with uq_display(myBayesianAnalysis,'trace','all'); see the sketch below.
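For reference, a minimal sketch of what steps 1–3 could look like in code, assuming BayesOpts already contains your forward model and data; the Gaussian discrepancy variance, the sampler choice and the chain settings below are placeholders, not recommendations:

```matlab
% (1) Check that the prior actually reflects your knowledge about the weights
uq_display(myPrior);

% (2) Specify the discrepancy options explicitly instead of using the defaults
%     (0.1 is only a placeholder for the discrepancy variance sigma^2)
BayesOpts.Discrepancy.Type = 'Gaussian';
BayesOpts.Discrepancy.Parameters = 0.1;

% (3) Control the MCMC sampler and inspect convergence with trace plots
BayesOpts.Solver.Type = 'MCMC';
BayesOpts.Solver.MCMC.Sampler = 'AIES';   % e.g. affine-invariant ensemble sampler
BayesOpts.Solver.MCMC.NChains = 100;      % number of parallel chains
BayesOpts.Solver.MCMC.Steps   = 500;      % steps per chain

myBayesianAnalysis = uq_createAnalysis(BayesOpts);
uq_display(myBayesianAnalysis, 'trace', 'all');
```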

Let us know how it goes!

Dear Nora,
Dear Paul - Remo,

Thank you very much for your time and your answers.
I took some time to perform the diagnostic tests with different distributions.
I changed the discrepancy options and the marginals of the distributions, checking the convergence of the MCMC algorithm.

It seems that the problem is related to the relationship used to define the expected damage scenario, not to the MCMC algorithm itself. The formulation concentrates the values around the middle classes of my reference scale (EMS-98) and does not reach the boundary classes, even for very low or high Iv inputs. This limits the calibration performed with the data of my case study, so I probably have to modify the calibration model rather than the algorithm.
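As a side note, a quick way to visualize this saturation is to plot the damage formulation over the range of Iv values of your building stock, using the intensity I and factor Q of your case study; the numbers below are placeholders only:

```matlab
% Expected damage as a function of the vulnerability index
I  = 8;                        % placeholder macroseismic intensity
Q  = 3;                        % placeholder value of Q
Iv = linspace(0, 1, 200);      % placeholder range of the vulnerability index
D  = 2.5 + 3*tanh((I + 6.25*Iv - 12.7)/Q);
plot(Iv, D), grid on
xlabel('I_v'), ylabel('Expected damage')
```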

I will continue with other tests and keep you informed about the developments.

Thank you again; your suggestions helped me to better understand how the algorithm works.

Bye,
Federico

Hi Federico,

Great that you were able to find out more about where the problem is. It is a valuable insight that the model cannot fit the data well with any choice of parameters. Hopefully you can find a more suitable model!

We wish you good luck and are looking forward to hearing whether the calibration works out in the end!

Best,
Nora and Paul