UQWorld

Structural reliability and building collapse in UQLab

Hello UQ community!

A nerdy UQ topic on the reliability of a structure subject to collapse: I have successfully built a PCE metamodel of a complex finite element model in Abaqus that incrementally studies the collapse of a building using a nonlinear static Riks solver. For the time being, I keep the dimensionality low (3 input variables, all others constant), so I get an acceptable LOO error with not too many runs using LARS.

My knowledge starts to break down when it comes to the model outputs: if I simply study a single value (say, the deflection of one node), then everything is fine; I can get the mean, variance, run Monte Carlo to get the PDF, etc. But what happens if I want to study many outputs, such as all the deflections in one set of nodes plus all the rotations in a different set of nodes? I haven’t figured out how to script my model output to achieve this.

To make things more complex, I am building a post-processing script that uses these outputs to determine the collapse scenarios and their respective loss of building area. So the outputs here (collapse scenarios) become discrete. If, for instance, I ran my full Abaqus model a million times (impossible time-wise), the outputs could look like this:

  • 700,000 model runs (P = 0.7) without collapse, failure area = 0 sqm
  • 250,000 model runs (P = 0.25) with collapse scenario 1, failure area = 50 sqm
  • 40,000 model runs (P = 0.04) with collapse scenario 2, failure area = 23 sqm
  • 10,000 model runs (P = 0.01) with collapse scenario 3, failure area = 140 sqm

Is there any way to achieve such an output by using a PCE metamodel and running Monte Carlo on it? And if yes, how will the metamodel be able to predict a collapse scenario that was not identified in the experimental sample? If not, could you advise on alternatives to get as close as possible to the above?

I understand this is a number of questions on PCE. Any help on any of them would be truly, greatly appreciated. Thank you in advance!

Konstantinos

Hi @voulpiotis,

Thanks for creating this discussion.
If I understand correctly, there are two goals:

  1. Creating surrogate models for a large number of outputs. In other words, your output (as a vector \boldsymbol{y}) has high dimensionality \boldsymbol{y} = (y_1,y_2,\ldots,y_M) (M is the number of outputs that you consider)
  2. Using the output vector to determine the collapse scenario (post-processing \boldsymbol{y}). Possibly, create a surrogate that directly predicts the collapse scenario without going through the first step.

I suppose the whole process is deterministic: meaning that a given set of input parameters can uniquely determine the output, and the output can provide the corresponding collapse scenario.

For the first goal, you can create a PCE for each component of the output, if each component itself is of interest. To this end, you do not need a loop over each scalar y_i. Instead, you can organize your model response as a matrix \boldsymbol{R} of size N \times M, where N denotes the number of model runs. Hence, R_{i,j} corresponds to the result of the j^{th} component of \boldsymbol{Y} at the i^{th} run. Then, you can write the following command in UQLab:

MetaOptsMean.ExpDesign.Y = R;

UQLab will build a PCE for each component of \boldsymbol{Y}. You can also have a look at the example script uq_Example_PCE_04_MultipleOutputs.m.
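A minimal sketch of this setup (variable names like X, R and Xnew are illustrative; they are assumed to hold the N x 3 input samples and the N x M response matrix from your runs):

    % Assumed available: UQLab initialized, input object created,
    % X (N x 3) input samples, R (N x M) response matrix
    MetaOpts.Type = 'Metamodel';
    MetaOpts.MetaType = 'PCE';
    MetaOpts.Method = 'LARS';          % sparse PCE via least-angle regression
    MetaOpts.Degree = 1:10;            % degree-adaptive selection
    MetaOpts.ExpDesign.X = X;
    MetaOpts.ExpDesign.Y = R;          % one PCE is built per column of R
    myPCE = uq_createModel(MetaOpts);

    % Evaluating the surrogate returns all M outputs at once:
    Ypred = uq_evalModel(myPCE, Xnew); % size: size(Xnew,1) x M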

If some patterns in the output are helpful (or the dimensionality of the output vector is extremely high), you can post-process the output by projecting it onto a suitable basis and then build a PCE for the projection coefficients. For example, in structural dynamics one can use the vibration modes to represent the structural displacements, so the projection of the model response reduces to a few modal coefficients. If no physics can help, you can also use methods such as principal component analysis to compress the output.
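When no physical basis is available, a PCA-style compression could be sketched as follows (illustrative only; the truncation K is problem-dependent and should be checked against the reconstruction error):

    % R: N x M matrix of model responses (one run per row)
    Rmean = mean(R, 1);
    Rc = R - Rmean;                    % center the outputs
    [~, ~, V] = svd(Rc, 'econ');       % principal directions in the columns of V

    K = 5;                             % keep the first K components (to be tuned)
    A = Rc * V(:, 1:K);                % N x K matrix of component scores

    % Build the PCE on the K scores instead of the M raw outputs:
    MetaOpts.ExpDesign.Y = A;
    % ... create the PCE as before, then reconstruct predictions via
    % Ypred = Apred * V(:, 1:K)' + Rmean;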

For the second goal, you can directly use the output vectors represented by a series of PCEs to run Monte Carlo simulations. However, may I ask how you classify an output vector into a collapse scenario (the collapse area seems to be continuous rather than discrete)? If this is determined by checking a few continuous values that are functions of the output vectors, I would suggest building a surrogate directly on these values instead of emulating the nodal displacements as an intermediate stage. Alternatively, you can also use support vector machines or multinomial logistic regression to directly build a surrogate model for classification. Support vector machines are available in UQLab for binary classification (so you can use a one-vs-all strategy).

@xujia thank you very much for your detailed answer and introduction to many new exciting aspects of metamodelling! I will try to answer as accurately as I can despite my limited knowledge:

You understood my goals well. The first one (output matrix instead of vector) seems easy to implement, so I will attempt this in UQLab.

If some patterns in the output are helpful (or the dimensionality of the output vector is extremely high), you can post-process the output to calculate the associated values (on a basis) and then build a PCE for these quantities.

By this, I suppose you mean that I should reduce the output dimensionality as much as I can (for computational efficiency?). I analyse a tall building in vertical collapse, so from the stupidly large deformation vector I obtain in the FE model I could narrow down to a few hundred outputs (say, the z-axis deformations at the beam midspans of each bay of the building). Do you think this is too much? I plan to analyse multiple damage scenarios and multiple buildings, so I am interested in the fastest surrogate construction possible.

However, my ultimate goal is the second point, so yes, ideally I would create a surrogate that directly predicts the collapse scenario(s). I have thought of one way to classify deformations into collapse scenarios: a 3D matrix A_{ijk} representing the 3D geometry, where i = x-bay column, j = y-bay column and k = storey number. So A_{111} would be the ground floor column at grid position (x1,y1), A_{121} would be the ground floor column at grid position (x1,y2), and so on. A value of zero means there is no collapse there (determined by the z-axis deformation amount), while a value of one means there is collapse. In a simplified, coarse way, I end up with matrices of zeros and ones, each unique one corresponding to a different collapse scenario. The area of collapse is simply calculated by multiplying this matrix by the bay size and summing all elements.
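In code, this area computation is a one-liner; a sketch with illustrative numbers (a 4 x 4-bay, 10-storey grid and 6 m x 6 m bays):

    bayArea = 6 * 6;                   % sqm per bay (illustrative)
    A = zeros(4, 4, 10);               % x-bays x y-bays x storeys
    A(1, 1, 1) = 1;                    % e.g. two ground-floor bays collapsed
    A(1, 2, 1) = 1;

    failureArea = sum(A(:)) * bayArea; % = 72 sqm for this example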

So yes, there can be a direct relationship between the output deformations matrix (a selected few hundred values) and the unique collapse mechanisms. The problem, however, is that the raw output (the deformations matrix) will be highly nonlinear: imagine following the deformation of the tip of a cantilever for different (probabilistic) load and material strength inputs. While the cantilever doesn’t yield or break, the deflection is a nice continuous variable. But if, for certain combinations of the inputs, the cantilever yields or breaks, I get a sudden step change in the output. The same applies to my tall building, only far more complex: minute fluctuations of the inputs may give rise to wildly different collapse scenarios.

For the second goal, you can directly use the output vectors represented by a series of PCEs to run Monte Carlo simulations. However, may I ask how do you classify an output vector to a collapse scenario (and the collapse area seems to be continuous rather than discrete)? If this is determined by checking a few continuous values as functions of the output vectors, I would suggest directly build a surrogate on these values instead of emulating the nodal displacements as an intermediate stage.

So while I understand you here, I get stuck on the fact that I am indeed checking a few values, but they are not exactly continuous. I looked into the new things you mentioned, and multinomial logistic regression seems the most appropriate, although my knowledge beyond simple statistics programs is very limited. Do I understand correctly that I need many model runs to do such a regression (or to train a program to do it)? Unfortunately, my model is expensive (on the order of minutes per run).

I hope the detailed info clarifies things a bit more. Do you have any further advice? Should I be looking at the SVMs, since they are implemented in UQLab? (The binary result would be {type x collapse; no collapse}; I suppose I can find resources for a one-vs-all procedure in Matlab or Python.)

Thank you again,

Konstantinos

Hi @voulpiotis,

Thanks for the clarification. I would like to make some further remarks based on your explanation.

If I understand correctly, the discontinuity appears in your model output (the deformation matrix). This implies that a direct surrogate may not give very accurate results. For discontinuous outputs (but continuous within disjoint regions), @moustapha has much more experience, and I think he can give valuable suggestions.

Actually, you have many outputs but only a few input variables, so for each single-component prediction you do not need a lot of model runs. A potential improvement may exist if there is an underlying structure in the output, e.g. \boldsymbol{y} = \sum_{k=1}^{K} a_k \boldsymbol{y}_k, where the \boldsymbol{y}_k's are vectors of the same dimension as \boldsymbol{y} (or modes in structural dynamics). If K \ll M, building surrogates for the a_k's not only reduces the number of outputs but also preserves the output structure.

In fact, the one-vs-all strategy consists of several binary classifiers: {no collapse; collapse}, {type 1 collapse; no collapse or another type of collapse}, {type 2 collapse; no collapse or another type of collapse}, etc.
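One such binary classifier could be set up in UQLab roughly like this (a sketch; the labels must be coded as -1/+1, e.g. +1 = type 1 collapse, -1 = no collapse or any other type, and X, yClass are assumed names for your experimental design):

    % X: N x 3 input samples; yClass: N x 1 labels in {-1, +1}
    SVCOpts.Type = 'Metamodel';
    SVCOpts.MetaType = 'SVC';          % support vector classification
    SVCOpts.ExpDesign.X = X;
    SVCOpts.ExpDesign.Y = yClass;
    mySVC = uq_createModel(SVCOpts);

    % Predicted class (-1 or +1) for new input points:
    classPred = uq_evalModel(mySVC, Xnew);

Repeating this for each scenario (one classifier per class) gives the one-vs-all scheme.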

Thank you @xujia! One clarification question here:

Do you mean that using the intermediate step of the multiple PCEs for the outputs is able to work? You have now spiked my curiosity. I hope @moustapha wants to play this UQ game!

Best,

Konstantinos

I just read this paper from @moustapha and it looks very much like what I need, with the difference that once I identify the class (= collapse scenario), I don’t really care about the accuracy of the output within that class, but only about the accuracy of predicting the occurrence of the class. I can start with a simple problem that is in fact binary: the most common collapse scenario vs no collapse. Then I could look into one-vs-all, I guess.

Dear @voulpiotis,

Yes, that’s the paper I referred to. By the way, I think you can try three approaches and pick the best one: 1. building surrogates for the many outputs (which are continuous with respect to the input) and then post-processing the results to get the collapse type; 2. applying the method introduced in @moustapha’s paper; 3. using SVM for classification.

Hi @voulpiotis and @xujia,

Sorry to jump in so late in the discussion :sweat_smile: I see from your exchange that by now you have pretty much figured it out.

I just read this paper from @moustapha and it looks very much like what I need, with the difference that once I identify the class (=collapse scenario), I don’t really care about the accuracy of the output

If you are only interested in knowing the state of the system (collapse/no collapse), then I would say doing just classification is enough. As Xujia suggested, you can use the support vector classifier implemented in UQLab. In your experimental design, for a given input the output would simply read -1 or 1, and you can then do binary classification. Now, if you have multiple failure scenarios and would like to discriminate among them, you won’t be able to do it directly with UQLab, as only binary classification is implemented. But again, as Xujia suggested, you can simply set up a one-vs-all strategy and decide the class by majority vote. Alternatively, you can have a look at the classifiers already implemented in Matlab that allow multi-class classification. I would suggest starting with any variant of classification trees.
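A rough sketch of the classification-tree route (assuming Matlab’s Statistics and Machine Learning Toolbox; X, labels, Xnew and myInput are illustrative names — fitctree handles multi-class labels directly, so no manual one-vs-all loop is needed here):

    % X: N x 3 input samples; labels: N x 1 scenario indices (0, 1, 2, ...)
    tree = fitctree(X, labels);          % multi-class classification tree

    % Predicted scenario for new input points:
    scenarioPred = predict(tree, Xnew);

    % Scenario probabilities via Monte Carlo on the (cheap) classifier:
    Xmc = uq_getSample(myInput, 1e6);    % large sample from the input model
    p1 = mean(predict(tree, Xmc) == 1);  % estimated probability of scenario 1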

Regarding our work in the chair mentioned above, it would indeed only be useful if you would like to know more than just the state of the system, e.g., the area of collapse. Then a two-stage approach could be used to first identify the failure scenario and then within the scenario estimate any meaningful QoI.

Please keep us updated. I am curious to see the results you get and also wondering if our two-stage approach could be used with your model :slightly_smiling_face:

Cheers,
Moustapha

Thanks @moustapha! I will try the SVM method to figure out the binary {collapse; no collapse} result in UQLab. I also need to look into Matlab classifiers, as I have no idea how to do this yet. But shouldn’t the {collapse} outputs differ for different collapse scenarios?

Regarding the second step: in a simplified world, I assume that each collapse scenario corresponds to a unique collapse area (deterministic), so there is no variability to study there. The disjoint continuous output is only in the deformations, which will be post-processed to define the collapse scenarios. Do I then understand this correctly: I will need only one step to do the classification using the disjoint continuous output (let’s start with one binary case first), and a second step is not needed?

Konstantinos

@xujia a quick follow-up on your comment about multiple-output PCE: is it possible to run a Sobol’ analysis when you have multiple PCEs? I tried to implement it in UQLab without success. Do I need to do a for loop, extract the indices manually for each PCE, and calculate an average?

Thanks again,

Konstantinos

Hi @voulpiotis,

It depends on how we define “multiple PCEs”.

  1. If we have a single PCE object with multiple outputs, myPCE, created by emulating a physical model whose output is a vector (as we discussed before), we can use UQLab directly with something like

    sensOpts.Type = 'Sensitivity';
    sensOpts.Method = 'Sobol';
    sensOpts.Model = myPCE;
    mySens = uq_createAnalysis(sensOpts);
    
  2. If we have multiple PCE objects, I think a loop is necessary. Importantly, you should specify the model in the sensitivity options (sensOpts.Model = myithPCE) so that UQLab takes the right model to study.
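The loop in point 2 could be sketched like this (assuming, for illustration, that the PCE objects are stored in a cell array myPCEs):

    for k = 1:numel(myPCEs)
        sensOpts = struct();               % fresh options each iteration
        sensOpts.Type = 'Sensitivity';
        sensOpts.Method = 'Sobol';
        sensOpts.Model = myPCEs{k};        % point UQLab to the k-th PCE
        mySens{k} = uq_createAnalysis(sensOpts);
    end
    % The Sobol' indices of the k-th output are then found in mySens{k}.Results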

I hope it is clear.