Divergent samples with third-party software

First of all, I would like to thank the team for UQLab. It is a very efficient and useful tool.

I am working with UQLab for geotechnics, and I use UQLink with ZSoil, which is a finite element software. I mostly try to assess rare events, so I use the reliability module a lot, with computationally heavy models.
My current project is based on an article by D. Straub and I. Papaioannou: “Reliability updating in geotechnical engineering including spatial variability of soil” (2012).
My model has three inputs, and for some input combinations ZSoil returns no usable result because the calculation does not converge (equilibrium is not satisfied and the excavation is not stable): the software returns an irrelevant output. When I use UQLink to quantify uncertainty, I have to identify the runs that diverged. To do this, my parser (which returns the output, the displacement of some nodes) assigns an arbitrary displacement of 200 mm, which is very large here, to signal that the calculation diverged and to count it as a failure. To give some figures, for 700 samples of my input, my model gives back 15 divergent points (you can see the PDF in the picture).

This creates a discontinuity in my output, so I think the way I treat those divergent cases is not the best. I would like to build a PCE in order to do a reliability analysis and also to solve a Bayesian inverse problem. I am not sure that a PCE calculated from a sample that includes divergent points is accurate, and I think this discontinuity in the output can lead to imprecise results.

So I guess my question is: has anybody asked this before, and what would you advise as the best way to treat these diverged values?

Thanks a lot for your time and your help,

Marc Groslambert
Geomod SA

Hi @M_Groslambert,

Welcome to UQWorld! :slight_smile: Yes, you are right, the PCE might not fit very well if you force it to approximate some arbitrary huge value at the failed points.

Have you tried to simply omit the points that do not converge from the PCE fitting? How did you choose your experimental design (the 700 samples of your input)?

Maybe it is also worth investigating why the simulations do not converge in those 15 cases. Does it happen in the middle of your input space or at the boundaries? Maybe you should make your input space a bit smaller to make sure you stay in the physically meaningful region?
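To check whether the failed runs sit in the middle or at the edge of the input space, one option is to look at the empirical rank of each failed sample, input by input. The following is only a sketch in Python with NumPy, not UQLab code: `X` and `converged` are placeholder data standing in for the real experimental design and the real convergence flags, and the rule that makes the first input cause the failures is an assumption made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder experimental design: 700 samples of 3 inputs (assumed data).
X = rng.uniform(size=(700, 3))
# Placeholder convergence flags: here, runs fail when input 0 is very large.
converged = X[:, 0] < 0.98

# Empirical rank of every sample, per input (0 = smallest, 1 = largest).
ranks = X.argsort(axis=0).argsort(axis=0) / (len(X) - 1)

# Ranks of the failed runs only: values near 0 or 1 indicate a boundary.
failed_ranks = ranks[~converged]
for j in range(X.shape[1]):
    lo, hi = failed_ranks[:, j].min(), failed_ranks[:, j].max()
    print(f"input {j}: failed runs span ranks {lo:.2f} to {hi:.2f}")
```

If the failed runs occupy a narrow band of ranks near 0 or 1 for one input and are spread out for the others, that input is the likely cause and the failures lie at the boundary of its range.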

By the way, a safer way to signal failed computations would be to assign NaNs (instead of 200mm) in the parser when a simulation did not converge.
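As a rough illustration of the NaN idea, here is a minimal sketch in Python (not the UQLink parser itself, which would be written in MATLAB). The function `parse_displacement` and its arguments `raw_value` and `converged` are hypothetical; how the convergence status is actually read from the ZSoil output depends on the real parser.

```python
import math

def parse_displacement(raw_value, converged):
    """Return the displacement in mm, or NaN if the run did not converge.

    Both arguments are hypothetical: in practice they would be extracted
    from the solver's output files.
    """
    if not converged:
        return math.nan  # unambiguous "no result" marker, unlike 200 mm
    return float(raw_value)

# Three made-up runs: (value read from the output file, convergence flag).
runs = [(12.3, True), (0.0, False), (8.1, True)]
outputs = [parse_displacement(value, ok) for value, ok in runs]

# NaN can never be confused with a real displacement, so failed runs
# are easy to count or to filter out afterwards.
n_failed = sum(math.isnan(y) for y in outputs)
valid_outputs = [y for y in outputs if not math.isnan(y)]
```

The advantage over a sentinel such as 200 mm is that a NaN cannot silently enter a mean, a fit, or a histogram as if it were a physical value.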

I hope this helps, let us know how you proceed! No one else has asked this question before, so future readers will be interested in how you decide to deal with the problem.

Thanks a lot for your answer, it is very helpful to talk about this.
The samples which do not converge are probably the ones of most interest for the reliability analysis. I am afraid that they are part of the failure domain, so assigning NaNs (or omitting them) will distort the PCE’s prediction in that domain. Do you agree?

Indeed, it is mainly one input which causes divergence, and I followed your advice by narrowing the range of this input’s distribution.

Thanks again for your time,

Marc Groslambert

Hi @M_Groslambert,

Ah, of course, if the samples that don’t converge are part of the domain of failure, you cannot simply omit them. Are these the only samples in the domain of failure, or do some converged ones also lie in that domain?

If you omit them, the PCE prediction will simply not have any information at these points.

The problem with the approach that you described in your first post is that polynomials vary smoothly, and the smoother the function they are trying to approximate, the better the fit. If you assign some arbitrary huge value (200 mm) to the divergent simulations, and they are surrounded by converged simulations with small values, the PCE tries to get as close as possible to all these values, including the huge one. This means it tries to make a “spike”, which polynomials are not at all suited for and which requires high orders. Of course, this is a very hand-waving explanation, but you are making your function very non-smooth by assigning those huge values, and this will result in a bad PCE fit. For reliability analysis, however, you want a very good fit, especially around the domain of failure.
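The effect can be seen on a toy example. The sketch below is Python with NumPy, not UQLab, and it uses a plain least-squares fit in a Legendre basis as a stand-in for a one-dimensional PCE. The response function, the degree, and the position of the spike are all arbitrary assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import legendre

# Toy one-dimensional "model": a smooth displacement in mm (assumed).
x = np.linspace(-1.0, 1.0, 41)
y_smooth = 10.0 * np.exp(x)

# Same data, but one run in the interior is replaced by the 200 mm flag.
k = 30
y_spiked = y_smooth.copy()
y_spiked[k] = 200.0

# Least-squares fit in a Legendre basis, degree 5, for both data sets.
deg = 5
coef_clean = legendre.legfit(x, y_smooth, deg)
coef_spiked = legendre.legfit(x, y_spiked, deg)

# Compare both fits with the true response at the "converged" points only.
others = np.arange(x.size) != k
err_clean = np.max(np.abs(legendre.legval(x[others], coef_clean) - y_smooth[others]))
err_spiked = np.max(np.abs(legendre.legval(x[others], coef_spiked) - y_smooth[others]))

print(f"max error without the spike: {err_clean:.2e} mm")
print(f"max error with the spike:    {err_spiked:.2e} mm")
```

A single flagged point degrades the fit at the neighbouring converged points as well, because the polynomial has to bend globally to approach the spike.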

Of course, changing the input distribution changes your problem, and if the divergent simulations are really part of the domain of failure, the answer you get will be different. You should only do this if it makes sense according to the question you want to answer.

Actually, I am wondering whether you can trust your ZSoil model around these samples that do not converge. As you probably know better than me, engineering models are only precise for a certain range of inputs, and certain approximations do not hold outside of this range, or the models are not calibrated well far away from typical conditions. This might be some more motivation to think carefully about how the input space is chosen.

Indeed, some converged samples are part of the domain of failure, and if I push the inputs too far, the simulations diverge.
I understand what you mean about the smooth function. I tried building a PCK (polynomial-chaos Kriging) metamodel, and it seems to give better results than the PCE on this problem. It might be a solution if I assign a coherent value to the divergent samples (for example the maximum of the convergent samples, to avoid a discontinuity in my output) instead of 200 mm (which creates a discontinuous output).
As you say, changing the input also changes the problem, but it helped me better understand the behavior of those tools, and I think most of the cases I want to study won’t have divergent samples.
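The idea of capping divergent samples at the largest converged value could look roughly like this. It is a sketch in Python with NumPy rather than UQLab code, the displacement values are made up, and it assumes the diverged runs have been marked with NaN beforehand.

```python
import numpy as np

# Made-up displacements in mm; NaN marks a run that diverged (assumption).
y = np.array([12.0, 35.5, np.nan, 48.2, np.nan, 27.9])

# Largest displacement among the converged runs.
y_max = np.nanmax(y)

# Replace every diverged run by that maximum to avoid an artificial jump.
y_capped = np.where(np.isnan(y), y_max, y)

print(y_capped)
```

Note that this only counts the diverged runs as failures if the failure threshold on the displacement is below `y_max`.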

Thanks for the relevant and detailed answer!

Marc Groslambert

That sounds like a good solution! You are welcome :slight_smile: