Hello everyone!

I have a problem characterized by 4 random variables (model inputs) and 80 observed features (model outputs). I would like to figure out which input has a higher chance of being correctly identified in a Bayesian inverse problem, in other words, which input has the greatest effect on the outputs, so that an observation of these outputs can be effectively used to identify the inputs. I have calculated the partial variances and the Sobol indices, but I have difficulties interpreting the results, given the high number of observed features. Any suggestions on how to proceed in cases like this?

Thanks a lot in advance

Best regards

PM

Hello @Pachita

I have some questions for you. Could you also post the code and maybe the results you are talking about? Are you using UQLab for this problem? In what sense can you not interpret the results?

As a first suggestion, since you have a high number of outputs, I would identify which of the 80 outputs are the ones you are actually interested in. Are these 80 outputs data points in time, or are they 80 different features of your model?

Hi! Thanks a lot for the quick reply and the interest in the topic.

Currently, for this specific application, I am not using UQLab, although I have used it in the past.

Let me describe the problem better: the inputs are 4 material mechanical properties, while the outputs are vibration dynamic parameters (mode shapes and frequencies). I want to identify the inputs from output observations by applying Bayesian Updating. Sometimes I have difficulty identifying some input parameters, and my explanation is that the outputs are “less sensitive” to these inputs. I have computed the Sobol indices of the inputs with respect to each output. I would like to draw a general conclusion: to which input are the outputs overall most sensitive?

I hope that this explanation can help.

Thanks a lot

PM

Hi @Pachita!

I tried to find a solution, and I wrote down some thoughts that I hope may be helpful. However, something in this problem suggests to me that the assumption that the sensitivity indices can support the Bayesian framework requires further investigation.

I am not familiar with the Bayesian Updating technique. From what I understand, this technique evaluates the probability of a hypothesis (your input) when new evidence (your output) comes to light. You assume that the sensitivity indices of your inputs can help you understand how to better identify your inputs, given your outputs. Am I right?
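To make sure we mean the same thing by "updating the probability of the hypothesis given the evidence", here is a minimal 1D sketch of grid-based Bayesian updating. Everything in it is hypothetical (the forward model, the observation, and the noise level are made up for illustration):

```python
import numpy as np

# Hypothetical 1D illustration of Bayesian updating on a grid:
# a prior belief about an input theta is updated with one observed output.
theta = np.linspace(0.0, 10.0, 1001)      # candidate input values
prior = np.ones_like(theta) / theta.size  # flat prior over the grid

def forward_model(theta):
    # placeholder forward model: output = 2 * input
    return 2.0 * theta

y_obs, sigma = 8.0, 0.5  # made-up observation and measurement noise

# Gaussian likelihood of the evidence given each candidate input
likelihood = np.exp(-0.5 * ((forward_model(theta) - y_obs) / sigma) ** 2)

posterior = prior * likelihood
posterior /= posterior.sum()              # normalize to a probability mass

theta_map = theta[np.argmax(posterior)]   # most probable input value (4.0 here)
```

The point of the sketch: if the forward model barely reacts to an input (low sensitivity), the likelihood is nearly flat in that direction and the posterior stays close to the prior, which is exactly the identifiability problem you describe.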

Let me recap what the indices can communicate. The Sobol sensitivity indices tell you how much influence an input has on the computed output when the input varies within a given probabilistic framework (e.g., uniformly or normally distributed). So if an input variable (say variable a) has a high first-order Sobol index (S_a^{(1)}\approx1), it means that the output y is highly sensitive to the variation of a. If, on the other hand, S_a^{(1)}=0, then the variation of a does not lead to any variation of the output y. Furthermore, the total-order Sobol index (S_a^{(\mathrm{T})}) also tells you about the level of interaction of the random variable a with the other variables in producing the variation of the output y.
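In case a concrete estimator helps, here is a small numpy-only sketch of the classical pick-freeze Monte Carlo estimates of S_i^{(1)} (Saltelli 2010 estimator) and S_i^{(\mathrm{T})} (Jansen estimator), for inputs assumed uniform on [0, 1]^d. The toy model and sample size are my own choices for illustration:

```python
import numpy as np

def sobol_indices(model, d, n=200_000, seed=0):
    """Pick-freeze Monte Carlo estimates of first-order (Saltelli 2010)
    and total-order (Jansen) Sobol indices, inputs uniform on [0, 1]^d."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    yA, yB = model(A), model(B)
    var = np.var(np.concatenate([yA, yB]))
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]          # column i of A replaced by B's column i
        yABi = model(ABi)
        S1[i] = np.mean(yB * (yABi - yA)) / var
        ST[i] = 0.5 * np.mean((yA - yABi) ** 2) / var
    return S1, ST

# toy additive model y = 4*x1 + x2: analytically S1 = 16/17, S2 = 1/17,
# and the total indices coincide with the first-order ones (no interactions)
S1, ST = sobol_indices(lambda x: 4.0 * x[:, 0] + x[:, 1], d=2)
```

For your actual problem you would of course replace the toy model with your vibration model (or a surrogate of it) and the uniform sampling with your input distributions.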

In your problem, I would check which of the 4 variables have S_i^{(\mathrm{T})}\approx0 and remove them from the Bayesian Updating. However, although this is strictly connected to the nature of the model, I doubt that any of the variables satisfies this condition. This is especially likely since the model has only 4 random variables, and also because the sensitivity depends strongly on the probabilistic distribution of the random variables.
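As a sketch of that screening step, assuming you have already stored the total indices as a 4×80 array (one row per input, one column per output; the data and the threshold here are entirely made up):

```python
import numpy as np

# Hypothetical 4 inputs x 80 outputs matrix of total-order Sobol indices.
rng = np.random.default_rng(1)
ST = rng.uniform(0.1, 0.9, size=(4, 80))
ST[2, :] = rng.uniform(0.0, 0.02, size=80)  # pretend input 3 is inert

threshold = 0.05
# An input is a Factor Fixing candidate only if its total index is
# negligible for *every* one of the 80 outputs.
fixable = ST.max(axis=1) < threshold
print(np.flatnonzero(fixable))              # prints [2]
```

Taking the maximum over the outputs is deliberately conservative: an input that matters for even a single observed feature should not be fixed.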

Also, since you have so many outputs (80), I would find a way to group or reduce them somehow. This always helps reduce computational costs.
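One common way to condense the 80 per-output indices into a single score per input is a variance-weighted aggregate (in the spirit of the generalized Sobol indices for multi-output models): each output's index counts in proportion to that output's variance. A minimal sketch with made-up data, assuming you already have the per-output first-order indices and output variances:

```python
import numpy as np

# Hypothetical per-output results: first-order Sobol indices S (4 x 80,
# each column sums to 1 here) and the variance of each of the 80 outputs.
rng = np.random.default_rng(2)
S = rng.dirichlet(np.ones(4), size=80).T   # shape (4, 80)
out_var = rng.uniform(0.5, 5.0, size=80)

# Variance-weighted aggregate index per input: outputs with larger
# variance weigh more in the overall ranking.
S_agg = (S * out_var).sum(axis=1) / out_var.sum()

ranking = np.argsort(S_agg)[::-1]          # most influential input first
```

This gives a single, interpretable ranking of the 4 inputs across all 80 outputs, which seems to be exactly the "overall sensitivity" question asked above.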

What troubles me in this work is that, when S_a^{(\mathrm{T})}\rightarrow0, with S_a^{(\mathrm{T})}\neq0, the variable a can be assumed (following the Factor Fixing setting) to be a model constant, since its contribution is **negligible**, and **not absent**. However, since its contribution is not exactly 0, there is a value (or a set of values) of a that produces a given output y^*. Therefore, the probability of the hypothesis a given the evidence y^* is neither zero nor approaching zero. In conclusion, I do not know how the sensitivity indices can help in the Bayesian framework. Note also that I am not an expert in Bayesian Updating, so I may be missing some important information. I would love to hear your opinion about this.

I hope it is not too long to read; this topic caught me up!

Best

Gian