Sensitivity Analysis in UQLab

Vishashalu · September 3, 2019, 10:17pm

Hi,
I am using Sobol’ sensitivity analysis for my model. It worked well with my previous, simplified model. With my current model, which is much more complex and has more uncertain inputs, I am getting total Sobol’ indices above 1 (between 1.2 and 1.9), and the first-order Sobol’ indices are all negative. I do not understand what the issue is. Could it be that I have to use the SA methods for dependent inputs? If my understanding of dependent inputs is correct, my uncertain input variables are not dependent, so I am not sure. Any comments on this? If I do need to try SA with dependent inputs, any suggestion as to which method?

damarginal · September 4, 2019, 10:34am

Hi,

Perhaps you could describe your problem more? How much more complex does your model actually become? With how many more inputs? What was the result of the simpler model? Did you model your inputs with a dependence structure, say using a Gaussian copula?

And when it comes to computation: How do you compute the Sobol’ indices? Is it via a Monte Carlo simulation? If so, with how many sample points? In some cases, problems (such as negative but very small values for the first-order Sobol’ indices) might happen due to convergence issues.

Moreover, it might be indeed the case that your inputs would be realistically dependent, but if you don’t model any dependence in your inputs (that is, you just assume that they are independent), then there should be no problem in computing the Sobol’ indices (at least numerically). It’s just that the conclusion of the sensitivity analysis might be wrong.

I might be wrong, but I am not sure if your problem here is because you don’t model a dependence structure in your inputs.

Vishashalu · September 4, 2019, 11:48am

Hi Damar,
I was running a simple isothermal column model with only 5 uncertain inputs. Now my model is no longer isothermal: it has an energy balance, and the solution is obtained by solving the mass and energy balances simultaneously. For this model I currently have 8 uncertain inputs. As in my previous case, I did not use any dependence structure.
For the computation of the Sobol’ indices, I used Monte Carlo simulation. I usually do a test run with 100 points, but I understand that I may need more sample points to get accurate results, and that might be the issue here. I am not sure.

I have assumed that the inputs are independent of each other. Each uncertain input propagates through the submodels that are used to compute the mass and energy balances, so each submodel could involve two or even three uncertain inputs, for example. I do not have a feeling for whether these interactions also play a role.

I am going to run with more sample points, as I did previously with 10000 samples, but since the model is complex it is going to take time. However, I know that there is the PCE method, which should be much quicker. I will try that as well.

damarginal · September 4, 2019, 12:41pm

I think 100 points are a bit too few to estimate the indices via MC. I would run with much more than that and see if it gives more reasonable results. Maybe also check the convergence of the estimates (whether the estimates are stable with increasing number of sample points).
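The convergence check suggested above can be illustrated with a minimal sketch in Python with NumPy (not UQLab code). It assumes the Ishigami function as a stand-in model and uses the pick-freeze estimators of Saltelli et al. (2010) for the first-order indices and Jansen (1999) for the total indices; the function names `ishigami` and `sobol_mc` are made up for this illustration, and UQLab's own estimators may differ.

```python
import numpy as np

def ishigami(x, a=7.0, b=0.1):
    """Ishigami test function (stand-in model); inputs are independent U(-pi, pi)."""
    return (np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2
            + b * x[:, 2] ** 4 * np.sin(x[:, 0]))

def sobol_mc(model, n, dim, rng):
    """Pick-freeze Monte Carlo estimates of first-order and total Sobol' indices."""
    a = rng.uniform(-np.pi, np.pi, size=(n, dim))
    b = rng.uniform(-np.pi, np.pi, size=(n, dim))
    y_a, y_b = model(a), model(b)
    var = np.var(np.concatenate([y_a, y_b]), ddof=1)
    first = np.empty(dim)
    total = np.empty(dim)
    for i in range(dim):
        ab = a.copy()
        ab[:, i] = b[:, i]  # matrix A with column i taken from B
        y_ab = model(ab)
        first[i] = np.mean(y_b * (y_ab - y_a)) / var        # Saltelli et al. (2010)
        total[i] = 0.5 * np.mean((y_a - y_ab) ** 2) / var   # Jansen (1999)
    return first, total

# Repeat the analysis with growing sample size and watch the estimates settle.
rng = np.random.default_rng(0)
for n in (100, 1000, 10000, 100000):
    first, total = sobol_mc(ishigami, n, 3, rng)
    print(n, np.round(first, 3), np.round(total, 3))
```

With only 100 points the estimates can scatter widely around the true values, and an index whose true value is close to zero can easily come out negative. If the estimates keep changing noticeably between the largest sample sizes, the analysis has not converged yet.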

If computational load is a concern then for sure try using metamodels (say PCE or LRA that can give you Sobol’ indices as by-products).
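To show why a PCE gives Sobol’ indices as a by-product, here is a minimal sketch in Python with NumPy (again not UQLab code). It assumes independent U(-1, 1) inputs, an orthonormal Legendre basis, an ordinary least-squares fit, and a hypothetical polynomial toy model; all function names are made up for this illustration. Once the coefficients are known, the indices follow from sums of squared coefficients, with no further model runs.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def toy_model(x):
    """Hypothetical polynomial model with independent U(-1, 1) inputs."""
    return x[:, 0] + 2.0 * x[:, 1] ** 2 + x[:, 0] * x[:, 2]

def multi_indices(dim, degree):
    """All multi-indices with total degree <= degree."""
    return [alpha for alpha in itertools.product(range(degree + 1), repeat=dim)
            if sum(alpha) <= degree]

def design_matrix(x, alphas):
    """Evaluate the orthonormal tensor-product Legendre basis at the sample points."""
    max_deg = max(max(alpha) for alpha in alphas)
    # uni[k][:, j] = sqrt(2k + 1) * P_k(x_j), orthonormal w.r.t. U(-1, 1)
    uni = [np.sqrt(2 * k + 1) * legendre.legval(x, [0] * k + [1])
           for k in range(max_deg + 1)]
    psi = np.ones((x.shape[0], len(alphas)))
    for col, alpha in enumerate(alphas):
        for j, k in enumerate(alpha):
            psi[:, col] *= uni[k][:, j]
    return psi

def pce_sobol(x, y, degree):
    """Fit a PCE by least squares and read the Sobol' indices off the coefficients."""
    dim = x.shape[1]
    alphas = multi_indices(dim, degree)
    coef, *_ = np.linalg.lstsq(design_matrix(x, alphas), y, rcond=None)
    alphas = np.array(alphas)
    var = np.sum(coef[alphas.sum(axis=1) > 0] ** 2)  # variance = non-constant terms
    first = np.empty(dim)
    total = np.empty(dim)
    for i in range(dim):
        has_i = alphas[:, i] > 0
        only_i = has_i & (np.delete(alphas, i, axis=1).sum(axis=1) == 0)
        first[i] = np.sum(coef[only_i] ** 2) / var  # terms in x_i alone
        total[i] = np.sum(coef[has_i] ** 2) / var   # every term involving x_i
    return first, total

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 3))  # only 200 model runs
first, total = pce_sobol(x, toy_model(x), degree=3)
print(np.round(first, 4), np.round(total, 4))
```

For this toy model the degree-3 expansion is exact, so the indices match the analytical values (first-order 15/36, 16/36, 0; total 20/36, 16/36, 5/36). For a real model the quality of the indices depends on how well the metamodel fits, which should be checked separately.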

Regarding whether these interactions play a role: hopefully, that’s what a (global) sensitivity analysis of a complex model should help you find out. Now, whether the inputs are dependent or independent is one of the first assumptions we make before doing a sensitivity analysis, so the conclusion of the analysis depends on that assumption.

damarginal · September 16, 2019, 3:30pm

Hi @Vishashalu, were you able to fix your problem? Does a larger number of sample points still give strange results? Were you able to use metamodels to compute the Sobol’ indices?

Vishashalu · October 1, 2019, 8:22am

Hi!
No, I was not successful in solving my problem. After running 10000 samples, the total Sobol’ indices were above 1 for some parameters and in the range of 0.8 to 0.9 for the others, and some of the first-order indices were negative.

Also, I was not successful in implementing the PCE-based method for my column model. There is some error, perhaps related to the numerical method for the ODEs and an array issue, which I am not able to solve. The model is quite complex and I feel a surrogate-model-based method is definitely the best option, but yes, I am a little stuck.

Any advice?

best,
Visha

damarginal · October 1, 2019, 6:49pm

Would it be possible for you to create a reproducible example here? Say by posting or attaching your model and analysis scripts? Thanks!

damarginal · October 2, 2019, 8:14am

Hi @Vishashalu, by the way, have you checked the distribution of the outputs of interest from these 10000 runs? I think it would be a nice diagnostic step.
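A minimal sketch of such a diagnostic in Python with NumPy, assuming the outputs of the runs are available as a plain array. The lognormal sample and the five "failed" runs below are made-up stand-ins, and `summarize_output` is a hypothetical helper name. Non-finite values from failed solver runs or a few extreme outliers are worth looking for, because they can dominate a variance-based estimate.

```python
import numpy as np

def summarize_output(y):
    """Basic diagnostics for a sample of model outputs."""
    y = np.asarray(y, dtype=float)
    finite = np.isfinite(y)
    y_ok = y[finite]
    centered = y_ok - y_ok.mean()
    return {
        "n": y.size,
        "n_nonfinite": int((~finite).sum()),  # NaN/Inf, e.g. failed solver runs
        "mean": y_ok.mean(),
        "std": y_ok.std(ddof=1),
        "quantiles_0_1_50_99_100": np.quantile(y_ok, [0.0, 0.01, 0.5, 0.99, 1.0]),
        "skewness": np.mean(centered ** 3) / np.std(y_ok) ** 3,
    }

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=10000)  # stand-in for 10000 model outputs
y[:5] = np.nan                                      # pretend five runs failed
report = summarize_output(y)
for key, value in report.items():
    print(key, value)
```

A histogram of the finite outputs is a useful complement to these numbers.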

Vishashalu · October 8, 2019, 8:50am

Hi Damar,

I will do that. Just give me some time. Instead of this complex model that I have, I would like to implement in a simple model to see if it works. Otherwise, I might be doing something wrong and I will get back to you on this. Thanks for your help!

Vishashalu · January 13, 2020, 10:41am

Hi Damar,
To follow up on the previous posts in this topic: unfortunately, even with a large number of sample runs, I got the same results. However, I have not yet tried the simple model; I will do that this week and keep you posted.

I have some fundamental questions about global SA methods in general. I wrote you an email, but then I remembered that I should use this forum.

Based on the UQLab sensitivity analysis manual, the sample-based methods, i.e., input/output correlation and standard regression coefficients (which I assume is the simple R2 coefficient-of-determination method), are global SA methods as well, since they are based on post-processing a Monte Carlo simulation that covers the entire parameter space of the input variables. Isn’t that correct? However, these methods are not appropriate for highly nonlinear models and do not take the dependency between the variables into account. I ask because these methods are categorized separately from the global methods in the user manual.

The manual has a separate section specifically on global methods, which lists methods such as Morris, Borgonovo, etc. I would like to know whether my understanding of global SA is right or wrong. It is more about the fundamentals.

Thank you for clarifying this point for me.

Visha

damarginal · January 13, 2020, 5:24pm

Hi @Vishashalu,

There are many ways to classify different sensitivity analysis methods; the one adopted by the UQLab user manual is just one way to do it.

As you said, sample-based methods listed in the user manual are also global methods in the sense of the considered input parameter space. The manual also emphasizes the fact that these sample-based methods only require a Monte Carlo (MC) sample of inputs/outputs to compute the corresponding sensitivity measure. If you have an existing sample, you can always compute the sample-based sensitivity measures to get an idea of the sensitivity regardless of how such a sample is generated^[1].

Now, I think their distinction from the global methods in the manual is less about whether sample-based methods are suitable for highly non-linear models or models with dependency between input variables. For instance, the model might be non-linear but as long as it is monotonic, the rank-transformed version of the sample-based methods (e.g., Spearman’s rank correlation instead of Pearson correlation coefficient) still gives meaningful results about the model’s sensitivity. Furthermore, input/output correlation methods (but not SRC) also work for dependent input variables; on the other hand, if the input variables are statistically dependent then Sobol’ indices (a global method per user manual) won’t give meaningful results.
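The point about rank-transformed measures can be illustrated with a minimal sketch in Python with NumPy. The model below is a hypothetical one, monotonic but strongly nonlinear in its first input, and the helper names are made up for this illustration.

```python
import numpy as np

def pearson(x, y):
    """Pearson (linear) correlation coefficient."""
    return np.corrcoef(x, y)[0, 1]

def ranks(v):
    """Ranks 0..n-1 (ties are not handled; fine for continuous samples)."""
    r = np.empty(v.size)
    r[np.argsort(v)] = np.arange(v.size)
    return r

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(5000, 2))
y = np.exp(8.0 * x[:, 0]) + x[:, 1]  # hypothetical monotonic, nonlinear model

pear = [pearson(x[:, i], y) for i in range(2)]
spear = [spearman(x[:, i], y) for i in range(2)]
print("Pearson :", np.round(pear, 3))
print("Spearman:", np.round(spear, 3))
```

In this example the Pearson coefficient understates the influence of the first input because the relationship is far from linear, whereas the rank-based coefficient is close to 1.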

So, maybe it’s easier to think of the sample-based methods as a subclass of the global methods^[2]. A method that belongs to this subclass only requires a set of MC sample of inputs/outputs to compute the corresponding sensitivity measure^[3].

You might want to take a closer look at the underlying assumptions about the inputs or the model that underlie each method (inputs dependency, model linearity, monotonicity, with or without interaction, etc.) so you can have a better overall picture of the available methods. Whether a method is global or not is just one way to coarsely distinguish different sensitivity methods.

I hope this answer makes things a bit clearer for you.

Finally, have you checked the output distribution based on your previous runs? I’m also curious about what kind of error you encountered when you attempted to construct a PCE metamodel for your model.

[1] Of course, one should not derive overarching conclusions from this kind of analysis before checking whether the underlying assumptions of the methods stand.
[2] Indeed it is according to Iooss and Lemaître (2015).
[3] Moreover, there is a new sample-based estimator for Kucherenko indices (classified as a global method) in UQLab that is computed based on an existing MC sample.

Vishashalu · January 13, 2020, 9:19pm

Hi Damar

Thank you very much for your detailed explanation. I now have a better understanding of these methods, not of every method in detail, but at least of the general picture.

I will keep you posted on the application of a surrogate model to my model. I am going to try again, look at the errors, and perhaps ask you again. I will keep in touch about my problems. Thanks, Damar.

Best regards

Visha

Vishashalu · January 14, 2020, 9:14am

Hi Damar,
A follow-up question on the SA methods, regarding the dependency of input random variables. Although some of the global SA methods do not account for the interdependency of the variables, specifying the dependence structure through copulas, as described in the UQLab Input user manual, means that the dependency of the variables is taken into account, am I right? Even in the simple case of sampling-based Monte Carlo simulation? Do I understand this correctly?

Also, the Kucherenko and ANCOVA methods are said to be for dependent input variables. Can we simply use these methods without a dependence structure? And regarding the dependency between input variables: say I have uncertain input variables such as density and viscosity, which are calculated using two different correlations whose computations do not depend on each other, but these properties are then used together to compute other properties. Does that also make them dependent variables?

I have been a little confused about these points as well. Thank you for clarifying them for me.

Best regards,
Visha

damarginal · January 14, 2020, 5:49pm

Okay, it seems there are three questions here. Let’s go through each of them:

Perhaps you should be more specific about what “are taken into account” means here. In the present context, specifying a dependence structure through a copula for the input variables does take the dependency between those variables into account. Generating a sample from the resulting joint distribution results in a dependent sample.

But you should be careful about applying any global SA method with dependent inputs. Some global SA methods are derived by assuming that the inputs are independent, and the resulting sensitivity measures are not interpretable otherwise. Nothing stops you from computing the sensitivity measure of any SA method with dependent inputs; it’s just that the results will be neither valid nor meaningful if one of the underlying assumptions of the method is input independence. And indeed, UQLab won’t stop you from computing such a case (e.g., Sobol’ indices with dependent inputs); it will just give you a warning.
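What "a dependent sample" from a copula means can be sketched in Python with NumPy (not UQLab code). The sketch assumes a bivariate Gaussian copula with correlation parameter 0.8 and two made-up marginals, U(1, 3) and an exponential with mean 2; `gaussian_copula_sample` is a hypothetical helper name.

```python
import math
import numpy as np

def gaussian_copula_sample(n, rho, rng):
    """Draw n points (u1, u2) in [0, 1]^2 from a bivariate Gaussian copula."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # Standard normal CDF, applied element-wise
    phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
    return phi(z)

rng = np.random.default_rng(0)
u = gaussian_copula_sample(5000, rho=0.8, rng=rng)

# Hypothetical marginals, applied through their inverse CDFs
x1 = 1.0 + 2.0 * u[:, 0]         # X1 ~ U(1, 3)
x2 = -2.0 * np.log1p(-u[:, 1])   # X2 ~ exponential with mean 2

# Independent sample with the same marginals, for comparison
v = rng.uniform(size=(5000, 2))
x1_ind = 1.0 + 2.0 * v[:, 0]
x2_ind = -2.0 * np.log1p(-v[:, 1])

corr_dep = np.corrcoef(x1, x2)[0, 1]
corr_ind = np.corrcoef(x1_ind, x2_ind)[0, 1]
print("with copula   :", round(corr_dep, 3))
print("without copula:", round(corr_ind, 3))
```

Both samples have the same marginals; only the joint behaviour differs. The dependence is a property of the input sample itself, which is why a method that assumes independent inputs cannot be interpreted in the usual way when it is fed the first sample.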

Yes. If I understand these methods correctly, they are generalizations of the Sobol’ indices. Without dependency in the inputs, the indices given by either of these methods should be the same as the Sobol’ indices.

You can check this by modifying the dependence structure on the sensitivity example with dependent inputs; changing the copula type to 'Independent' will remove the dependence structure. You can then compare the resulting Kucherenko and ANCOVA indices with the Sobol’ indices. I think they should be very close. Note that there is only one type of ANCOVA indices (i.e., the first-order one) produced due to the reason explained in the user manual.

I’m not sure I understand your third question; can you clarify what you mean by “two different correlations” for the viscosity and density? Is it some kind of experimental correlation (with some parameters) that defines them? Or do you mean something else?

Vishashalu · January 15, 2020, 2:22pm

damarginal:

Perhaps you should be more specific about what “are taken into account” means here. In the present context, specifying a dependence structure through a copula for the input variables does take the dependency between those variables into account. Generating a sample from the resulting joint distribution results in a dependent sample.

Continuing from your previous explanations and the one quoted here: you mentioned that if the input variables are statistically dependent, Sobol’ indices won’t give meaningful results. My understanding is this: suppose you have your model equations and input variables that you know depend on each other, e.g., to calculate X1 you need X2 (if I got that right). Do you then need to build this dependence structure via a copula or not? Or is it still fine to perform the MC simulation without specifying the dependence structure, because UQLab will consider the dependency even without a copula representation?
So if my understanding is right: if I know the variables are dependent, I represent the dependencies through copulas in UQLab, and then the Sobol’ method gives meaningful sensitivity results?

damarginal:

But you should be careful about applying any global SA method with dependent inputs. Some global SA methods are derived by assuming that the inputs are independent, and the resulting sensitivity measures are not interpretable otherwise. Nothing stops you from computing the sensitivity measure of any SA method with dependent inputs; it’s just that the results will be neither valid nor meaningful if one of the underlying assumptions of the method is input independence. And indeed, UQLab won’t stop you from computing such a case (e.g., Sobol’ indices with dependent inputs); it will just give you a warning.

Perhaps my first question above ties in with your paragraph quoted here. I hope you understand my question and see my point. Basically, I am saying that if you include the dependence structure for the dependent variables, while some other input variables are independent, you can still use Sobol’ indices to get the sensitivity result.
So when you say that one of the underlying assumptions of a method is input independence, how do I check or know this? I am trying to understand the fundamentals here before trying these methods on my cases. I have used Sobol’ indices, but if you remember my post from last year, I got negative values and values above 1 (like -1.25 and above) as Sobol’ indices for some variables, and I do not understand why. I will check your comment about this particular problem again and reply to it, but could you answer the fundamental question first? Thank you for your time!

damarginal:

Yes. If I understand these methods correctly, they are generalizations of the Sobol’ indices. Without dependency in the inputs, the indices given by either of these methods should be the same as the Sobol’ indices. You can check this by modifying the dependence structure on the sensitivity example with dependent inputs; changing the copula type to 'Independent' will remove the dependence structure. You can then compare the resulting Kucherenko and ANCOVA indices with the Sobol’ indices. I think they should be very close. Note that there is only one type of ANCOVA indices (i.e., the first-order one) produced due to the reason explained in the user manual.

I think my response is similar to the above. Yes, I will try these examples; I might understand better then.

damarginal:

I’m not sure I understand your third question; can you clarify what you mean by “two different correlations” for the viscosity and density? Is it some kind of experimental correlation (with some parameters) that defines them? Or do you mean something else?

So I have two different experimental correlations, with parameters, to compute density and viscosity. E.g., let’s say that in my case the density model correlation is 10 uncertain and the viscosity is 20 uncertain. But this is a clear case of independent variables, right, since the density and viscosity computations do not depend on each other? But if they are used in the computation of other physical property correlations, how is this taken into account? This leads to my confusion when you said I need to understand the underlying assumptions with regard to dependent variables and with or without interaction; also, many articles have recently been published on correlated parameters. Is this all the same? I also have some results where the ranking of the sensitivity indices differs when I use different property correlations, but maybe I can open a new topic for that?

Vishashalu · January 15, 2020, 2:24pm

I see very funny wordings there in my reply. I meant density having 10 percent uncertainty in the model correlation and viscosity having 20 percent uncertainty.

Vishashalu · January 15, 2020, 4:47pm

Continuing from your previous explanations (previous post) and the explanation right above: you mentioned that if the input variables are statistically dependent, Sobol’ indices won’t give meaningful results. My understanding is this: suppose you have your model equations and input variables that you know depend on each other, e.g., to calculate X1 you need X2 (if I got that right). Do you then need to build this dependence structure via a copula or not? Or is it still fine to perform the MC simulation without specifying the dependence structure, because UQLab will consider the dependency even without a copula representation?
So my understanding is that if I know the variables are dependent, I represent the dependencies through copulas in UQLab, and then the Sobol’ method should give meaningful sensitivity results. Or not? And when you say statistically dependent, does that mean that a dependence structure is specified for the variables? I hope you understand what I am asking here.

Continuing my question from above: basically, I am saying that if you include the dependence structure for the dependent variables, while some other input variables are independent, you can still use Sobol’ indices to get the sensitivity result.
So when you say that one of the underlying assumptions of a method is input independence, how do I check or know this? I am trying to understand the fundamentals here before trying these methods on my cases.
I have used Sobol’ indices, but if you remember my post from last year (maybe not), I got negative values and values above 1 (like -1.25 and above) for the indices, and I do not understand why. I ran with even 1000 samples and yet the results were the same. I will check your comment about this particular problem again and reply to it. Thank you so much for your time!

I think these are similar questions and responses as above. Yes, I will try these examples; I might understand better then.

So I have two different experimental correlations, with parameters, to compute density and viscosity. E.g., let’s say that in my case the density model correlation is 10 percent uncertain and the viscosity is 20 percent uncertain. But this is a clear case of independent variables, right, since the density and viscosity computations do not depend on each other? But if they are used together in the computation of other physical property correlations, does that indicate any correlation there? This leads to my confusion when you said I need to understand the underlying assumptions with regard to dependent variables and with or without interaction. Many articles have recently been published on correlated parameters. Is it all the same when you say dependent, correlated, and interacting? I would also like to discuss some results where the ranking of the sensitivity indices differs when I use different property correlations, but maybe I can open a new topic for that? I hope I am not confusing you.

Best regards,
Visha

damarginal · January 15, 2020, 5:50pm

Hi @Vishashalu,

In your illustration, are you saying that the model is a function of X1 and X2 and that there is a deterministic formula relating X1 to X2? If so, why would you consider X1 an input to your model? And if you did, what kind of marginal did you assign to that variable in UQLab?

Coming back to your example: if you have one uncertain input that is used to compute the viscosity and another to compute the density, then yes, both are independent. And no, how they are used inside your model (say, to compute a third property, which is not one of your inputs in any case) is not what we refer to as being correlated. These two parameters might, based on the sensitivity analysis, turn out to be (either strongly or weakly) interacting.

While dependent and correlated are often used interchangeably to describe the probabilistic inputs in the context of sensitivity analysis, the notion of interaction is something else. You can think of it like this: being correlated (or dependent) is solely a property of your inputs, while interaction is also a property of your model. If two inputs are interacting, then the effect of one input on the model output depends on the other input. Even if the inputs are independent, they might be interacting depending on the computational model.
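The distinction can be made concrete with a minimal sketch in Python with NumPy. It assumes a hypothetical model Y = X1 · X2 with independent U(-1, 1) inputs and uses pick-freeze Monte Carlo estimators (Saltelli et al. 2010 for the first-order index, Jansen 1999 for the total index); nothing here is UQLab code.

```python
import numpy as np

def model(x1, x2):
    """Hypothetical model consisting of a pure interaction term."""
    return x1 * x2

rng = np.random.default_rng(0)
n = 200_000
x1, x2 = rng.uniform(-1.0, 1.0, size=(2, n))      # sample A
x1_b, x2_b = rng.uniform(-1.0, 1.0, size=(2, n))  # independent sample B

# Property of the inputs alone: X1 and X2 are uncorrelated.
input_corr = np.corrcoef(x1, x2)[0, 1]

# Property of the model: pick-freeze estimates for X1.
y_a = model(x1, x2)
y_b = model(x1_b, x2_b)
y_ab1 = model(x1_b, x2)  # sample A with X1 replaced by its value from B
var = np.var(np.concatenate([y_a, y_b]), ddof=1)
s1 = np.mean(y_b * (y_ab1 - y_a)) / var        # first-order index of X1
st1 = 0.5 * np.mean((y_a - y_ab1) ** 2) / var  # total index of X1

print("input correlation:", round(input_corr, 3))
print("first-order S1   :", round(s1, 3))
print("total ST1        :", round(st1, 3))
```

For this model the first-order index of X1 is analytically 0 and its total index is 1: X1 matters only through its interaction with X2, even though the two inputs are independent. The gap between total and first-order indices is what signals interaction.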

Indeed it can be daunting with all these terms, but I would like to encourage you to read some introductory materials on the topic; I find the book by Saltelli et al. is a good starting point (at least Chapter 1). Also perhaps before delving into your actual computational model, it might be a good idea to play around with the sensitivity examples shipped with UQLab (especially the ones related to Sobol’).

Coming back to your original problem, how did you model your inputs? Maybe you can attach that part of your code here just for me to check?

And regarding the last point, yes, please open a new topic as I think it’s already a different topic.

Hope this makes things clearer for you.

PS:
Thanks a lot for reformatting your post!

Vishashalu · January 16, 2020, 8:41am

Hi Damar,
Thanks a lot for the quick response. This is what I will do: I am going to look at the examples again and try to understand them. So far I have only used the Sobol’ indices method, but I will work through the examples of the other methods and compare them. Thanks for the book suggestion. It is the famous global SA book that many of the articles I know refer to.
In the meantime, I will soon open a new topic on the last point that I wanted to discuss with you.
Visha

FANFAN · June 9, 2020, 10:37am

Hello, can I get a convergence curve when using MC, PCE, or LRA in a Sobol’ sensitivity analysis? I read the manual but didn’t find it.