# Normalizing PCE-based Sobol’ indices to input parameter scales

Hello UQlabers,

I am new to this library and really enjoying using it. I have a question regarding Sobol’ indices derived from PCE coefficients. I’m doing a sensitivity analysis using parameters of different scales (some 0-1, some 0-1000). Should the Sobol’ indices not be normalised in some way to accommodate this difference in parameter scales?

I have seen some examples in the library where this difference in scale is present but not addressed (uq_Example_Sensitivity_02_SobolIndices.m), so I am wondering if I am missing something. Is there some kind of scaling going on in the PCE to accommodate this?

The indices described in the documentation use the original variance-fraction formula, but Sobol (2008) proposes a normalized index formula, which I think would address this.

Many thanks,
Henry

Saltelli A. Global Sensitivity Analysis: The Primer. Chichester, England: John Wiley; 2008. doi:10.1002/9780470725184

Dear Henry

Thanks for your question. First of all, it is independent of the fact that you use polynomial chaos expansions (PCE) to estimate your Sobol’ indices: your question would apply equally if you used any Monte Carlo estimator of the latter!

Now the answer: no, we should not normalize anything. Why?

In a nutshell, in global sensitivity analysis a parameter X_i is “important” if it contributes a lot to the variance of the model output. There can be two (non-exclusive) reasons for this:

• the gradient of the model w.r.t. this parameter is large,
• the variability of this input parameter (i.e., \textrm{Var} [X_i]) is large.

You can check this easily by decomposing the variance of a linear model Y = \sum_{i=1}^d a_i \, X_i with \{X_i, \; i=1 \dots d\} independent:

\textrm{Var} [Y] = \sum_{i=1}^d a_i^2 \, \textrm{Var} [X_i]

and the first-order Sobol’ indices:

S_i = \frac{a_i^2\, \textrm{Var} [X_i] }{\sum_{j=1}^d a_j^2 \, \textrm{Var} [X_j] }
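
For completeness, here is a short sketch of where this expression comes from. It assumes the usual definition of the first-order index, S_i = \textrm{Var} [\mathbb{E}[Y \mid X_i]] / \textrm{Var} [Y], which is not written out above. Since the inputs are independent, conditioning on X_i leaves only a constant from the other terms:

\mathbb{E}[Y \mid X_i] = a_i \, X_i + \sum_{j \neq i} a_j \, \mathbb{E}[X_j]

\textrm{Var} [\mathbb{E}[Y \mid X_i]] = a_i^2 \, \textrm{Var} [X_i]

Dividing by \textrm{Var} [Y] gives the ratio above, in which the gradient a_i and the input variance \textrm{Var} [X_i] enter only through their product.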

Back to your problem: if you have parameters in the range [0, 1] vs. [0, 1000], there is a huge difference in their variances. So if they played “structurally” the same role in your model (similar magnitude of gradient, i.e. comparable a_i's in the linear model), the first one would have almost no importance compared to the second one.

As a by-product, it is extremely important to properly quantify the sources of uncertainty in any UQ/sensitivity problem: if you choose a huge input range just because you didn’t really think about it, your results could easily be misinterpreted.
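
To put numbers on this, here is a minimal sketch in plain Python (not UQLab code) that evaluates the linear-model formula for two inputs. The uniform distributions on [0, 1] and [0, 1000], the coefficient values, and the helper names `uniform_variance` and `linear_sobol_indices` are assumptions chosen only for illustration.

```python
def uniform_variance(low, high):
    """Variance of a uniform distribution on [low, high]."""
    return (high - low) ** 2 / 12.0


def linear_sobol_indices(coeffs, variances):
    """First-order Sobol' indices of Y = sum_i a_i * X_i, independent X_i."""
    contributions = [a * a * v for a, v in zip(coeffs, variances)]
    total = sum(contributions)  # this is Var[Y] for the linear model
    return [c / total for c in contributions]


# Assumed input ranges: X_1 ~ U(0, 1), X_2 ~ U(0, 1000)
bounds = [(0.0, 1.0), (0.0, 1000.0)]
variances = [uniform_variance(lo, hi) for lo, hi in bounds]

# Case 1: same gradient for both inputs (a_1 = a_2 = 1)
indices = linear_sobol_indices([1.0, 1.0], variances)
print("equal gradients:   ", indices)

# Case 2: the gradient of X_1 is 1000 times larger, which offsets its
# smaller range, so both inputs contribute equally to Var[Y]
balanced = linear_sobol_indices([1000.0, 1.0], variances)
print("a_1 = 1000, a_2 = 1:", balanced)
```

With equal gradients, S_1 is about 10^{-6} and S_2 is essentially 1. When a_1 is 1000 times larger, both indices are 0.5. The indices already combine gradient and input variability, so rescaling them afterwards would discard exactly the information they are meant to carry.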