UQWorld

Connection of information gain and sensitivity indices

Dear colleagues,

I am a UQ guy. I use information gain a lot in UQ, e.g. to find the locations of the most informative measurements, or to quantify the information gained when a prior is replaced by a posterior.
I also learned from Oliver Le maitre and from Bruno how to compute variance-based sensitivity indices.

Today I found something curious. I quote:
"Definition: Information gain (IG) measures how much “information” a feature gives us about the class." See more here


This sentence reminds me of “The sensitivity of uncertain_parameter_1 is its contribution to the total variance (total uncertainty)”.

I think that information gain and sensitivity indices are connected. What do you think? Do you agree?
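
One way to write the analogy down, as a sketch rather than a proof: the symbols below (QoI Y, input X_i, entropy H, mutual information I) are my own notation, and I assume IG is taken as the usual entropy difference. The first-order sensitivity index is the expected reduction in the variance of Y once X_i is known, and IG is the expected reduction in the entropy of Y once X_i is known.

```latex
% First-order Sobol index, rewritten with the law of total variance
S_i = \frac{\operatorname{Var}\left(\mathbb{E}[Y \mid X_i]\right)}{\operatorname{Var}(Y)}
    = \frac{\operatorname{Var}(Y) - \mathbb{E}\left[\operatorname{Var}(Y \mid X_i)\right]}{\operatorname{Var}(Y)}

% Information gain of X_i about Y (equal to the mutual information)
\mathrm{IG}(Y, X_i) = H(Y) - H(Y \mid X_i) = I(Y; X_i)
```

Both quantities vanish when Y is independent of X_i. The converse holds only for IG: S_i = 0 merely says that the conditional mean of Y does not depend on X_i.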

More references
https://www3.nd.edu/~rjohns15/cse40647.sp14/www/content/lectures/23%20-%20Decision%20Trees%202.pdf (IG in decision trees)

https://medium.com/coinmonks/what-is-entropy-and-why-information-gain-is-matter-4e85d46d2f01 (What is Entropy and why Information gain matter in Decision Trees?)

Hi Alexander,
I am not an expert on practices based on Information Gain (IG) myself. However, I think that your intuition regarding the connection between IG and variance-based sensitivity (Sobol) indices is further supported by the first reference, where it is stated that “Unrelated features should give no information”. This is similar to parameters with zero or very small Sobol indices, which typically do not affect the QoI.

Since this reply comes quite a long time after your initial post, have you investigated this connection further? For example, if you rank a model’s input parameters/RVs by their IG scores and by their Sobol indices, do both metrics lead to the same ordering?
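
A minimal sketch of such a comparison, under several assumptions of mine: the toy model (three independent uniform inputs and an additive QoI) is hypothetical, IG is approximated by a histogram estimate of the mutual information between each input and the QoI, and the first-order Sobol indices are estimated by simple binning rather than by a dedicated estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_bins = 20_000, 20

# Hypothetical toy model: three independent U(0, 1) inputs, additive QoI.
X = rng.uniform(size=(n_samples, 3))
coeffs = np.array([4.0, 2.0, 0.5])
Y = X @ coeffs

# For this additive model with equal input variances, the exact
# first-order indices are proportional to the squared coefficients.
sobol_exact = coeffs**2 / np.sum(coeffs**2)


def first_order_sobol(x, y, n_bins):
    """Binning estimate of Var(E[Y | X]) / Var(Y)."""
    # Equal-probability bins in x, so each bin holds about the same number of samples.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    filled = counts > 0
    cond_means = sums[filled] / counts[filled]
    # Sample-weighted variance of the per-bin conditional means.
    var_cond_mean = np.sum(counts[filled] * (cond_means - y.mean()) ** 2) / y.size
    return var_cond_mean / y.var()


def mutual_information(x, y, n_bins):
    """Histogram estimate of I(X; Y) = H(Y) - H(Y | X), in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=n_bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal of x, shape (n_bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, n_bins)
    mask = p_xy > 0                        # skip empty cells (0 * log 0 = 0)
    return np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask]))


sobol = np.array([first_order_sobol(X[:, i], Y, n_bins) for i in range(3)])
info_gain = np.array([mutual_information(X[:, i], Y, n_bins) for i in range(3)])

# Input indices ordered from most to least important under each metric.
sobol_rank = np.argsort(-sobol)
ig_rank = np.argsort(-info_gain)

print("Sobol (estimated):", np.round(sobol, 3))
print("Sobol (exact):    ", np.round(sobol_exact, 3))
print("IG / MI in nats:  ", np.round(info_gain, 3))
print("Same ordering:    ", np.array_equal(sobol_rank, ig_rank))
```

The histogram estimate of the mutual information is biased upward and depends on the number of bins, so its values for weak inputs should be read with care. Agreement of the two orderings on a simple additive model like this one says nothing about models with strong interactions, where first-order indices and IG can rank inputs differently.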