Confusion about equation (1.37) in the UQLab User Manual Structural Reliability (Rare Event Estimation)

Jingyu_Liu · December 8, 2022, 4:18am

Hello all,

I am really confused about the expression of the MCS estimator (page 11, Eq. (1.37)) in the Structural Reliability manual (Version: UQLAB-V2.0-107).

Personally, I think it is inaccurate or incorrect.

As an estimator of the true P_f, \hat{P_f} should be a random variable. According to the central limit theorem, when the number of samples is large enough, the distribution of \hat{P_f} is approximately normal. However, I think the mean and variance given in Eq. (1.37) are inaccurate or inappropriate:

For the mathematical expression of a normal distribution, might it be better to use the variance instead of the standard deviation?
For the mean value, I think it should theoretically be P_f, the true failure probability, instead of \hat{P_f}.
For the variance, I think it should be the true variance \frac{P_f(1-P_f)}{N}. I do not understand why the standard deviation is \sqrt{\hat{P_f}(1-\hat{P_f})} (i.e., it uses the estimated value and is not divided by N). Intuitively, the variance should at least decrease as N increases.
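
To make the last point concrete, here is a minimal numerical sketch, not taken from the manual. It repeats a Monte Carlo estimation many times and compares the empirical mean and variance of \hat{P_f} with P_f and \frac{P_f(1-P_f)}{N}. The value P_f = 0.05, the sample sizes, and the function name `mcs_estimates` are assumptions chosen only for illustration.

```python
import numpy as np


def mcs_estimates(p_f, n_samples, n_repeats, seed=0):
    """Return n_repeats independent MCS estimates of p_f, each from n_samples trials."""
    rng = np.random.default_rng(seed)
    # The number of failures among n_samples i.i.d. Bernoulli(p_f) trials is
    # Binomial(n_samples, p_f); dividing by n_samples gives the estimator.
    failures = rng.binomial(n_samples, p_f, size=n_repeats)
    return failures / n_samples


if __name__ == "__main__":
    p_f = 0.05  # assumed "true" failure probability (illustrative only)
    for n in (100, 1_000, 10_000):
        est = mcs_estimates(p_f, n, n_repeats=20_000)
        theoretical_var = p_f * (1.0 - p_f) / n
        print(f"N={n:6d}  mean={est.mean():.5f}  "
              f"var={est.var():.3e}  P_f(1-P_f)/N={theoretical_var:.3e}")
```

If the reasoning above is right, the printed mean should stay close to P_f while the empirical variance shrinks roughly in proportion to 1/N.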

Jingyu

xujia · January 3, 2023, 11:09am

Hi @Jingyu_Liu,

Thank you for creating the post here and for the remarks.

Both notations, \mathcal{N}(\mu,\sigma^2) and \mathcal{N}(\mu,\sigma), to represent normal distributions are commonly used.
Yes, you are right: \widehat{P}_f is an unbiased estimator of P_f, i.e., \mathbb{E}\left[\widehat{P}_f\right]=P_f.
Yes, you are right: the variance \sigma^2_{P_f} of \widehat{P}_f is \frac{P_f(1-P_f)}{N}. The reasoning between Eq. (1.27) and Eq. (1.28) should be reorganized as follows for a more rigorous mathematical presentation. The variance of a single trial (which is a Bernoulli random variable) is \sigma^2_{1_{\mathcal{D}_f}} = P_f(1-P_f). Because the trials are independent and identically distributed, the Monte Carlo estimator \widehat{P}_f is asymptotically Gaussian, i.e., \sqrt{N}(\widehat{P}_f - P_f)/\sigma_{1_{\mathcal{D}_f}} \stackrel{d}{\rightarrow}\mathcal{N}(0,1), according to the central limit theorem. Because P_f is unknown, the standard deviation \sigma_{1_{\mathcal{D}_f}} is also unknown. Since \widehat{P}_f is a consistent estimator of P_f, it follows from the continuous mapping theorem that \widehat{\sigma}_{1_{\mathcal{D}_f}} = \sqrt{\widehat{P}_f(1-\widehat{P}_f)} is a consistent estimator of \sigma_{1_{\mathcal{D}_f}}. According to Slutsky’s theorem, \sqrt{N}(\widehat{P}_f - P_f)/\widehat{\sigma}_{1_{\mathcal{D}_f}} also converges in distribution to \mathcal{N}(0,1), and Eq. (1.39) for the confidence interval was derived on this basis. Note that a confidence interval is itself random, as it is calculated from data. Moreover, an \alpha-confidence interval is one that contains the true value with probability \alpha.
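
As a rough illustration of how the plug-in standard deviation enters the confidence interval, here is a minimal Python sketch (not UQLab code, and not necessarily the exact form of Eq. (1.39)). The function name `mcs_confidence_interval` and the example counts are hypothetical; `alpha` denotes the confidence level, as in the paragraph above.

```python
import math
from statistics import NormalDist


def mcs_confidence_interval(n_fail, n_samples, alpha=0.95):
    """Asymptotic alpha-confidence interval for P_f from an MCS run.

    n_fail:    number of samples that fell in the failure domain
    n_samples: total number of samples N
    alpha:     confidence level (e.g. 0.95)
    """
    p_hat = n_fail / n_samples
    # Plug-in estimate of the single-trial (Bernoulli) standard deviation
    sigma_hat = math.sqrt(p_hat * (1.0 - p_hat))
    # Two-sided standard normal quantile for confidence level alpha
    z = NormalDist().inv_cdf(0.5 + alpha / 2.0)
    # The 1/sqrt(N) factor comes from the CLT scaling sqrt(N) * (p_hat - P_f)
    half_width = z * sigma_hat / math.sqrt(n_samples)
    return p_hat, p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Hypothetical run: 50 failures observed among 1000 samples
    p_hat, lower, upper = mcs_confidence_interval(50, 1000)
    print(f"P_f estimate = {p_hat:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

The division by \sqrt{N} appears only in the half-width, which is consistent with \widehat{\sigma}_{1_{\mathcal{D}_f}} being the standard deviation of a single trial rather than of the estimator itself.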

We will modify and update the manual in the next release. Thanks again for your feedback.

Best regards,
Xujia