Dear @aokoro
Please find below some hints on your various questions:
- The size of the initial experimental design is often chosen as n_{init} = \max(2\cdot M, 10), where M is the number of input parameters, i.e. twice the dimensionality of the problem, but at least 10 points (see the first sketch after this list).
- For the number of clusters, see the other post here.
- Active learning involves different ingredients: the type of surrogate, the reliability solver, the enrichment criterion and a stopping criterion, so there is no single answer. The point, however, is to get results within a few hundred runs of the computational model, whereas simulation methods (even advanced ones, such as (sequential) importance sampling, line sampling or subset simulation) usually require \mathcal{O}(10^{4}) runs or more. If your problem is likely to have a small failure probability (<10^{-5}), you definitely need to use subset simulation together with the surrogate (a skeleton of such an active-learning loop is sketched after this list).
- PC-Kriging generally shows better accuracy than (ordinary) Kriging for small experimental designs or high-dimensional problems (M>50).
We used PC-Kriging together with subset simulation to solve the TNO blind benchmark (details here), with the results discussed here. This approach gave us the most efficient/accurate results (out of 9 participants) on 24 out of 27 component- and system-reliability problems (this assessment comes from the benchmark authors at TNO).
We’re currently finalizing a paper with @moustapha and @ste in which another benchmark of 20 problems and 41 methods is carried out. More soon!
- Dependent variables are handled as usual through an isoprobabilistic transform (e.g. a Nataf transform), which maps them to independent variables and thus allows the construction of orthogonal polynomials (see the last sketch below).
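
To make the rule of thumb for the initial design concrete, here is a small Python sketch (plain scipy, not UQLab code); the dimension M = 3 and the input bounds are placeholders:

```python
import numpy as np
from scipy.stats import qmc

M = 3                       # number of input parameters (placeholder)
n_init = max(2 * M, 10)     # rule of thumb: twice the dimension, at least 10

# Latin hypercube design in [0, 1]^M, then rescaled to (hypothetical) bounds
sampler = qmc.LatinHypercube(d=M, seed=0)
U = sampler.random(n=n_init)                           # n_init x M in [0, 1]
X = qmc.scale(U, l_bounds=[-1] * M, u_bounds=[1] * M)  # placeholder bounds
print(X.shape)  # (10, 3)
```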
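Here is also a minimal skeleton of an active-learning reliability loop, just to illustrate the four ingredients. It assumes a scikit-learn Gaussian process as surrogate, the classical U function as enrichment criterion and crude Monte Carlo as reliability solver; the limit state, kernel and thresholds are toy placeholders (for failure probabilities <10^{-5} you would replace the crude Monte Carlo step by subset simulation, as said above):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def limit_state(x):
    """Toy limit state: g(x) <= 0 defines failure (placeholder model)."""
    return 3.0 - np.sum(x**2, axis=1)

M = 2
n_init = max(2 * M, 10)

# Initial experimental design (standard normal inputs here)
X = rng.standard_normal((n_init, M))
y = limit_state(X)

# Monte Carlo population on which Pf is estimated and candidates are picked
X_mc = rng.standard_normal((10**4, M))

for it in range(50):
    # Ingredient 1: surrogate (Gaussian process / Kriging)
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    mu, sigma = gp.predict(X_mc, return_std=True)

    # Ingredient 2: reliability solver (crude MC on the surrogate;
    # use subset simulation instead when Pf is expected below ~1e-5)
    pf_hat = np.mean(mu <= 0)

    # Ingredient 3: enrichment criterion (U function, as in AK-MCS)
    U = np.abs(mu) / np.maximum(sigma, 1e-12)

    # Ingredient 4: stopping criterion (all MC points classified reliably)
    if U.min() >= 2.0:
        break

    # Evaluate the true model at the most ambiguous point and enrich
    x_new = X_mc[np.argmin(U)][None, :]
    X = np.vstack([X, x_new])
    y = np.append(y, limit_state(x_new))

print(f"Pf estimate: {pf_hat:.3e} after {len(y)} model runs")
```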
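Finally, a minimal sketch of a Nataf-type (Gaussian copula) isoprobabilistic transform for dependent inputs; the marginals and the copula correlation below are purely illustrative:

```python
import numpy as np
from scipy import stats

# Placeholder marginals and copula correlation (not from the post)
marginals = [stats.lognorm(s=0.25, scale=1.0),
             stats.gumbel_r(loc=0.0, scale=0.5)]
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])      # Gaussian-copula correlation
L = np.linalg.cholesky(R)

def to_standard_normal(X):
    """Map dependent physical inputs X to independent standard normals U."""
    # 1) marginal transform: x_i -> z_i = Phi^{-1}(F_i(x_i))
    Z = np.column_stack([stats.norm.ppf(m.cdf(X[:, i]))
                         for i, m in enumerate(marginals)])
    # 2) decorrelate the Gaussian vector: U = L^{-1} Z
    return np.linalg.solve(L, Z.T).T

def to_physical(U):
    """Inverse map: independent standard normals U -> physical inputs X."""
    Z = U @ L.T
    return np.column_stack([m.ppf(stats.norm.cdf(Z[:, i]))
                            for i, m in enumerate(marginals)])

# Orthogonal polynomials (e.g. Hermite) are then built in the U-space,
# where the components are independent by construction.
U = np.random.default_rng(0).standard_normal((5, 2))
print(np.allclose(to_standard_normal(to_physical(U)), U))  # True
```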
I hope this helps!
Best regards
Bruno