Dear @nluethen,
First of all, I would like to express my sincere appreciation to you and your team.
There is no doubt that the PCE model is a very powerful tool for modeling based on small samples.
Recently, I ran into a confusing result while using a dataset with unknown marginal types to construct a PCE model.
Based on finite element analysis, I obtained two different datasets, as follows.
In each dataset, the first three columns are the inputs and the fourth column is the output. The five rows in bold are used as the validation set.
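dataset1

| a | b | c | lc |
|---|---|---|---|
| 29.070 | 6.329 | 0.04997 | 4 |
| 61.220 | 7.504 | 0.04849 | 10 |
| 61.220 | 7.847 | 0.04984 | 10 |
| 97.200 | 9.480 | 0.04953 | 44 |
| 53.470 | 7.376 | 0.04945 | 8 |
| 102.800 | 10.200 | 0.05183 | 52 |
| **43.070** | **7.009** | **0.05012** | **10** |
| **99.090** | **9.729** | **0.05085** | **50** |

dataset2

| a | b | c | lc |
|---|---|---|---|
| 88.560 | 7.725 | 0.05188 | 18 |
| 94.670 | 6.345 | 0.04388 | 10 |
| 82.360 | 4.940 | 0.03764 | 8 |
| 104.900 | 6.092 | 0.04184 | 56 |
| **119.500** | **7.637** | **0.04544** | **78** |
| **102.700** | **7.369** | **0.04669** | **40** |
| **101.500** | **6.400** | **0.04187** | **24** |
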
The accuracy of the PCE model built from [dataset1] is shown below. I doubt its reliability because there are too few data points: the PCE predictions differ considerably from the validation values, even though the validation error itself is small.
Leave-one-out error: 3.5505929e-02
Modified leave-one-out error: 1.3426940e-01
Validation error: 1.3059044e-02
When I constructed the PCE using [dataset1 and 2] together, the accuracy dropped significantly:
Leave-one-out error: 3.3504323e-01
Modified leave-one-out error: 7.9010776e-01
Validation error: 6.1208222e-01
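
As a rough sanity check on these numbers, I tried to back out the typical predierror implied by each validation error. This assumes the reported validation error is the normalized mean-square error below, which is my reading of the UQLab PCE manual and not something I have checked in the code. Here $y_i$ are the validation outputs, $\hat{y}_i$ the PCE predictions, and $\bar{y}$ the mean of the validation outputs:

$$
\varepsilon_{val} = \frac{N_{val}-1}{N_{val}} \, \frac{\sum_{i=1}^{N_{val}} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N_{val}} \left( y_i - \bar{y} \right)^2}
$$

For dataset1 alone, assuming the validation set is its two bold rows ($lc = 10$ and $lc = 50$), $\bar{y} = 30$ and the denominator is $20^2 + 20^2 = 800$:

$$
\sum_{i} \left( y_i - \hat{y}_i \right)^2 \approx 1.3059 \times 10^{-2} \times \frac{2}{1} \times 800 \approx 20.9, \qquad \sqrt{20.9 / 2} \approx 3.2
$$

For both datasets, assuming all five bold rows are used ($lc = 10, 50, 78, 40, 24$), $\bar{y} = 40.4$ and the denominator is $2699.2$:

$$
\sum_{i} \left( y_i - \hat{y}_i \right)^2 \approx 6.1208 \times 10^{-1} \times \frac{5}{4} \times 2699.2 \approx 2065, \qquad \sqrt{2065 / 5} \approx 20.3
$$

If this definition is right, a root-mean-square error of about 3.2 is small relative to the spread of the two validation outputs, yet it is roughly 30% of the smaller one ($lc = 10$). That would explain why the first validation error looks small while the predictions still look far off. With only two or three validation points per dataset, the denominator is also very sensitive to which rows are chosen.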
I tried the following four combinations to infer the marginal types of the input parameters, but the results were not good.
InputOpts.Marginals.Type = 'auto'; % marginal type: tried 'ks' (kernel smoothing) and 'auto'
InputOpts.Inference.Criterion = 'BIC'; % selection criterion: tried 'KS' and 'BIC'
InputOpts.Marginals(1).Inference.Data = X(:,1);
InputOpts.Marginals(2).Inference.Data = X(:,2);
InputOpts.Marginals(3).Inference.Data = X(:,3);
InputHat = uq_createInput(InputOpts);
uq_print(InputHat);
I can’t figure out why the accuracy of the PCE model drops so much when the sample size is increased.
This has never happened in my previous work, where the accuracy of the PCE model generally improved as the size of the experimental design increased.
I have thought of the following possible reasons:
- Poor selection of validation set
- The sample size is still insufficient (inferring the marginal types requires more samples than when the marginals are known)
- The data itself is not suitable for constructing surrogate models
An explanation of this behavior would give me a deeper understanding of the PCE model.
I would appreciate any reply!
Many thanks,
Best regards