The inferred marginal types 'ks' and 'Rayleigh' cannot be used in the probabilistic prior model?

felix · June 8, 2022, 7:12pm

Dear UQ-community,
@xujia
Thank you very much for your outstanding contribution!
Recently, I ran into the following problem when inferring the type of marginal distribution for an unknown data set.

Index | Name | Type    |  Parameters                                                     | Moments              
-----------------------------------------------------------------------------------------------------------------
1     | X1   | ks      |  2.907e+01, 6.122e+01, 6.122e+01, 9.720e+01, 5.347e+01, 1.028e+02 | 6.750e+01, 2.786e+01
2     | X2   | Uniform |  5.684e+00, 1.085e+01                                           | 8.264e+00, 1.490e+00
3     | X3   | Laplace |  4.965e-02, 6.950e-04                                           | 4.965e-02, 9.829e-04

**Error: the specified input does not seem to be either a string nor a recognized object!

Error in uq_PCE_initialize (line 83)
error('Error: the specified input does not seem to be either a string nor a recognized object!')**

Index | Name | Type     |  Parameters             | Moments              
--------------------------------------------------------------------------
1     | X1   | Rayleigh |  5.100e+01              | 6.392e+01, 3.341e+01
2     | X2   | Uniform  |  5.684e+00, 1.085e+01   | 8.264e+00, 1.490e+00
3     | X3   | Laplace  |  4.965e-02, 6.950e-04   | 4.965e-02, 9.829e-04

**Calculation of parameters from moments is not defined for marginal type: Rayleigh !

Error in uq_MarginalFields (line 49)
error('Calculation of parameters from moments is not defined for marginal type: %s!',
…
**

xujia · June 9, 2022, 7:45am

Hi @felix,

Thanks for asking here. I cannot reproduce the reported problem. Could you please provide more details of the code that triggers the error?

Xujia

felix · June 9, 2022, 8:25am

Cang.zip (3.2 KB)
Dear Xujia,
@xujia, I am sorry that my earlier description was vague.

First, I inferred the marginal distributions of the parameters following uq_Example_Input_06_inferMarginals:
(InputOpts.Marginals.Type = 'auto';)

Index | Name | Type     |  Parameters             | Moments              
--------------------------------------------------------------------------
1     | X1   | Rayleigh |  5.100e+01              | 6.392e+01, 3.341e+01
2     | X2   | Uniform  |  5.684e+00, 1.085e+01   | 8.264e+00, 1.490e+00
3     | X3   | Laplace  |  4.965e-02, 6.950e-04   | 4.965e-02, 9.829e-04

Then I entered the inferred marginal distributions as prior knowledge in a probabilistic input model:

InputOpts.Marginals(1).Name = 'X1';
InputOpts.Marginals(1).Type = 'Rayleigh '; %Uniform\Gaussian\Lognormal\Gumbel
InputOpts.Marginals(1).Moments = [ 6.392e+01, 3.341e+01];

InputOpts.Marginals(2).Name = 'X2';
InputOpts.Marginals(2).Type = 'Uniform' ;
InputOpts.Marginals(2).Moments = [8.264e+00, 1.490e+00];

InputOpts.Marginals(3).Name = 'X3';
InputOpts.Marginals(3).Type = 'Laplace';
InputOpts.Marginals(3).Moments = [4.965e-02, 9.829e-04];

Finally, I built the PCE model on this input, but it reports the following error:
**Calculation of parameters from moments is not defined for marginal type: Rayleigh !

Error in uq_MarginalFields (line 49)
error('Calculation of parameters from moments is not defined for marginal type: %s!',
…
**

Similarly, I inferred the marginals with another method (InputOpts.Marginals.Type = 'ks';):

Index | Name | Type    |  Parameters                                                     | Moments              
-----------------------------------------------------------------------------------------------------------------
1     | X1   | ks      |  2.907e+01, 6.122e+01, 6.122e+01, 9.720e+01, 5.347e+01, 1.028e+02 | 6.750e+01, 2.786e+01
2     | X2   | Uniform |  5.684e+00, 1.085e+01                                           | 8.264e+00, 1.490e+00
3     | X3   | Laplace |  4.965e-02, 6.950e-04                                           | 4.965e-02, 9.829e-04

and a different error was reported:

**Error: the specified input does not seem to be either a string nor a recognized object!

Error in uq_PCE_initialize (line 83)
error('Error: the specified input does not seem to be either a string nor a recognized object!')**

Cang_PCE_05.m (7.0 KB)

I would like to express my sincere gratitude for any help you can give me!

Best,

xujia · June 13, 2022, 2:51pm

Dear @felix,

Thanks for the additional materials. When you infer the distributions with UQLab, an Input object is created, and you can pass it directly to the next stage for surrogate modeling (you do not need to define a new one). Regarding the reported errors, please find my answer below:

- The first issue comes from the trailing space in the Rayleigh type name: you should use InputOpts.Marginals(1).Type = 'Rayleigh' instead of InputOpts.Marginals(1).Type = 'Rayleigh ' (please remove the space after Rayleigh).
- Unfortunately, I cannot reproduce the second error when defining the first random variable by kernel smoothing, namely, InputOpts.Marginals.Type = 'ks'. Nevertheless, according to the error message, the function uq_PCE_initialize did not find a UQLab input object. To avoid any possible issues, you can use MetaOpts.Input = myInput to specify the input object explicitly.
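The two points above can be sketched as follows. This is only a minimal illustration, assuming UQLab is installed and on the MATLAB path; `X` (an N-by-3 input matrix), `Y` (an N-by-1 output vector), and the variable names `InputHat` and `myPCE` are placeholders, not names taken from the attached script.

```matlab
% Minimal sketch, not a definitive implementation.
% Assumes UQLab is on the path and that X (N-by-3) and Y (N-by-1)
% already hold your own data (placeholders here).
uqlab

% 1) Infer the marginals from data, one column per input variable
for ii = 1:3
    InputOpts.Marginals(ii).Type = 'auto';            % or 'ks'
    InputOpts.Marginals(ii).Inference.Data = X(:,ii);
end
InputOpts.Inference.Criterion = 'BIC';
InputHat = uq_createInput(InputOpts);                 % inferred input object
uq_print(InputHat);

% 2) Pass the inferred input object directly to the PCE options,
%    instead of retyping the marginals by hand
MetaOpts.Type = 'Metamodel';
MetaOpts.MetaType = 'PCE';
MetaOpts.Input = InputHat;                            % explicit input object
MetaOpts.ExpDesign.X = X;
MetaOpts.ExpDesign.Y = Y;
myPCE = uq_createModel(MetaOpts);
uq_print(myPCE);

% Optional sanity check in plain MATLAB (no UQLab calls):
% with the standard Rayleigh scale parameter sigma, the moments are
% mean = sigma*sqrt(pi/2) and std = sigma*sqrt((4 - pi)/2).
sigma = 5.100e+01;
rayleighMean = sigma*sqrt(pi/2);        % approx. 63.92, as in the table
rayleighStd  = sigma*sqrt((4 - pi)/2);  % approx. 33.41, as in the table
```

If the marginals are typed in by hand anyway, the type string must match exactly ('Rayleigh', with no trailing space). The sanity check suggests that the printed parameter 5.100e+01 and the printed moments describe the same distribution, so entering the parameter through the Parameters field rather than the moments may be a simpler route; whether the moment-based definition is available for a given type is best checked in the UQLab Input manual.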

Besides, after looking at your data, I realized that you have only 6 data points. If these data were used in the inference, I think they are far too few for a robust estimation. In this case, I suggest generating/collecting more samples or defining the input distribution based on some prior knowledge. Moreover, 6 data points are not enough to construct an accurate surrogate model, and 2 validation points are too few to assess the generalization error of the surrogate.

I hope this helps.

Xujia

felix · June 13, 2022, 9:59pm

Dear Xujia,
Thank you very much for your reply, it helped me tremendously.
Following your suggestion, I commented out the probabilistic input model block and replaced it directly with the marginal inference code, and the result is satisfactory!
In addition, I added 7 data sets to the original 8, as listed below. The five rows in bold are used as the validation set, and a total of 15 data sets were used to construct the PCE model.

Judging from my previous work, this amount of data should be sufficient, but the results are very confusing to me.
[dataset1]

a	b	c		lc
29.070 	6.329 	0.04997		4
61.220 	7.504 	0.04849		10
61.220 	7.847 	0.04984		10
97.200 	9.480 	0.04953		44
53.470 	7.376 	0.04945		8
102.800 	10.200 	0.05183		52
**43.070 	7.009 	0.05012		10**
**99.090 	9.729 	0.05085		50**

[dataset2]

a	b	c		lc
88.560 	7.725 	0.05188		18
94.670 	6.345 	0.04388		10
82.360 	4.940 	0.03764		8
104.900 	6.092 	0.04184		56
**119.500 	7.637 	0.04544		78**
**102.700 	7.369 	0.04669		40**
**101.500 	6.400 	0.04187		24**

The accuracy of the PCE model built from [dataset1] is shown below. Its reliability is questionable, however, because there are too few data points: the PCE predictions differ considerably from the validation values, even though the validation error is small.

   Leave-one-out error:          3.5505929e-02
   Modified leave-one-out error: 1.3426940e-01
   Validation error:             1.3059044e-02

When I built the PCE from [dataset1] and [dataset2] together, the accuracy dropped significantly.

   Leave-one-out error:          3.3504323e-01
   Modified leave-one-out error: 7.9010776e-01
   Validation error:             6.1208222e-01

These data sets come from specific finite element analyses; therefore, the marginal types of the input parameters can only be obtained by inference.

InputOpts.Marginals.Type = 'auto';  %kernel smoothing (ks)\auto
InputOpts.Inference.Criterion = 'BIC';  %KS\BIC

InputOpts.Marginals(1).Inference.Data = X(:,1);
InputOpts.Marginals(2).Inference.Data = X(:,2);
InputOpts.Marginals(3).Inference.Data = X(:,3);

InputHat = uq_createInput(InputOpts);
uq_print(InputHat);

I can’t figure out why the accuracy of the PCE model drops so much when the sample size is increased.

For my part, I have considered the following possible reasons:

- Poor selection of the validation set
- The sample size is still insufficient (inferring the marginal types requires more samples than working with known marginal types)
- The data itself is not suitable for constructing surrogate models

I would appreciate any help!

Best,

xujia · June 15, 2022, 7:24pm

Hi @felix,

I think the sizes of the training and validation sets are too small to draw conclusions. Besides, is your model deterministic, i.e., do the same input parameters always give the same output value?

Best,
Xujia

felix · June 17, 2022, 8:46am

Hi,
@xujia
Have a nice day!

Model inputs and outputs do have a one-to-one correspondence.

I’ll try to re-model when I’ve processed all the data, but due to the complexity of the numerical simulation, I can only construct up to 27 data sets, and I’m not sure if I can build a PCE model with sufficient accuracy.

In addition, suppose the numerical simulations use different models and loading conditions, but the core problem of the study is the same and the inputs and outputs are all obtained by post-processing the numerical results. Can such cases be combined to build a single DD-PCE?

Or do I have to construct multiple PCE models separately? If so, each model would suffer from insufficient data.

Many thanks! I hope you have a nice weekend.

Best,