I’m trying to apply kriging to a finite element model of a simply supported plate subjected to a pressure of 1 Pa. I’m interested in the sound transmission loss of this plate, computed with the Actran software. I have three uncertain parameters: the Young’s modulus, Poisson’s ratio, and density of the plate. Unfortunately, I encounter several problems:

I notice that the larger I make the experimental design, the more the error diverges.

The validation error is very small compared with the LOO error, which remains much larger. I use the same experimental design for both training and validation.

For example, for a design of size N = 5 simulations, I obtain:
Kriging metamodel validation error: 5.6198e-20
Kriging metamodel LOO error: 2.3841e-01
This N = 5 configuration gives me the lowest LOO error.
Thank you for your help.
Best regards,
Soraya

Could you please provide more details about the divergent behavior?

The validation error is generally used to assess the performance of the surrogate on unseen data. As a result, we should use a different data set to evaluate this error. In other words, we should separate the available data into two sets: training and validation. We use the training set to build the surrogate and employ the validation set to assess the model performance. Such a procedure is known as hold-out validation. If only a few data are available, we could use cross-validation, where the leave-one-out (LOO) error is a special case.
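A minimal sketch of the hold-out procedure described above, on synthetic data (the test function and the 70/30 split are assumptions; a simple polynomial fit stands in for the kriging surrogate):

```python
import numpy as np

# Hypothetical 1-D test function standing in for the Actran simulation.
def model(x):
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=100)
Y = model(X)

# Hold-out validation: split the available data into disjoint
# training and validation sets.
idx = rng.permutation(len(X))
train, val = idx[:70], idx[70:]

# Build the surrogate on the training set only (a polynomial fit is used
# here as a placeholder for kriging) ...
coeffs = np.polyfit(X[train], Y[train], deg=5)

# ... and assess it on points the surrogate has never seen.
val_error = np.mean((np.polyval(coeffs, X[val]) - Y[val]) ** 2) / np.var(Y[val])
print(f"hold-out validation error: {val_error:.3e}")
```

The key point is that the validation points never enter the fit, so the error measures performance on unseen data rather than the surrogate's ability to reproduce its own training set.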

Kriging is an “interpolator”, meaning that the model passes exactly through the training points. As a result, if we use the same data set for both training and validation, the validation error is 0. The small error 5.6198\times 10^{-20} you observed comes from the nugget effect in kriging, which is added to avoid numerical instabilities (please refer to Section 1.4.4 of the Kriging manual for more details).
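This interpolation property can be seen in a few lines of plain NumPy (a zero-mean kriging predictor on toy 1-D data; the test function, squared-exponential kernel, length scale, and nugget value are all illustrative assumptions):

```python
import numpy as np

def rbf(a, b, ell=0.15):
    # Squared-exponential correlation, a common kriging kernel.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

x = np.linspace(0.0, 1.0, 6)
y = np.sin(6 * x)

nugget = 1e-10  # small diagonal term added for numerical stability
K = rbf(x, x) + nugget * np.eye(len(x))
weights = np.linalg.solve(K, y)

# Predicting back at the training points: kriging interpolates, so the
# error vanishes up to the nugget.
y_hat = rbf(x, x) @ weights
err = np.mean((y_hat - y) ** 2) / np.var(y)
print(f"relative error at the training points: {err:.3e}")
```

Setting the nugget to exactly zero would make the error exactly zero in exact arithmetic, but the covariance matrix then risks being numerically singular, which is precisely why the small regularization term is added.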

The validation error is only an estimate of the overall performance, and thus it varies if we use a different data set. If you want to compare two surrogate models, it is necessary to use the same validation set.

I think N = 5 points are too few to obtain a robust surrogate, or a reliable error estimate from LOO.
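For reference, the LOO procedure refits the surrogate N times, each time predicting the single point that was left out. A sketch with a toy zero-mean kriging predictor (assumed kernel, data, and nugget, not the plate model):

```python
import numpy as np

def rbf(a, b, ell=0.15):
    # Squared-exponential correlation kernel (an assumption).
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def kriging_predict(x_train, y_train, x_new, nugget=1e-10):
    # Simple (zero-mean) kriging predictor with a stabilizing nugget.
    K = rbf(x_train, x_train) + nugget * np.eye(len(x_train))
    return rbf(x_new, x_train) @ np.linalg.solve(K, y_train)

def loo_error(x, y):
    # Leave-one-out: refit N times, each time predicting the held-out point.
    preds = np.empty_like(y)
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        preds[i] = kriging_predict(x[mask], y[mask], x[i:i + 1])[0]
    return np.mean((preds - y) ** 2) / np.var(y)

x = np.linspace(0.0, 1.0, 6)
y = np.sin(6 * x)
print(f"LOO error with N = {len(x)}: {loo_error(x, y):.3e}")
```

With very small N, each refit removes a substantial fraction of the information, so the LOO error is both large and highly sensitive to the particular design, which is why it is unreliable as a model-selection criterion at N = 5.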

When I talk about “divergence”, I’m referring to the error that increases rather than decreases with increasing sample size.

I made another experimental design with N = 200 simulations and a separate one with N = 1000 simulations for the validation design. Both designs were made with Latin hypercube sampling, but I still don’t get a low error. I have e-mailed you a zipped folder containing the MATLAB file together with the experimental and validation design files.

Thanks for sharing the data. I think there are a few issues:

Among these 200 training points and 1000 validation points, many duplicated input values can be found. More precisely, there are only 60 distinct points in the training set and 64 distinct points in the validation set; e.g., the input value (2.78\times 10^{9}, 0.342, 1470) is repeated 23 times. However, the associated output values are NOT the same: among the 23 model runs, the output values vary within (80.29, 85.18), with a variance of 1.82, close to the overall variance of 2.26 of the output Y. Is your simulation stochastic, meaning that there is additional randomness in the model (please read this post)? If so, we CANNOT predict the exact output value for a given input without access to the latent variables. If the simulator is not stochastic, meaning that a given set of inputs should yield a unique output value, please check and correct the computational model (and please do not round the values).
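A quick way to run such a duplicate check with NumPy (the three rows below are hypothetical stand-ins for the (E, ν, ρ) design; a real check would load the design files instead):

```python
import numpy as np

# Hypothetical stand-in for the training inputs (E, nu, rho).
X = np.array([
    [2.78e9, 0.342, 1470.0],
    [2.78e9, 0.342, 1470.0],   # duplicated input row
    [3.10e9, 0.300, 1500.0],
])
Y = np.array([80.29, 85.18, 83.00])

# Group identical input rows and inspect the spread of their outputs.
uniq, inv, counts = np.unique(X, axis=0, return_inverse=True,
                              return_counts=True)
inv = inv.ravel()
print(f"{len(uniq)} distinct points out of {len(X)}")
for k in np.flatnonzero(counts > 1):
    outs = Y[inv == k]
    # A non-zero variance here means the simulator returned different
    # outputs for the same input, i.e. it behaves stochastically.
    print(f"input {uniq[k]} repeated {counts[k]}x, "
          f"output variance {outs.var():.3f}")
```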

The experimental design was created on a regular grid, which does not reflect the input distribution. To increase the accuracy of the surrogate, I would strongly recommend a more advanced strategy such as Latin hypercube sampling, e.g., X = uq_getSample(myInput, 200, 'LHS');.
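If UQLab is not at hand, the stratification idea behind Latin hypercube sampling can be sketched in plain NumPy (the parameter bounds below are hypothetical, not the actual input ranges of the plate model):

```python
import numpy as np

def latin_hypercube(n, d, rng):
    # One point per stratum in each dimension, independently permuted:
    # every 1-D projection covers all n equal-probability bins.
    strata = rng.permuted(np.tile(np.arange(n), (d, 1)), axis=1).T
    return (strata + rng.uniform(size=(n, d))) / n

rng = np.random.default_rng(42)
U = latin_hypercube(200, 3, rng)

# Map the unit hypercube to hypothetical physical ranges for
# (E [Pa], Poisson's ratio, density [kg/m^3]).
lower = np.array([2.0e9, 0.30, 1400.0])
upper = np.array([4.0e9, 0.40, 1600.0])
X = lower + U * (upper - lower)

# Check the stratification: each dimension has exactly one sample per bin.
bins = np.floor(U * 200).astype(int)
assert all(len(np.unique(bins[:, j])) == 200 for j in range(3))
```

Unlike a regular grid, every one-dimensional projection of an LHS design is evenly spread, so no marginal direction is under-sampled.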

Best,
Xujia

P.S. If you do not mind, please share your dataset on UQWorld so that other experienced users could also share some ideas here.

Thank you for your help. I have found my error: it was a post-processing problem, and now it is working. I would like to share my data set, but I cannot because I am a new user. It says: “Sorry, new users can not upload attachments”.