Increase sampling points using previous results

Hi all,

I have a model (6 inputs) and I did a PCE analysis using 3000 random points (LARS, LHS), which took 2 days. I want to increase the number of sampling points, but, considering the computational time, I’d also like to reuse my previous results. Do you know how I can use my results for a second analysis with more points?

Thanks in advance

Hi @sebacastroh ,

I assume those two days were spent computing the full computational model on the 3'000 points, so it indeed makes sense to reuse them.

Suppose you have the LHS design of 3'000 sample points stored in the variable X and the INPUT object used to create the sample stored in the variable myInput. You can add additional LHS points to the design using the function uq_enrichLHS as follows:

Xn = uq_enrichLHS(X, Nn, myInput)

where Nn is the number of additional points you want to add to the design. The function returns only the new points. You can then evaluate the full computational model on these new points (using uq_evalModel to get Yn). After concatenating the new points (both Xn and Yn) to the original design, you can recompute the PCE model using the existing data as the experimental design; see the sketch below.
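A minimal sketch of the whole workflow (variable names such as Nn, Y, myModel and the MetaOpts fields are illustrative; it assumes UQLab has been initialized and that myInput and myModel already exist):

Nn = 2000;                          % number of additional points (example value)
Xn = uq_enrichLHS(X, Nn, myInput);  % new LHS points only
Yn = uq_evalModel(myModel, Xn);     % evaluate the full model on the new points

X = [X; Xn];                        % concatenate inputs
Y = [Y; Yn];                        % concatenate model responses

MetaOpts.Type = 'Metamodel';
MetaOpts.MetaType = 'PCE';
MetaOpts.Method = 'LARS';           % sparse PCE with LARS
MetaOpts.Input = myInput;
MetaOpts.ExpDesign.X = X;           % reuse the existing data as experimental design
MetaOpts.ExpDesign.Y = Y;
myPCE = uq_createModel(MetaOpts);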

uq_enrichLHS basically tries to find new points such that the enriched design is still an LHS. This is not always possible, and the resulting design will most probably be a pseudo-LHS. If you then try to enrich the already enriched design again, you might encounter the following error:

X = [X; Xn];  % concatenate the enrichment points to get the enriched design
Xn = uq_enrichLHS(X, 2, myInput)
Error using uq_enrichLHS (line 88)
The initial sample set does not form a Latin Hypercube! For eniriching such set in a pseudo-LHS fashion use uq_LHSify function instead.

In this case, as suggested by the error message, the function uq_LHSify can be used in place of uq_enrichLHS.
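For example (I am assuming a calling pattern analogous to uq_enrichLHS here; please check help uq_LHSify in your UQLab version for the exact signature):

Xn2 = uq_LHSify(X, 2, myInput);  % enrich the pseudo-LHS design X with 2 new points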

I hope this helps!


Dear Sebastian

@damarginal replied already about the enrichment strategies.

I’d like to comment on your setup, i.e. 3,000 sample points for a 6-dimensional problem. Based on our (long, long) experience with PCE, this should already be more than enough to get a good PCE. Actually, for d = 6 dimensions, I would have started with 200 sample points! Did you use ‘LARS’ as a solver to get a sparse PCE? Did you check the error estimators provided by UQLab (leave-one-out error)?
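For reference, with a PCE built using MetaOpts.Method = 'LARS' as in the sketch above, the leave-one-out estimator can be inspected like this (for a vector-valued output, I believe Error has one entry per output component, but please verify this against the UQLab PCE manual):

uq_print(myPCE)              % summary report, including the LOO error
errLOO = [myPCE.Error.LOO];  % LOO error estimate(s), one per output component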

If the “best usage” of PCE does not provide you with an accurate surrogate (with an experimental design of size N = 3,000), then increasing it will probably not help much. There may be complex mechanisms that require tricks such as a change of variables (e.g. taking the log of the output), etc.; see the sketch below.
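As a rough illustration of the change-of-variables trick (a sketch only: it assumes the output is strictly positive so the log is defined, and Xval is a hypothetical set of points at which you want predictions):

MetaOptsLog = MetaOpts;                      % same PCE options as before
MetaOptsLog.ExpDesign.Y = log(Y);            % build the PCE on the log of the output
myPCElog = uq_createModel(MetaOptsLog);

Ypred = exp(uq_evalModel(myPCElog, Xval));   % transform predictions back to the original scale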

Best regards
Bruno


Hi @damarginal, thank you for your response, it is exactly what I need.

Dear @bsudret, thanks for your comments, and I understand your concerns. I’d like to add that the problem I’m working on is highly non-linear. My team and I developed a seismic risk model for a transportation network, and we are interested in the UQ and SA with respect to these six parameters. So we are analyzing the resulting rates of exceedance for several travel times (from 2 to 10 hours, 1001 points). We have noticed that the leave-one-out error for travel times larger than 3 hours is lower than 2%, but between 2 and 3 hours this error is around 6%. While it makes sense to me that the error at short travel times is larger than at long travel times, I’m not sure whether this error is too big or whether it is considered acceptable. So my plan was to decrease this error by increasing the number of points, while figuring out what an acceptable error is. Maybe you can help me with this question, or suggest some literature to check.

Thanks again to both of you for your help,

Kind regards