# Infer Non-parametric Marginals and Copula

Dear UQLab

First and foremost, I really appreciate the sustained support of the UQLab team. In particular, I thank @damarginal and @torree for their helpful comments on my previous questions about copulas.

I have two questions:

1- According to UserManual_Inference (Sec. 1.3.1), UQLab can infer marginals with a nonparametric approach (i.e., KDE), but the manual also warns about the risk of overfitting.
I would like to know the pros and cons of nonparametric approaches. In other words, is this option considered a reliable way to infer marginals? Why or why not?
2- Besides, is it possible to infer a “non-parametric copula” in UQLab? How?

Best regards
Ali

Dear @ali,

1. This is a difficult one: the answer depends very much on the context (the type and amount of data samples, for instance) and can’t be given in general. I’ll nevertheless try to give some general guidance, without knowing further details of your problem.
In general, I recommend spending a bit of time looking at your data, one dimension at a time. For each marginal, have a look at the empirical distribution (plot a histogram): does it resemble a classical distribution? Then give that distribution family a try, and check the quality of the fit. Does it instead have a weird shape? Then a non-parametric fit will probably serve you best. You can of course fit many distribution families and choose among them based on their AIC or BIC scores. This is, in the end, what the automated fitting of UQLab does. That said, keep in mind that no fitted distribution, parametric or not, will ever be a perfect representation of your data. So maybe first ask yourself: are you most interested in the bulk of the distribution, or in capturing the tails (e.g. for rare event estimation)? In the latter case, you may want to limit your selection to marginal distributions with heavy tails, such as the t or Gumbel distributions. If you are unsure which ones they are, check out the Input manual.
2. Non-parametric copulas are not supported in UQLab just yet. While the UQLab team is working on it, I can say that the parametric families already supported cover most practical cases. Vine copulas in particular are extremely flexible, and if you have a problem in dimension 3 or larger, you’ll want to give them a try. Again, automated fitting in UQLab is based on AIC/BIC scores. Plotting the bivariate data as shown in the Input manual may also help. And again, give some thought to whether you care more about the overall distribution or about the tails.
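
To make the AIC/BIC comparison mentioned in both points concrete, here is a minimal sketch in plain Python (standard library only). It is not UQLab code and does not reproduce UQLab's automated fitting: the synthetic data set, the two candidate families, and all function names are assumptions chosen for illustration.

```python
import math
import random


def normal_loglik(data):
    """Log-likelihood of data under a normal fitted by maximum likelihood."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE variance (divides by n)
    # At the MLE the quadratic term sums to n/2.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(data):
    """Log-likelihood of positive data under a lognormal fitted by maximum likelihood."""
    logs = [math.log(x) for x in data]
    # log f(x) = log N(ln x; mu, sigma) - ln x
    return normal_loglik(logs) - sum(logs)


def aic(loglik, k):
    """Akaike information criterion for k fitted parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k fitted parameters and n samples."""
    return k * math.log(n) - 2 * loglik


# Hypothetical data: 500 samples from a lognormal distribution.
random.seed(0)
data = [random.lognormvariate(0.0, 0.75) for _ in range(500)]
n = len(data)

logliks = {"normal": normal_loglik(data), "lognormal": lognormal_loglik(data)}
scores = {name: (aic(ll, 2), bic(ll, 2, n)) for name, ll in logliks.items()}
best = min(scores, key=lambda name: scores[name][0])  # lower AIC is better

for name, (a, b) in scores.items():
    print(f"{name:10s} AIC = {a:9.2f}  BIC = {b:9.2f}")
print("lowest AIC:", best)
```

Both criteria reward likelihood and penalize the number of parameters, so they rank the overall fit; they say little about how well a family reproduces the tails specifically.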

Happy UQLab,
Emiliano

Dear @torree

First and foremost, I really appreciate your detailed reply, in particular your remarks on rare event estimation. Could you also give me your guidance on the questions below?

1- As you are well aware, choosing specific marginals could help us capture the “tail dependency”. Assume a hypothetical situation in which UQLab proposes a lognormal distribution (based on its GOF tests). The challenge begins here:
Is it acceptable to use a marginal such as Gumbel to investigate the tail dependency, even though it has a worse AIC or BIC than the lognormal distribution proposed by UQLab?
In short, can our purpose (capturing the tail dependency) override the GOF results?
2- In the context of nonparametric marginals, suppose UQLab suggests a parametric distribution (e.g. Logistic) that shows a good GOF. If I instead use the nonparametric marginal inference option in UQLab (iOpts.Marginals(i).Type = 'ks'), is the resulting probability of failure (Pf) reliable or not?
In other words, if the two approaches (parametric and nonparametric) give two different Pf values in the reliability analysis, which one is correct?
(Assume the parametric marginal inference satisfies the GOF tests.)
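
The situation in question 2 can be illustrated with a small sketch in plain Python (standard library only). It is not UQLab code and does not use the 'ks' option: it assumes a single variable, a hypothetical failure event X > threshold, synthetic normal data, and a Gaussian kernel with a rule-of-thumb bandwidth.

```python
import math
import random


def normal_sf(z):
    """Standard normal survival function 1 - Phi(z), accurate far in the tail."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical data: 200 samples of one input variable.
random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
n = len(sample)

mu = sum(sample) / n
sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / (n - 1))

threshold = 4.0  # assumed failure event: X > threshold

# Parametric estimate: tail probability of the fitted normal distribution.
pf_parametric = normal_sf((threshold - mu) / sigma)

# Nonparametric estimate: tail probability of a Gaussian kernel density estimate.
h = 1.06 * sigma * n ** (-0.2)  # rule-of-thumb bandwidth for a Gaussian kernel
pf_kde = sum(normal_sf((threshold - x) / h) for x in sample) / n

# Raw empirical estimate: fraction of samples beyond the threshold.
pf_empirical = sum(x > threshold for x in sample) / n

print(f"parametric Pf: {pf_parametric:.3e}")
print(f"KDE Pf:        {pf_kde:.3e}")
print(f"empirical Pf:  {pf_empirical:.3e}")
```

Beyond the largest observed sample, the KDE tail is shaped by the kernel and the bandwidth rather than by the data, whereas the parametric tail is shaped by the chosen family, so the two Pf values can differ substantially even when both fits describe the bulk of the data well.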

Last but not least, thank you for the enjoyable, state-of-the-art UQLab platform.
Best regards
Ali