ANCOVA Error: problem dimension undefined

Hi there,

I’m using UQ[py]Lab for sensitivity analysis. Below please find a summary of the problem statement:

  • Number of input variables: 30
  • Number of outputs: 1
  • Sample size: 7600
  • I have obtained the data for input variables and output, so I supplied this data as samples instead of evaluating the model response on the fly.
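
For reference, the sample layout described above can be checked before it is passed to any analysis. The following is only a minimal sketch: `check_samples` is a hypothetical helper (not part of UQ[py]Lab), and the random arrays are placeholders standing in for the real 7600 × 30 input data and the single output.

```python
import numpy as np

def check_samples(X, Y, n_inputs):
    """Return X as an (N, M) and Y as an (N, N_out) float array, or raise ValueError."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        # A single output stored as a flat vector becomes an (N, 1) column
        Y = Y.reshape(-1, 1)
    if X.ndim != 2 or X.shape[1] != n_inputs:
        raise ValueError(f"X must have shape (N, {n_inputs}), got {X.shape}")
    if Y.ndim != 2 or Y.shape[0] != X.shape[0]:
        raise ValueError(f"Y must have {X.shape[0]} rows, got shape {Y.shape}")
    if not (np.isfinite(X).all() and np.isfinite(Y).all()):
        raise ValueError("samples contain NaN or infinite values")
    return X, Y

# Placeholder data with the sizes from the problem statement (not the real samples)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(7600, 30))
Y_demo = X_demo.sum(axis=1)

X_checked, Y_checked = check_samples(X_demo, Y_demo, n_inputs=30)
print(X_checked.shape, Y_checked.shape)
```

Calling `.tolist()` on the checked arrays then gives nested Python lists of floats.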

As a preliminary check, I first tried the input/output correlation method. This worked successfully, and the results look ok.

I then wanted to try ANCOVA, because my input variables are dependent. Below is the code I used:

# set up uqpylab: -------
import numpy as np
from uqpylab import sessions

mySession = sessions.cloud(host=UQCloud_instance, token=myToken)  # user-specific host and token
mySession.timeout = uqpylab_timeout_new  # timeout in seconds
mySession.reset()
uq = mySession.cli
uq.rng(100, 'twister')

# set up options for SA: -------
ANCOVAOpts = {
    'Type': 'Sensitivity',
    'Method': 'ANCOVA'
}

ANCOVAOpts['ANCOVA'] = {
    'Sampling': 'LHS',   # default, there are other options
    'MCSamples': 10000,  # default, number of samples used for the Monte Carlo estimation of the ANCOVA indices
    'Samples': {
        'X': input_samples.tolist(), # should be N * M float
        'Y': output_samples          # should be N * N_out float
        }
    }

ANCOVAOpts['ANCOVA']['PCE'] = {
    'Degree': list(range(1,11)), # range of degrees to be tested, degree-adaptive 
    'TruncOptions': {'qNorm': np.arange(0.5,1.01,0.1).tolist()}, # q-norms to be tested, q-norm-adaptive
    'Method': 'LARS' # sparse PCE by LARS
    }

# run the sensitivity analysis:-------
myANCOVAAnalysis = uq.createAnalysis(ANCOVAOpts)

However, the code above kept throwing an error (please see the attached screenshot). I checked my code and searched the manuals, but I still couldn’t figure out why. Could anyone help me with this?

Thank you in advance!

I am attaching another screenshot that shows the ANCOVAOpts, in case this can be helpful. Thanks again.

Dear @Xin-Yue_Wang ,
from a quick look, it seems to me that you haven’t defined any input distribution/input object.

Without it, it is not possible to create a PCE, which is a necessary ingredient for ANCOVA.

Best regards,
Stefano
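
As an illustration of the point above, a minimal sketch of an input definition for 30 variables could look as follows. The standard Gaussian marginals and the variable names are placeholders chosen only for illustration, not the actual distributions of this problem, and the `uq.createInput` call is left commented out because it requires an active UQCloud session.

```python
M = 30  # number of input variables, taken from the problem statement

# Placeholder marginals: standard Gaussians are an assumption for illustration only
InputOpts = {
    'Marginals': [
        {'Name': f'X{i + 1}', 'Type': 'Gaussian', 'Parameters': [0.0, 1.0]}
        for i in range(M)
    ]
}

# With an active session (uq = mySession.cli), the input object would be created with:
# myInput = uq.createInput(InputOpts)

print(len(InputOpts['Marginals']))
```

The number of marginals in the input object is what fixes the dimension of the problem, which the PCE needs in order to build its polynomial basis.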

Hi @ste,

Thank you for taking the time to look into this. You are spot on! I didn’t realize that I had to define the input distribution/object. I thought I didn’t need to define the input, since I don’t need to run a model to generate output samples.

Following your hints, I have added the code to define the input distribution/object. Based on this post by Dr. Sudret, I defined the input by performing statistical inference for the 30 input variables with 7600 samples. I used ‘auto’ for the marginals and specified a Gaussian copula. The inference was successful and pretty fast.
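
For readers following along, the inference set-up described above can be sketched roughly as below. This is not the exact code that was run: the field names (`Inference`, `Data`, `Marginals`, `Copula`) follow my reading of the UQLab statistical inference manual and should be checked against it, and the random array is a small stand-in for the real 7600 × 30 input samples.

```python
import numpy as np

# Stand-in for the real input samples (N x M); the real data set is 7600 x 30
rng = np.random.default_rng(100)
input_samples_demo = rng.normal(size=(200, 30))

InferOpts = {
    # Data used to infer the marginals and the copula parameters
    'Inference': {'Data': input_samples_demo.tolist()},
    # 'auto' asks the inference to select a family for each marginal
    'Marginals': [{'Type': 'auto'} for _ in range(input_samples_demo.shape[1])],
    # Gaussian copula family; its parameters are left to be inferred (assumed behaviour)
    'Copula': {'Type': 'Gaussian'},
}

# With an active session (uq = mySession.cli), the inferred input would be created with:
# myInput = uq.createInput(InferOpts)

print(len(InferOpts['Marginals']), InferOpts['Copula']['Type'])
```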

Then I moved on to run the code for ANCOVA. However, it was quite slow, and I got an error saying “Timeout reached”, even though I had already increased the timeout to 10 minutes.

Therefore, I made two code changes for the set-up of uqpylab:

# set up uqpylab: -------
mySession = sessions.cloud(host=UQCloud_instance, token=myToken, force_restart=True) # change 1: add force_restart to be safe
mySession.timeout = 30*60 # change 2: increase to 30 min
mySession.reset()
uq = mySession.cli
uq.rng(100, 'twister')

However, when I ran the code above to connect to uqpylab, it was much slower than before (I tried twice), although the connection eventually succeeded. Then I ran the code for statistical inference and ANCOVA, but I got another error (please see the attached screenshot). It seems that I’ve hit some size limit.

Based on the above, I have the following questions:

  1. Do you happen to know why the connection became slower? Was this caused by my code or by my earlier operations? Do you have any suggestions on how I can fix this?
  2. The latest error seems related to my PCE problem rather than to a technical issue with UQ[py]Lab. Have you experienced this before? My guess is that I have too many input variables (30), which leads to the large size, but I’m not sure.
  3. The previous error said ‘problem dimension undefined’, and it was resolved after I defined the input. However, the error message seems to say that only the problem dimension is required, while the marginals and copula (though we obtained them) are not necessary for PCE or ANCOVA. Could you confirm whether my understanding is correct? If they are used, I may have to set up the statistical inference options more carefully.
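
To put a rough number on the guess in question 2: a full total-degree polynomial basis in M variables up to degree p has C(M + p, p) terms. The sketch below only evaluates that formula for M = 30 and the degrees 1 to 10 requested in the PCE options; it is not a statement about how UQLab allocates memory internally. Hyperbolic truncation with a q-norm below 1 gives a smaller candidate basis, and the degree-adaptive scheme may stop before reaching the highest degree.

```python
from math import comb

def full_pce_basis_size(n_inputs, degree):
    """Number of multivariate polynomials of total degree <= degree in n_inputs variables."""
    return comb(n_inputs + degree, degree)

M = 30  # number of input variables
sizes = {p: full_pce_basis_size(M, p) for p in range(1, 11)}

for p, n_terms in sizes.items():
    print(f"degree {p:2d}: {n_terms:,} candidate terms")
```

With 7600 samples, every candidate term adds one column of 7600 entries to the regression matrix, so the upper end of the degree range combined with a q-norm of 1.0 could plausibly exceed a size limit.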

When you have a moment, could you please provide some pointers for my questions? I’d greatly appreciate it.

Thank you very much for your help!

Sincerely,
Xinyue