Save/load model for GP Kriging

Hi all,

I am fitting a GP model for Bayesian inversion. The model has 3 inputs but a huge output: 1000 time steps * 10 sensors (10,000 outputs). Fitting the model takes a really long time. Is there a handy way to save the GP model to a file and load it later, instead of refitting it every time I open Matlab? I am a Python guy, so pardon me if I am missing deep Matlab tricks. By the way, I tried save('GP.mat', myGP), but Matlab failed to save the file.

Also, if you know of any faster or more reliable surrogate in UQLab than GP/Kriging that handles huge output spaces better, feel free to suggest it! Fitting currently takes about 4 hours.

Thank you!
Majdi

Hi @radaideh,
Is the computation taking 4 hours to fit the GP to the whole output space (all 10,000 outputs)?
If yes, then I would say that is still a reasonable time, considering it is done in Matlab.
If not, then I suppose it takes 4 hours to compute a surrogate for each time step, meaning 4 hours times 1000 steps for the whole surrogate, which would indeed be a lot. I also assume your output matrix is Ns x Nout, where Ns is the number of sensors and Nout is the number of time steps.

Regarding saving to a file: Matlab can run into problems when saving large amounts of data or big structures. Try the -v7.3 option, which stores the data in an HDF5-based format and supports variables larger than 2 GB. Also note that save expects variable names as strings, not the variables themselves, which is why your call failed. This should work:

save('GP.mat', 'myGP', '-v7.3')
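
To reload it in a later session instead of refitting, a minimal sketch (assuming the surrogate lives in a variable called myGP, as in your post, and using UQLab's standard uq_evalModel call; Xnew is a placeholder for your new input points):

load('GP.mat')                   % restores myGP into the workspace
Ynew = uq_evalModel(myGP, Xnew)  % evaluate the restored surrogate as usual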

I believe the large number of time steps is the core issue here: developing a surrogate for so many outputs is computationally expensive. If you do not need the full time resolution of the process, you could coarsen your output, e.g., by keeping (or averaging) the data only every 10 time steps.
Alternatively, if you do need all of the time steps, I suggest computing a separate surrogate for each time step, saving it, and moving on to the next one iteratively; see the sketch below.
Still, coarsening your output is probably the best option for you.
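
As an illustration, here is a minimal sketch of both ideas (Y_raw, MetaOpts, and the file names are placeholders; the UQLab call is the standard uq_createModel interface):

% Coarsening: average the output over blocks of 10 time steps.
% Y_raw is assumed to be Nsamples x Nout, with Nout divisible by 10.
blockSize = 10;
[Nsamples, Nout] = size(Y_raw);
Y_coarse = squeeze(mean(reshape(Y_raw, Nsamples, blockSize, Nout/blockSize), 2));

% Per-time-step alternative: fit and save one surrogate per output column.
for t = 1:size(Y_coarse, 2)
    MetaOpts.ExpDesign.Y = Y_coarse(:, t);   % one output at a time
    myKriging = uq_createModel(MetaOpts);    % MetaOpts otherwise set up as usual
    save(sprintf('GP_step_%04d.mat', t), 'myKriging', '-v7.3')
end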

Let me know
Gian


Hi Gian,

Thanks for your reply and suggestions! Yes, it takes 4 hours to fit all 10,000 outputs (I agree it is reasonable; coming from Python, I expected a long runtime). I have already aggregated the time series by averaging every 10 steps, so I do not lose much resolution; my model now has 1,000 outputs, which is a bit better.

Also, your suggestion for saving the model works, thank you! Unfortunately, GP simply does not seem to be a good fit for my problem: after I collected more samples for the surrogate, I started running into optimizer problems (BFGS and HGA failing to search), and I gave up on trying to adjust the source code. Nevertheless, since my input space is small, I gave PCE a try; it is quite fast, and I got excellent results with a test R2 = 0.95. I could not ask for more :)
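
In case it is useful to others, the setup looks roughly like this (an illustrative sketch, not my exact script; the option names follow the UQLab PCE manual, and the marginals, degree range, and LARS choice are placeholders):

uqlab
% 3 inputs, here with placeholder uniform marginals
for ii = 1:3
    InputOpts.Marginals(ii).Type = 'Uniform';
    InputOpts.Marginals(ii).Parameters = [0 1];
end
myInput = uq_createInput(InputOpts);

MetaOpts.Type        = 'Metamodel';
MetaOpts.MetaType    = 'PCE';
MetaOpts.Method      = 'LARS';   % sparse PCE via least-angle regression
MetaOpts.Degree      = 2:10;     % adaptive degree selection
MetaOpts.ExpDesign.X = X;        % existing samples, N x 3
MetaOpts.ExpDesign.Y = Y;        % corresponding outputs, N x 1000
myPCE = uq_createModel(MetaOpts);

Ytest = uq_evalModel(myPCE, Xtest);   % evaluate on held-out test points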

I will move forward with PCE for now!

Thanks again for the suggestions!
Majdi


Perfect. Happy to help :)
Have fun with PCE.
Best
Gian
