Problem computing accurate statistical moments with least-squares PCE

Hello everyone! I am using least-squares PCE to compute the mean and standard deviation of, for example, the function x^3 with a normal input distribution. As far as I know, the least-squares method expresses the objective function as a sum of coefficients times polynomials (Hermite polynomials in this case). The problem is that with a low input standard deviation, say 0.1, the output mean and standard deviation of the objective function are accurate, but if I increase it to, say, 3, the method delivers an error of 20% or more. I guess I did something wrong, but I am not sure. I leave my C++ code below in a very simplistic form. If anyone could help me I would be very pleased. Thank you in advance.

#include <iostream>
#include <vector>
#include <cmath>
#include <Eigen/Dense> // For matrix operations

// Function to compute the Hermite polynomial of degree n
double hermitePolynomial(int n, double z) {
    if (n == 0) return 1.0; // H0(z) = 1
    if (n == 1) return z;   // H1(z) = z
    return z * hermitePolynomial(n - 1, z) - (n - 1) * hermitePolynomial(n - 2, z);
}

// Function to compute f(X) = X^3
double targetFunction(double x) {
    return std::pow(x, 3);
}

int main() {
    // Step 1: Define the mean and standard deviation of X_1
    double mean = 5.0;
    double stddev = 3.0;

    // Step 2: Define the standard normal samples (Z)
    std::vector<double> z_samples = { -1.5, -0.75, 0.0, 0.75, 1.5 }; // 5 samples
    int num_samples = z_samples.size();

    // Step 3: Transform Z to X_1 using the given mean and standard deviation
    std::vector<double> x_samples(num_samples);
    for (int i = 0; i < num_samples; ++i) {
        x_samples[i] = mean + stddev * z_samples[i];
    }

    // Step 4: Define the target function values
    std::vector<double> f_values(num_samples);
    for (int i = 0; i < num_samples; ++i) {
        f_values[i] = targetFunction(x_samples[i]);
    }

    // Step 5: Construct the design matrix (Psi) in terms of Z
    int num_polynomials = 5; // Using H0 through H4
    Eigen::MatrixXd Psi(num_samples, num_polynomials);
    for (int i = 0; i < num_samples; ++i) {
        for (int j = 0; j < num_polynomials; ++j) {
            Psi(i, j) = hermitePolynomial(j, z_samples[i]);
        }
    }

    // Step 6: Convert f_values to Eigen vector
    Eigen::VectorXd F(num_samples);
    for (int i = 0; i < num_samples; ++i) {
        F(i) = f_values[i];
    }

    // Step 7: Solve for PCE coefficients (alpha) using LSM
    Eigen::VectorXd alpha = (Psi.transpose() * Psi).ldlt().solve(Psi.transpose() * F);

    // Step 8: Display the results
    std::cout << "PCE Coefficients (alpha):" << std::endl;
    for (int i = 0; i < alpha.size(); ++i) {
        std::cout << "alpha[" << i << "] = " << alpha[i] << std::endl;
    }

    // Step 9: Compute reconstructed function and error
    Eigen::VectorXd f_reconstructed = Psi * alpha;
    double error = (F - f_reconstructed).norm() / F.norm();
    std::cout << "\nReconstruction Error: " << error << std::endl;


    double variance = 0.0;
    double final_mean = alpha[0];

    // Calculate the variance using the sum of squares of coefficients from alpha[1] onwards
    for (int i = 1; i < alpha.size(); ++i) {
        variance += alpha[i] * alpha[i];
    }

    // Standard deviation is the square root of the variance
    double final_stddev = std::sqrt(variance);

    std::cout << "mean  " << final_mean << std::endl;
    std::cout << "stddev  " << final_stddev << std::endl;

    return 0;
}

Hi @dimitris_p

  • What do you mean by a ‘low’ and ‘high’ input distribution? Are you referring to the variance of the distribution?
  • Check your polynomials again. It looks to me like you have not normalised them (see Table 1 in the UQLab PCE manual; a short sketch of what I mean follows after this list).
  • To get good moment estimates, your data must come from the specified distribution. You only have five data points, and they are evenly spaced; the likelihood that they were drawn from a standard normal distribution seems low.
  • To validate your code, I would suggest drawing random samples from the intended distribution and using a few more points.
  • You say you are interested in the moments, so I assume the 20% error is related to the moments. I do not see the comparison of the PCE moments with the true moments. What numbers do you compare them against?
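
Just to illustrate what I mean by normalising, here is a minimal sketch (the helper normalizedHermite is my own naming, not a UQLab function):

#include <cmath>

// Probabilists' Hermite polynomial He_n(z), same recurrence as in your snippet
double hermitePolynomial(int n, double z) {
    if (n == 0) return 1.0;
    if (n == 1) return z;
    return z * hermitePolynomial(n - 1, z) - (n - 1) * hermitePolynomial(n - 2, z);
}

// Orthonormal version: He_n(z) / sqrt(n!), so that E[psi_n(Z)^2] = 1.
// With this basis the mean is alpha[0] and the variance is simply the sum of
// the squared coefficients for n >= 1, with no further correction.
double normalizedHermite(int n, double z) {
    return hermitePolynomial(n, z) / std::sqrt(std::tgamma(n + 1.0)); // n! = tgamma(n+1)
}

If you fill the design matrix Psi with normalizedHermite instead of hermitePolynomial, your post-processing of the coefficients (mean = alpha[0], variance = sum of squares) stays as it is.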

Hope this helps a bit!

Best regards
Styfen


Hi @styfen.schaer, thanks a lot! I think I got it! I have one question though.

1. I indeed did not normalise the polynomials, because I am used to working with projection methods such as Gauss-Hermite quadrature PCE, which as far as I know does not need normalisation of the polynomials. I just solve the integral with the Gauss quadrature rule and say: coefficients = weights * multivariate polynomials * objective function evaluated at the Gauss-Hermite quadrature roots (a small quadrature sketch follows after this list). Does this normalisation rule (dividing by the square root of the factorial of k) also apply to the Gauss-Hermite quadrature method?
2. In regression methods like least squares, can I know a priori how many samples I need to evaluate in order to trust the result? With the methods I used to work with (Gauss quadrature) I had the expensive full grid and the Smolyak sparse grid, so I knew the cost of such a problem beforehand.
3. In general, are these methods more cost-effective than projection methods with sparse grids?
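
To make question 1 concrete, this is roughly how I compute the coefficients with quadrature. It is only a sketch: I hard-code the 5-point probabilists' Gauss-Hermite rule from the roots of He_5 and divide each coefficient by k!, which I assume is the normalisation in question:

#include <cmath>
#include <iostream>
#include <vector>

// Probabilists' Hermite polynomial He_k(z)
double He(int k, double z) {
    if (k == 0) return 1.0;
    if (k == 1) return z;
    return z * He(k - 1, z) - (k - 1) * He(k - 2, z);
}

double factorial(int k) { return std::tgamma(k + 1.0); } // k!

int main() {
    // Model Y = f(X) = X^3 with X = mean + stddev * Z, Z ~ N(0, 1)
    double mean = 5.0, stddev = 3.0;
    auto f = [](double x) { return x * x * x; };

    // 5-point probabilists' Gauss-Hermite rule: nodes are the roots of He_5,
    // weights w_i = 5! / (5^2 * He_4(z_i)^2); the weights sum to 1 under the
    // standard normal measure.
    std::vector<double> nodes = {
        0.0,
        std::sqrt(5.0 - std::sqrt(10.0)), -std::sqrt(5.0 - std::sqrt(10.0)),
        std::sqrt(5.0 + std::sqrt(10.0)), -std::sqrt(5.0 + std::sqrt(10.0))
    };
    std::vector<double> weights(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i)
        weights[i] = factorial(5) / (25.0 * He(4, nodes[i]) * He(4, nodes[i]));

    // Projection: alpha_k = E[f(X) He_k(Z)] / E[He_k(Z)^2], with E[He_k(Z)^2] = k!
    int max_degree = 3; // degree 3 is enough for a cubic model
    std::vector<double> alpha(max_degree + 1, 0.0);
    for (int k = 0; k <= max_degree; ++k) {
        for (std::size_t i = 0; i < nodes.size(); ++i)
            alpha[k] += weights[i] * He(k, nodes[i]) * f(mean + stddev * nodes[i]);
        alpha[k] /= factorial(k); // is this the normalisation you mean?
    }

    // Moments for the unnormalised basis: mean = alpha_0, Var = sum_{k>=1} alpha_k^2 * k!
    double pce_mean = alpha[0], pce_var = 0.0;
    for (int k = 1; k <= max_degree; ++k) pce_var += alpha[k] * alpha[k] * factorial(k);

    std::cout << "mean   " << pce_mean << "\n";
    std::cout << "stddev " << std::sqrt(pce_var) << "\n";
    return 0;
}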

PS: In parallel I am reading the manual; it is extremely helpful.
Thank you in advance :)

  • If you do not normalize the polynomials, you then have to correct the higher-order moments to account for that (except the mean, which is unaffected). E.g. in the one-dimensional case, where the index i equals the polynomial degree, the variance is computed as Var[Y]=\sum_{i \ne 0} \alpha_i^2 \, i! (see the short sketch after this list).
  • For OLS I would suggest using at least 2-3 times as many samples as you have polynomial coefficients. If you use a different regression method (e.g. sparse regression such as LARS), this number can be much lower.
  • Projection methods can become very expensive in high dimensions. Also keep in mind that projection requires evaluating your model at specific points: if you already have your data, or you cannot evaluate your model at arbitrary points, you cannot use projection. For (complex) real-world problems we have had very good experience with sparse regression (LARS in particular). It handles high input dimensions well and allows you to use high polynomial degrees with relatively little data without overfitting.
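
As a small illustration of that correction (a sketch only, assuming a one-dimensional unnormalised Hermite basis, so the squared norm of the degree-i polynomial is i!):

#include <cmath>
#include <vector>

// Post-process PCE coefficients obtained with UNNORMALISED probabilists'
// Hermite polynomials: mean = alpha[0], Var[Y] = sum_{i>=1} alpha_i^2 * i!.
// (With an orthonormal basis the factorial factor would simply be 1.)
double pceVariance(const std::vector<double>& alpha) {
    double variance = 0.0;
    for (std::size_t i = 1; i < alpha.size(); ++i) {
        double norm_sq = std::tgamma(i + 1.0); // i! = E[He_i(Z)^2]
        variance += alpha[i] * alpha[i] * norm_sq;
    }
    return variance;
}

Replacing the plain sum of squares in your post-processing with this corrected sum (or, equivalently, normalising the polynomials when you build Psi) should remove the discrepancy you see for larger input standard deviations.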

Glad you found the manual helpful!
