
cholesky factorization error on cpu #35

Open
pumplerod opened this issue Oct 17, 2023 · 3 comments
@pumplerod

There seems to be some limit on the combination of n_components and n_features. If I try to create a model with

n_components=1
n_features=99

it will fail with `_LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 22 is not positive-definite)`.

Reducing to n_features=98 works, but if I then raise n_components=2 the error returns.

I am trying to work with many more features and components: potentially 1000+ features and an unknown, but likely high, number of components. Is there any workaround for this?
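
For anyone trying to reproduce this, here is a minimal sketch based on the numbers above (the import path and constructor signature are assumptions about this repo's API):

```python
import torch
from gmm import GaussianMixture  # import path assumed

torch.manual_seed(0)

# 100 samples, 99 features: n_samples barely exceeds n_features,
# so the estimated per-component covariance is nearly singular.
data = torch.randn(100, 99)

model = GaussianMixture(n_components=1, n_features=99)
model.fit(data)  # raises _LinAlgError per the report above
```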

@pumplerod (Author)

It also appears to have something to do with the number of samples. In my example I was using 100 samples, but if I increase that, the error goes away. I guess I need to keep n_samples higher than n_components + n_features, or something like that. Still tricky to work around.

@AndreasGerken

I have the same issue; it appears on both CPU and CUDA. It seems to occur when n_components or n_features is high relative to the number of samples.

Is there a way to fix it, or some rule for avoiding it?

@TOM-tym commented May 28, 2024

Hi there,
It has been several months, but maybe changing the dtype from PyTorch's default float32 to float64 (double) will help in some cases.
I think the issue comes from some eigenvalues of var here being close to zero, so the Cholesky factorization runs into numerical issues:

log_det[k] = 2 * torch.log(torch.diagonal(torch.linalg.cholesky(var[0,k]))).sum()

So switching to `double` helps alleviate these numerical issues.
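
To make the precision point concrete, here is a standalone illustration (not code from this repo) of a covariance that factorizes in float64 but not in float32:

```python
import torch

# Symmetric 2x2 matrix whose smallest eigenvalue is ~1e-8.
# In float64 the off-diagonal 1 - 1e-8 is representable, so the
# factorization succeeds; in float32 it rounds to exactly 1.0, the
# matrix becomes singular, and cholesky raises the same
# "not positive-definite" error reported above.
a = torch.tensor([[1.0, 1.0 - 1e-8],
                  [1.0 - 1e-8, 1.0]], dtype=torch.float64)

torch.linalg.cholesky(a)          # succeeds in double precision
torch.linalg.cholesky(a.float())  # raises _LinAlgError
```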

I changed self.mu, self.var, and self.pi to double in _init_params by adding .double() after their initializations,
e.g.,
self.mu = torch.nn.Parameter(torch.randn(1, self.n_components, self.n_features), requires_grad=False) ==>
self.mu = torch.nn.Parameter(torch.randn(1, self.n_components, self.n_features).double(), requires_grad=False)

Also make sure the input to fit() is in double.
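
E.g. (hypothetical usage; the import path and constructor arguments are assumptions):

```python
from gmm import GaussianMixture  # import path assumed

model = GaussianMixture(n_components=2, n_features=99)  # with patched _init_params
model.fit(data.double())  # cast the training data to float64 as well
```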

However, this is a temporary workaround, and it does increase running time and memory usage.
Hope this helps.
