
Choleski-Decomposition #1068

Closed
Koenig128 opened this issue Nov 9, 2023 · 10 comments

@Koenig128

Koenig128 commented Nov 9, 2023

Dear Jacob,

I ran into a problem with DenseHMM, and I saw that some previous issues dealt with similar problems - but because of your recent rewrite, I am not sure whether those were fixed or whether I am doing something wrong. I would appreciate it if you could take a look.

A minimal example:

import numpy as np
import pandas as pd
from pomegranate.hmm import DenseHMM
from pomegranate.distributions import Normal

np.random.seed(42)
data = np.random.rand(20000)
df = pd.DataFrame(data, columns=['value'])
sin_values = np.sin(2 * np.pi * df.index / 24)
cos_values = np.cos(2 * np.pi * df.index / 24)
sin_normalized = (sin_values - sin_values.min()) / (sin_values.max() - sin_values.min())
cos_normalized = (cos_values - cos_values.min()) / (cos_values.max() - cos_values.min())

df['sinday'] = sin_normalized
df['cosday'] = cos_normalized

data_array = df.values
tensor_3d = data_array[np.newaxis, :, :]

model = DenseHMM([Normal(), Normal(), Normal(), Normal(), Normal(),
                 Normal(), Normal(), Normal(), Normal(), Normal()], max_iter=10, verbose=True)
model.fit(tensor_3d)

The sine and cosine features encode time-of-day information.

I am getting the following error:
_LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 3 is not positive-definite).

I do not quite understand the problem. It works (sometimes) if I reduce the number of states:

model = DenseHMM([Normal(), Normal(), Normal(), Normal(), Normal()],
                 max_iter=10, verbose=True)
model.fit(tensor_3d)

But I would need more states because my real data are more complex. Do you have any suggestions?

Thank you very much in advance!

@jmschrei
Owner

jmschrei commented Nov 9, 2023

This is a recurring problem. Basically, when you have many more states than you need to model your data, some states do not get assigned many/any points because the rest of the states model the data well. Having too few examples can cause an issue with the Cholesky decomposition if there isn't enough variance among them. I know scikit-learn gets around this using some trick -- I need to get around to implementing that. In the meantime, it's basically a sign that your model has too many parameters. If you had more complex real world data you'd be less likely to see this error because the number of states you chose would be needed to capture the heterogeneity of it.
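The failure mode Jacob describes can be sketched in a few lines of numpy (an illustration of the idea, not pomegranate's internals): a state that only ever sees (near-)identical points ends up with a singular covariance matrix, which the Cholesky factorization rejects.

```python
import numpy as np

# A state assigned only identical points has zero variance, so its
# empirical covariance matrix is singular rather than positive-definite.
points = np.tile([1.0, 2.0], (5, 1))   # five copies of the same 2D point
cov = np.cov(points, rowvar=False)     # -> all-zeros 2x2 matrix

try:
    np.linalg.cholesky(cov)
except np.linalg.LinAlgError as exc:
    print("Cholesky failed:", exc)
```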

@AKuederle
Contributor

AKuederle commented Nov 9, 2023

I just had a look at the sklearn code out of interest. They basically add a small value (reg_covar) to the terms of the covariance matrix. For example, see here: https://github.com/scikit-learn/scikit-learn/blob/093e0cf14aff026cca6097e8c42f83b735d26358/sklearn/mixture/_gaussian_mixture.py#L153

This value is set to 1e-6 by default:

    reg_covar : float, default=1e-6
        Non-negative regularization added to the diagonal of covariance.
        Allows to assure that the covariance matrices are all positive.

(from https://github.com/scikit-learn/scikit-learn/blob/093e0cf14aff026cca6097e8c42f83b735d26358/sklearn/mixture/_gaussian_mixture.py#L510)

In case the decomposition still fails, the error message suggests either reducing the number of states or increasing the value of reg_covar.
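The effect of that regularization is easy to demonstrate in numpy (a sketch of the idea, not sklearn's actual code path): adding reg_covar to the diagonal shifts every eigenvalue up by that amount, so even a fully degenerate covariance matrix becomes positive-definite.

```python
import numpy as np

reg_covar = 1e-6           # sklearn's default, from the docstring above
cov = np.zeros((3, 3))     # degenerate covariance of an "empty" state

# Regularizing the diagonal makes the matrix positive-definite,
# so the Cholesky factorization now succeeds.
L = np.linalg.cholesky(cov + reg_covar * np.eye(3))
print(np.allclose(L @ L.T, reg_covar * np.eye(3)))  # True
```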

@jmschrei
Owner

jmschrei commented Nov 9, 2023

Yeah that's what I thought... I'll try adding that in soon. In the meantime, pomegranate is modular enough that you can just copy/paste the Normal class, add this in yourself, and drop those objects into native pomegranate objects.

@AKuederle
Contributor

AKuederle commented Nov 9, 2023

If I understood correctly, it should be possible to subclass Normal and just add a small value to the covs calculated in from_summaries:

from pomegranate.distributions import Normal
from pomegranate._utils import _update_parameter  # pomegranate's internal helper


class StableNormal(Normal):

	def from_summaries(self):
		"""Update the model parameters given the extracted statistics.

		This method uses calculated statistics from calls to the `summarize`
		method to update the distribution parameters. Hyperparameters for the
		update are passed in at initialization time.

		Note: Internally, a call to `fit` is just a successive call to the
		`summarize` method followed by the `from_summaries` method.
		"""

		if self.frozen == True:
			return

		means = self._xw_sum / self._w_sum

		if self.covariance_type == 'full':
			v = self._xw_sum.unsqueeze(0) * self._xw_sum.unsqueeze(1)
			covs = self._xxw_sum / self._w_sum - v / self._w_sum ** 2.0

		elif self.covariance_type in ['diag', 'sphere']:
			covs = self._xxw_sum / self._w_sum - \
				self._xw_sum ** 2.0 / self._w_sum ** 2.0
			if self.covariance_type == 'sphere':
				covs = covs.mean(dim=-1)

		# This is the magic :)
		covs += 1e-6

		_update_parameter(self.means, means, self.inertia)
		_update_parameter(self.covs, covs, self.inertia)
		self._reset_cache()

@jmschrei
Owner

jmschrei commented Nov 9, 2023

Yes, that's fine too. I was thinking you might want to add a user-defined covariance but hardcoding some small number is fine too if it works.

@jmschrei
Owner

jmschrei commented Nov 9, 2023

Sorry, but to be clear you probably want to add covs += numpy.eye(covs.shape[0]) * 1e-6 under the full covariance type, right? You just want to add it to the diagonal of the covariance matrix, not every element.
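The difference is quick to check in numpy (an illustration of Jacob's point, not pomegranate code): a scalar += bumps every entry, including the off-diagonal covariances, while adding a scaled identity leaves the off-diagonal structure untouched.

```python
import numpy as np

cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

everywhere = cov + 1e-6             # perturbs the off-diagonal terms too
diag_only = cov + np.eye(2) * 1e-6  # only shifts the diagonal

print(everywhere[0, 1])  # 0.900001 -- correlation structure changed
print(diag_only[0, 1])   # 0.9      -- correlation structure preserved
```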

@AKuederle
Contributor

> Yes, that's fine too. I was thinking you might want to add a user-defined covariance but hardcoding some small number is fine too if it works.

Yes, that would of course be the more elegant version. On that note, the distribution has a parameter called min_cov which seems to be unused. Was this maybe intended for exactly this workaround?

> Sorry, but to be clear you probably want to add covs += numpy.eye(covs.shape[0]) * 1e-6 under the full covariance type, right? You just want to add it to the diagonal of the covariance matrix, not every element.

Ah yes, you are right, I misread the scikit-learn code! Only adding to the diagonal makes way more sense.

@Koenig128
Author

Dear Arne and Jacob,

Thank you very much for your clarifications and helpful suggestions! I really appreciate your efforts!

@AKuederle
Contributor

Btw., is this a "duplicate" of #1039?

@jmschrei
Owner

Not exactly a duplicate. min_cov doesn't necessarily fix the issue but could. Regardless, min_cov should be added in too.
