Error when the number of components is zero #1

zhiyintan · 2024-11-03T15:04:10Z

When `self.component_vectors` is empty (for example, because `hdbscan_min_cluster_size` or `hdbscan_min_samples` is set too large), requesting a component representation raises:

IndexError: list index out of range

semantic_components/semantic_components/decomposition.py

Line 511 in e5c1c58

return self.component_vectors[i]


eichinflo · 2024-11-04T18:04:08Z

Thank you for bringing this to our attention! In your opinion, what would be the expected behavior if a representation of a non-existent component is requested here? Thinking about it, raising a more expressive error is probably the correct behavior.

I am also concerned about the naming of the function, since "representation" here refers to the cluster centroid and not a c-tf-idf representation. I think I'll rename it and also add a function to the SCA class that exposes the cluster centroids in the same way.
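For illustration, the "more expressive error" discussed above might look roughly like the sketch below. This is not the library's code: `ClusterDecomposerSketch` is a hypothetical stand-in class, plain Python lists stand in for the centroid vectors, and the choice of `ValueError` for the empty case and the wording of the messages are assumptions.

```python
class ClusterDecomposerSketch:
    """Hypothetical stand-in for illustration; not the real ClusterDecomposer."""

    def __init__(self, component_vectors=None):
        # Assumption: components are stored as a plain list of vectors.
        self.component_vectors = list(component_vectors or [])

    def get_component_repr(self, i):
        n = len(self.component_vectors)
        if n == 0:
            # Clustering produced no components, so there is nothing to index.
            raise ValueError(
                "No components were found. Consider lowering "
                "hdbscan_min_cluster_size or hdbscan_min_samples."
            )
        if not 0 <= i < n:
            # Components exist, but the requested index is not among them.
            raise IndexError(
                f"Component index {i} is out of range for {n} component(s)."
            )
        return self.component_vectors[i]


if __name__ == "__main__":
    empty = ClusterDecomposerSketch([])
    try:
        empty.get_component_repr(0)
    except ValueError as err:
        print(f"ValueError: {err}")
```

Separating the "no components at all" case from an ordinary out-of-range index lets the message point at the likely cause (the HDBSCAN settings) instead of a bare `list index out of range`.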

decomposition.ClusterDecomposer.get_component_repr method to avoid errors when no components are found. Addressing issue #2: We've added functionality to the `SCA` initialization method to allow for custom tokenizers. We've also added respective test cases and notes in the README.

eichinflo · 2025-01-08T09:23:58Z

I've just added some lines to address the first part of this.
