Feature Request: Optionally Store MST and Condensed MST in HDbscan Struct #64

Open
alteredoxide opened this issue Sep 15, 2023 · 1 comment

@alteredoxide

It would be useful to have the option to store the Minimum Spanning Tree (MST) and its condensed version within the HDbscan struct, along with a few new supporting methods for accessing them. This would be similar to features offered by the Python hdbscan library.

Some direct and indirect benefits of storing the MST:

  • Explore clusterings with different configurations using minimal computation: min_cluster_size is only used when generating the condensed tree, so a stored MST can be reused when only that parameter changes.
  • Analyze and explore the MST to:
    • visualize data
    • automate parameter tuning
    • analyze edge weights and connectivity
    • compare against other MSTs
    • leverage the MST for other algorithms
  • Explore the hierarchy of clusters:
    • this one is perhaps more applicable to the condensed tree.
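
To make the first point concrete, here is a minimal sketch of what opt-in MST storage could look like. It is not the code from my fork, and every name in it (`MstEdge`, `FittedModel`, `store_mst`, `cut_at`) is hypothetical rather than part of the crate's API. As a stand-in for re-running condensation with a new min_cluster_size, it uses a simpler operation: cutting the stored tree at a weight threshold, which yields flat single-linkage clusters without recomputing any distances.

```rust
use std::collections::HashMap;

/// One MST edge, e.g. over mutual-reachability distances (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct MstEdge {
    pub a: usize,
    pub b: usize,
    pub weight: f64,
}

/// Hypothetical stand-in for a fitted model; NOT the crate's real `HDbscan`.
pub struct FittedModel {
    n_points: usize,
    /// `None` unless the caller opted in, so the default memory cost is unchanged.
    mst: Option<Vec<MstEdge>>,
}

impl FittedModel {
    /// `edges` would come from the fitting step; here it is passed in directly.
    pub fn new(n_points: usize, edges: Vec<MstEdge>, store_mst: bool) -> Self {
        Self {
            n_points,
            mst: if store_mst { Some(edges) } else { None },
        }
    }

    /// Read-only access to the stored tree, if it was kept.
    pub fn mst(&self) -> Option<&[MstEdge]> {
        self.mst.as_deref()
    }

    /// Flat labels obtained by dropping every stored edge heavier than
    /// `max_weight` and labelling the remaining connected components.
    /// Returns `None` when the MST was not stored.
    pub fn cut_at(&self, max_weight: f64) -> Option<Vec<usize>> {
        let edges = self.mst.as_ref()?;
        let mut parent: Vec<usize> = (0..self.n_points).collect();
        for e in edges.iter().filter(|e| e.weight <= max_weight) {
            let ra = find(&mut parent, e.a);
            let rb = find(&mut parent, e.b);
            if ra != rb {
                parent[ra] = rb;
            }
        }
        // Relabel component roots as consecutive ids in order of first appearance.
        let mut ids: HashMap<usize, usize> = HashMap::new();
        let mut labels = Vec::with_capacity(self.n_points);
        for i in 0..self.n_points {
            let root = find(&mut parent, i);
            let next = ids.len();
            labels.push(*ids.entry(root).or_insert(next));
        }
        Some(labels)
    }
}

/// Union-find lookup with path halving.
fn find(parent: &mut [usize], mut x: usize) -> usize {
    while parent[x] != x {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    x
}
```

With a model built via `FittedModel::new(n, edges, true)`, calling `cut_at` repeatedly with different thresholds only walks the stored edges. With `store_mst` set to `false`, both `mst()` and `cut_at` return `None`, so existing users pay nothing extra.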

I've showcased an optional MST storage in a fork; please review the changes here. Let me know if this feature is of interest to you. If so, I can also add optional storage of the condensed tree and create a PR.

I believe this enhancement can provide more versatility to the library. I hope to collaborate and refine the feature based on feedback.

Thanks for your work on creating this great library, btw!

@hadronzoo

This would also be useful for implementing FISHDBC (Python implementation here).
