Hierarchical_Clustering

Code to perform hierarchical spectral or K-means clustering using the scikit-learn (sklearn) library.

This code was developed as part of the DRUID project, but it is generic and can be applied to any hierarchical clustering problem.

hierarchical_clustering.py - this is the code that does the clustering. There are two loops: the outer loop decides how many levels of the hierarchy to descend, and the inner loop chooses the optimal number of clusters into which to split the current cluster's data. A small number of data points that lie far from the main clusters can be treated as outliers (spectral clustering is able to pull these out, whereas in our experience K-means clustering generally does not). All outliers are gathered into a single cluster under the root, allowing the user to decide later what to do with them.
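
The two-loop structure described above can be sketched as follows. This is a minimal illustration, not the code in hierarchical_clustering.py: the function names `best_split` and `build_hierarchy`, the parameters `max_depth`, `min_size` and `min_score`, and the use of a plain dict as the tree are all assumptions made for the example. It uses K-means with the Silhouette score for both loops, writes the outer loop as a recursion, and leaves out outlier handling.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def best_split(data, max_clusters=5, random_state=0):
    """Inner loop: try k = 2..max_clusters and keep the best-scoring labels."""
    best_score, best_labels = -1.0, None
    upper = min(max_clusters, len(data) - 1)
    for k in range(2, upper + 1):
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(data)
        if len(np.unique(labels)) < 2:
            continue  # the silhouette score needs at least two clusters
        score = silhouette_score(data, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_score, best_labels


def build_hierarchy(data, indices=None, depth=0, max_depth=2,
                    min_size=20, min_score=0.5):
    """Outer loop (as a recursion): decide whether to go down another level."""
    if indices is None:
        indices = np.arange(len(data))
    node = {"indices": indices, "children": []}
    # Stop when the hierarchy is deep enough or the cluster is too small.
    if depth >= max_depth or len(indices) < min_size:
        return node
    score, labels = best_split(data[indices])
    # Stop when no candidate split scores well enough (hypothetical threshold).
    if labels is None or score < min_score:
        return node
    for label in np.unique(labels):
        child = build_hierarchy(data, indices[labels == label], depth + 1,
                                max_depth, min_size, min_score)
        node["children"].append(child)
    return node


# Three well-separated synthetic blobs, 100 points each.
points, _ = make_blobs(n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
                       cluster_std=0.5, random_state=0)
tree = build_hierarchy(points)
print(len(tree["children"]), "clusters found directly below the root")
```

Swapping `KMeans` for `sklearn.cluster.SpectralClustering` in `best_split` would give the spectral variant; the stopping thresholds here are arbitrary and would need tuning for real data.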

scores.py - defines several scores that can be used to assess the quality of a clustering. The user can choose which score to use for the inner loop and which for the outer loop. Available scores are Wemmert-Gancarski, Calinski-Harabasz, Silhouette, and Davies-Bouldin.
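
As a rough illustration of how such scores behave, the snippet below computes the three that scikit-learn provides directly. It does not reproduce scores.py, and the Wemmert-Gancarski score is omitted because scikit-learn has no built-in function for it. The synthetic data and the `scores` dict are assumptions made for the example. Note that the scores point in different directions: higher is better for Silhouette and Calinski-Harabasz, while lower is better for Davies-Bouldin.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Two well-separated synthetic blobs, clustered with K-means.
points, _ = make_blobs(n_samples=200, centers=[[0, 0], [8, 8]],
                       cluster_std=1.0, random_state=0)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

scores = {
    "silhouette": silhouette_score(points, labels),                # in [-1, 1]
    "calinski_harabasz": calinski_harabasz_score(points, labels),  # unbounded above
    "davies_bouldin": davies_bouldin_score(points, labels),        # >= 0
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```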

cluster_tree.py - defines a node in the cluster tree. The node class inherits from anytree.Node, which allows the hierarchy to be printed and plotted. This file also contains the code to read the cluster tree from, and write it to, an HDF file.
