Generating feature space diagrams in R
This is a first effort to implement "feature space diagrams" in R, inspired by: https://towardsdatascience.com/escape-the-correlation-matrix-into-feature-space-4d71c51f25e5
Our workflow is generally as described in the original post:
- Generate the correlation matrix
- Create a distance matrix from the correlation matrix
  - Original approach: take the absolute value of the correlation matrix and subtract each value from 1.
  - Revised approach: use R's dist() function, which offers several methods for computing distance.
- Use PCA to reduce our NxN distance matrix to Nx2.
- Plot each feature’s location using the first two principal components.
- Use Feature Agglomeration (hierarchical clustering) to generate feature clusters.
- Color each feature by its cluster.
- Draw lines between features whose relationship passes some threshold.
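The steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the final implementation: the built-in mtcars data set, the object names, the choice of k = 3 clusters, average linkage, and the 0.3 distance threshold are all assumptions made for the example. Applying dist() to the rows of the correlation matrix is only one possible reading of the revised approach.

```r
# Sketch of the feature space workflow using only base R.
# ASSUMPTIONS: mtcars as example data, k = 3 clusters, average linkage,
# and a distance threshold of 0.3 are illustrative choices.

# 1. Correlation matrix of the features (columns of the data).
cor_mat <- cor(mtcars)

# 2a. Original approach: distance = 1 - |correlation|.
dist_abs <- 1 - abs(cor_mat)

# 2b. Revised approach (one possible reading): treat each row of the
#     correlation matrix as a point and let dist() compute the distances.
#     Other methods, e.g. "manhattan", can be swapped in here.
dist_euc <- as.matrix(dist(cor_mat, method = "euclidean"))

# 3. PCA to reduce the N x N distance matrix to N x 2.
pca <- prcomp(dist_abs)
coords <- pca$x[, 1:2]

# 4. Hierarchical clustering of the features, cut into k clusters.
hc <- hclust(as.dist(dist_abs), method = "average")
clusters <- cutree(hc, k = 3)

# 5. Plot each feature at its first two principal components,
#    colored by cluster and labeled with the feature name.
plot(coords, col = clusters, pch = 19, xlab = "PC1", ylab = "PC2")
text(coords, labels = rownames(coords), pos = 3, cex = 0.8)

# 6. Draw a line between every pair of features whose distance is
#    below the threshold (upper triangle only, so each pair is drawn once).
threshold <- 0.3
close_pairs <- which(dist_abs < threshold & upper.tri(dist_abs),
                     arr.ind = TRUE)
segments(coords[close_pairs[, 1], 1], coords[close_pairs[, 1], 2],
         coords[close_pairs[, 2], 1], coords[close_pairs[, 2], 2],
         col = "grey60")
```

To try the revised distance instead, pass dist_euc to prcomp() and as.dist() in place of dist_abs, and adjust the threshold to suit the new scale.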
Many thanks to Win Cowger and the rest of the #Rstats Fediverse community for the inspiration!