
Create t-test comparison across clustered features. #56

Open
EricMartin827 opened this issue Apr 1, 2020 · 1 comment

@EricMartin827
Collaborator

To evaluate how well a clustering algorithm produces classifiable features, we need a way to test for statistically significant differences in cluster assignments across classes. Write a t_test function that measures, feature by feature, the difference in cluster assignment between healthy control patients and schizophrenia-positive patients.

If there are X healthy and Y schizophrenic patients, each with D features (the cluster assigned to each time window/interval), then this function will produce a 1-D array of D p-values, one per feature, comparing the mean cluster assignment between the two sets of patients.
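
A minimal sketch of what such a function could look like, assuming the two groups arrive as arrays of shape (X, D) and (Y, D) and that SciPy is available. The argument names and the choice of Welch's unequal-variance test (`equal_var=False`) are assumptions for illustration, not something decided in this issue:

```python
import numpy as np
from scipy import stats


def t_test(healthy, schizophrenic, equal_var=False):
    """Per-feature two-sided t-test between two groups of patients.

    healthy:        array of shape (X, D), one row per healthy control
    schizophrenic:  array of shape (Y, D), one row per schizophrenic patient
    Returns a 1-D array of D p-values, one per feature.
    """
    healthy = np.asarray(healthy, dtype=float)
    schizophrenic = np.asarray(schizophrenic, dtype=float)
    if healthy.ndim != 2 or schizophrenic.ndim != 2:
        raise ValueError("expected 2-D arrays of shape (patients, features)")
    if healthy.shape[1] != schizophrenic.shape[1]:
        raise ValueError("both groups must have the same number of features")
    # axis=0 compares the groups across patients, independently per feature.
    # equal_var=False selects Welch's t-test (an assumption, see above).
    _, p_values = stats.ttest_ind(
        healthy, schizophrenic, axis=0, equal_var=equal_var
    )
    return p_values


if __name__ == "__main__":
    # Synthetic cluster assignments: 20 controls, 25 patients, 8 windows.
    rng = np.random.default_rng(0)
    hc = rng.integers(0, 5, size=(20, 8))
    sz = rng.integers(0, 5, size=(25, 8))
    print(t_test(hc, sz).shape)  # (8,)
```

A feature that is constant in both groups has zero variance, so its p-value comes back as NaN; callers may want to mask those entries before thresholding.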

@EricMartin827 self-assigned this Apr 1, 2020
@bbradt
Owner

bbradt commented Apr 2, 2020

looks good!

It would be cool if you could generalize the t-test so that we can also apply it to the cluster centers between classes. For clustering, I get out a set of K cluster centers in COMPONENT x COMPONENT space. If I take the instances within one cluster, split them by class, and run a two-tailed t-test between the two class-specific groups, I should get back a COMPONENT x COMPONENT significance matrix that shows us differences within the clusters themselves.

This isn't necessarily useful for informing supervised learning, but it's something we do to evaluate differences between the populations, so it's worth doing if it's not too difficult.
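
The generalization described above could be sketched as follows, assuming each instance is stored as a COMPONENT x COMPONENT matrix and that per-instance class and cluster labels are available as 1-D arrays. The function names, argument names, and the default class coding `(0, 1)` are hypothetical, and Welch's test is again an assumption:

```python
import numpy as np
from scipy import stats


def t_test_nd(group_a, group_b, equal_var=False):
    """Two-sided t-test along axis 0 for arrays of any trailing shape.

    Inputs of shape (N, ...) and (M, ...) give p-values of shape (...),
    so (N, D) inputs yield D p-values and (N, C, C) inputs yield a C x C matrix.
    """
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    if group_a.shape[1:] != group_b.shape[1:]:
        raise ValueError("groups must agree on every axis except the first")
    result = stats.ttest_ind(group_a, group_b, axis=0, equal_var=equal_var)
    return result.pvalue


def cluster_significance(instances, class_labels, cluster_labels,
                         cluster, classes=(0, 1)):
    """p-value matrix for one cluster, comparing its instances by class."""
    instances = np.asarray(instances, dtype=float)
    class_labels = np.asarray(class_labels)
    cluster_labels = np.asarray(cluster_labels)
    # Keep only instances assigned to the requested cluster, then split by class.
    in_cluster = cluster_labels == cluster
    group_a = instances[in_cluster & (class_labels == classes[0])]
    group_b = instances[in_cluster & (class_labels == classes[1])]
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("need at least two instances per class in the cluster")
    return t_test_nd(group_a, group_b)


if __name__ == "__main__":
    # Synthetic data: 60 instances of 3 x 3 matrices, 2 classes, 2 clusters.
    rng = np.random.default_rng(1)
    inst = rng.normal(size=(60, 3, 3))
    cls = np.array([0, 1] * 30)
    clu = np.array([0] * 40 + [1] * 20)
    print(cluster_significance(inst, cls, clu, cluster=0).shape)  # (3, 3)
```

Because the test runs along the first axis only, the same helper covers the original 1-D per-feature case, so a single implementation could serve both requests.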
