Add a "drop_similar" argument to the TableVectorizer #1249

Open
GaelVaroquaux opened this issue Feb 26, 2025 · 0 comments


It would be great to make it really easy to drop redundant columns in the TableVectorizer (by using the DropSimilar transformer inside the TableVectorizer), through a simple additional argument.

This would bring improvements both in speed/memory and maybe in statistical performance.

I realize that it makes the TableVectorizer even more of a Swiss-army knife than it currently is, but honestly, it's sooo useful and we use it everywhere, even as an element in complex pipelines.
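
To make "dropping redundant columns" concrete, here is a minimal pure-Python sketch of the general idea. It is not skrub code: the function name `drop_similar_columns`, the `threshold` parameter, and the use of absolute Pearson correlation as the similarity criterion are all assumptions made for illustration, and the actual DropSimilar transformer may define similarity differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_similar_columns(table, threshold=0.95):
    """Hypothetical helper: keep a column only if it is not strongly
    correlated (|r| >= threshold) with a column that was already kept."""
    kept = {}
    for name, values in table.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Toy table: "height_in" is (up to rounding) a rescaled copy of "height_cm".
table = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.06, 62.99, 66.93, 70.87],
    "age": [31.0, 25.0, 47.0, 38.0],
}
reduced = drop_similar_columns(table)
print(list(reduced))
```

In this toy table the redundant "height_in" column is dropped before any further encoding, which is where the speed/memory gain would come from: fewer columns reach the downstream transformers.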
