Discrepancy in training data versions #107

Open
vaibhavad opened this issue Jan 30, 2024 · 0 comments


Thank you for the great work and for releasing the datasets and models. I downloaded the MEDI dataset a few months ago, and the length of the dataset in that file is 1435000.

When I download it today, the dataset size is 1240000.

What is the difference between these two versions? Have some samples been discarded? If so, were they from any specific dataset? Have any new samples been added?
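For reference, here is a minimal sketch of how the two downloads could be compared per source dataset. It assumes each file is a JSON list of records and that every record carries a `task_name` field naming its source dataset; both are assumptions about the file layout, and the helper names and file paths are hypothetical.

```python
import json
from collections import Counter


def load_records(path):
    """Load one dataset version, assumed to be a JSON list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_by_task(records, key="task_name"):
    """Count records per source task; `key` is an assumed field name."""
    return Counter(r.get(key, "<unknown>") for r in records)


def diff_counts(old_records, new_records, key="task_name"):
    """Return {task: new_count - old_count} for tasks whose count changed."""
    old = count_by_task(old_records, key)
    new = count_by_task(new_records, key)
    tasks = sorted(set(old) | set(new))
    return {t: new[t] - old[t] for t in tasks if new[t] != old[t]}


if __name__ == "__main__":
    # Hypothetical local paths for the older and the current download.
    old_records = load_records("medi-data-old.json")
    new_records = load_records("medi-data-new.json")
    print("total change:", len(new_records) - len(old_records))
    for task, delta in diff_counts(old_records, new_records).items():
        print(f"{task}: {delta:+d}")
```

With the sizes reported above, the total change would be 1240000 - 1435000 = -195000. The per-task deltas would show whether the removed samples come from specific datasets, though equal counts alone cannot rule out samples being swapped within a task.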
