Turn MTEB-Arena-logs into HF dataset? #25
Comments
What do the logs contain again, and why are we saving them? I'm less familiar with the logging package. Is there a way to change the file we log to mid-run? If so, I can add something that switches log files daily and uploads the previous one.
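On the mid-run question: Python's standard `logging.handlers.TimedRotatingFileHandler` can switch files at midnight, and its `rotator` hook runs whenever a file is rotated out. Here is a minimal sketch of that idea, not what the arena code currently does. The names `make_daily_handler`, `upload_to_hub`, the `logs/` prefix, and the repo id are hypothetical, and the upload part assumes `huggingface_hub` is installed and authenticated.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler
from typing import Callable


def make_daily_handler(
    log_path: str, on_rotated: Callable[[str], None]
) -> TimedRotatingFileHandler:
    """Return a handler that rotates at midnight and reports each rotated file."""
    handler = TimedRotatingFileHandler(
        log_path, when="midnight", backupCount=0, encoding="utf-8"
    )

    def rotate_and_notify(source: str, dest: str) -> None:
        # Same rename the default rotation does, followed by the callback.
        if os.path.exists(source):
            os.rename(source, dest)
            on_rotated(dest)

    handler.rotator = rotate_and_notify
    return handler


def upload_to_hub(path: str) -> None:
    """Hypothetical callback: push one rotated log file to an HF dataset repo."""
    from huggingface_hub import HfApi  # assumed installed; imported lazily

    HfApi().upload_file(
        path_or_fileobj=path,
        path_in_repo=f"logs/{os.path.basename(path)}",  # hypothetical layout
        repo_id="<org>/<dataset-name>",  # placeholder, not a real repo
        repo_type="dataset",
    )
```

Usage would be something like `logging.getLogger().addHandler(make_daily_handler("arena.log", upload_to_hub))`. Rotation is only checked when a record is emitted, so the previous day's file is uploaded on the first log call after midnight.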
Good point, maybe we don't need to save the logs? They give us exact timestamps of when each event happened, but I think we also have timestamps in the results, so we could remove the logs? Should I just add them to .gitignore and remove them from the repo? Another one is the results, which are at 5.7M right now. It should be fine to only store
Yes, there are timestamps in the results, so I think we could filter on those to get exact times. Is there other information we want to save from the logs, or just the results? For debugging, I assume you can keep the logs locally to check for errors (no need to push them).
Okay, put them all in .gitignore: https://github.com/embeddings-benchmark/arena/blob/main/.gitignore
The logs are becoming pretty big and will soon be infeasible to keep in full in this repository, or will require git-lfs. Maybe we move them to an HF dataset? @orionw is probably the expert here, wdyt is the best approach?