Roadmap for v0.5.0 #79

Closed
45 tasks done
zimmski opened this issue Apr 26, 2024 · 7 comments
Labels: roadmap (Collection of issues for a release)

Comments

@zimmski
Member

zimmski commented Apr 26, 2024

v0.5.0 is mainly meant to introduce more variety. The main goals are:

  1. Introduce more logical cases, so that "better models" show a bigger difference in score.
  2. Introduce more providers, so we can test models that have been requested and react faster to new releases.

Tasks:

@zimmski added the enhancement (New feature or request) label Apr 26, 2024
@zimmski added this to the v0.5.0 milestone Apr 26, 2024
@zimmski self-assigned this Apr 26, 2024
@zimmski
Member Author

zimmski commented Apr 26, 2024

CC @bauersimon

@zimmski mentioned this issue Apr 26, 2024 (30 tasks)
@bauersimon
Member

Add an app name to the requests so people know we are the eval. https://openrouter.ai/docs#quick-start shows that other openapi-packages implement custom headers, but the one Go package we are using does not. So we could open a PR to contribute that.

seems like a PR does not make sense
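
For illustration, a minimal sketch of how such identifying headers could be attached without changing the client package, assuming the package accepts a custom `*http.Client`. The header names `HTTP-Referer` and `X-Title` are taken from the OpenRouter docs as I remember them, and the app name, URL and helper names (`headerTransport`, `newIdentifiedClient`, `echoTitle`) are hypothetical, not what the eval actually uses.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// headerTransport wraps another RoundTripper and adds fixed headers to every request.
type headerTransport struct {
	base    http.RoundTripper
	headers map[string]string
}

// RoundTrip clones the request (a RoundTripper must not modify the original)
// and sets the extra headers on the clone before forwarding it.
func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context())
	for key, value := range t.headers {
		clone.Header.Set(key, value)
	}

	return t.base.RoundTrip(clone)
}

// newIdentifiedClient returns an HTTP client that announces the given app on every request.
// The header names are an assumption based on the OpenRouter docs.
func newIdentifiedClient(appName string, appURL string) *http.Client {
	return &http.Client{
		Transport: &headerTransport{
			base: http.DefaultTransport,
			headers: map[string]string{
				"HTTP-Referer": appURL,
				"X-Title":      appName,
			},
		},
	}
}

// echoTitle sends one request to a local test server that echoes back
// the "X-Title" header it received, and returns that echoed value.
func echoTitle(client *http.Client) (string, error) {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("X-Title"))
	}))
	defer server.Close()

	response, err := client.Get(server.URL)
	if err != nil {
		return "", err
	}
	defer response.Body.Close()

	body, err := io.ReadAll(response.Body)
	if err != nil {
		return "", err
	}

	return string(body), nil
}

func main() {
	// Hypothetical app name and URL.
	client := newIdentifiedClient("example-eval", "https://example.com/example-eval")
	title, err := echoTitle(client)
	if err != nil {
		panic(err)
	}
	fmt.Println(title)
}
```

Since the headers are added at the transport level, this would work with any client package that lets us pass in our own `*http.Client`.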

@bauersimon
Member

Blogpost idea: misleading comments... how much does it take to confuse the most powerful AI? (credit to @ahumenberger)

@ahumenberger
Contributor

> Blogpost idea: misleading comments... how much does it take to confuse the most powerful AI? (credit to @ahumenberger)

Maybe not only comments. What about obfuscated code, e.g. function and variable names that are just random strings?

@zimmski
Member Author

zimmski commented Jun 6, 2024

Take a look at https://x.com/dottxtai/status/1798443290913853770

@bauersimon
Member

Looking through logs... Java consistently has more code than Go for the same tasks, which yields more coverage. So a model that solves all Java tasks but no Go tasks is automatically ranked higher than a model that does the opposite.
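
To make the effect concrete, here is a small sketch with made-up numbers. It assumes the score is simply the sum of covered statements over all solved tasks, which is a simplification and not necessarily how the eval computes it. The `task` type, `exampleTasks` and `score` are hypothetical names.

```go
package main

import "fmt"

// task is one benchmark task together with the number of statements
// a perfect solution can cover. All numbers are hypothetical.
type task struct {
	language   string
	statements int
}

// exampleTasks returns the same two tasks in Go and in Java,
// where the Java variant has more statements each time.
func exampleTasks() []task {
	return []task{
		{language: "go", statements: 10},
		{language: "java", statements: 14},
		{language: "go", statements: 6},
		{language: "java", statements: 9},
	}
}

// score sums the coverable statements of all tasks whose language the model solved.
func score(tasks []task, solved map[string]bool) int {
	total := 0
	for _, t := range tasks {
		if solved[t.language] {
			total += t.statements
		}
	}

	return total
}

func main() {
	tasks := exampleTasks()

	// One model solves only the Java tasks, the other only the Go tasks.
	fmt.Println("java only:", score(tasks, map[string]bool{"java": true}))
	fmt.Println("go only:", score(tasks, map[string]bool{"go": true}))
}
```

With these numbers the Java-only model gets 23 and the Go-only model gets 16, even though both solved exactly half of the tasks.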

@zimmski added the roadmap (Collection of issues for a release) label and removed the enhancement (New feature or request) label Jun 17, 2024
bauersimon added a commit that referenced this issue Jul 30, 2024
@Munsio
Contributor

Munsio commented Jul 30, 2024

Closed with #297

4 participants