Showing 22 changed files with 757 additions and 729 deletions.
@@ -17,7 +17,13 @@ Before starting, download the `superhero.sqlite` database file from [BIRD](https
 Run the whole suite on the `superhero` database with `gpt-3.5-turbo`:

 ```bash
-python bench.py --multirun setup=iql-view,sql-view,collection data=superhero
+python bench.py --multirun setup=iql-view,sql-view,collection
 ```
+
+Run on multiple databases:
+
+```bash
+python bench.py setup=sql-view setup/views/[email protected]='[superhero,...]' data=bird
+```

 You can also run each evaluation separately or in subgroups:

@@ -34,7 +40,7 @@ python bench.py --multirun setup=iql-view setup/llm=gpt-3.5-turbo,claude-3.5-son
 python bench.py --multirun setup=sql-view setup/llm=gpt-3.5-turbo,claude-3.5-sonnet
 ```

-For the `collection` steup, you need to specify models for both the view selection and the IQL generation step:
+For the `collection` setup, you need to specify models for both the view selection and the IQL generation step:

 ```bash
 python bench.py --multirun \
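Review note: the `--multirun` commands in the README hunks above appear to follow Hydra's override syntax, where comma-separated values launch one run per combination. The sketch below only illustrates that expansion. `expand_sweep` is a hypothetical helper, not part of `bench.py` or Hydra, and its naive comma split would not handle bracketed list values such as `'[superhero,...]'`.

```python
from itertools import product
from typing import List


def expand_sweep(overrides: List[str]) -> List[List[str]]:
    """Expand comma-separated override values into one override list per run.

    Hypothetical illustration of a Cartesian-product sweep; it does not
    parse quoted or bracketed values.
    """
    keys = []
    choices = []
    for item in overrides:
        key, _, value = item.partition("=")
        keys.append(key)
        choices.append(value.split(","))
    # One run per element of the Cartesian product of all value lists.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


runs = expand_sweep(["setup=sql-view", "setup/llm=gpt-3.5-turbo,claude-3.5-sonnet"])
for run in runs:
    print("python bench.py " + " ".join(run))
```

Under this reading, sweeping `setup=iql-view,sql-view,collection` alone would correspond to three separate runs.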
@@ -1,17 +1,31 @@
 from .base import Metric, MetricSet
-from .iql import ExactMatchAggregationIQL, ExactMatchFiltersIQL, ExactMatchIQL, UnsupportedIQL, ValidIQL
-from .selector import ViewSelectionAccuracy
-from .sql import ExactMatchSQL, ExecutionAccuracy
+from .iql import (
+    FilteringAccuracy,
+    FilteringPrecision,
+    FilteringRecall,
+    IQLFiltersAccuracy,
+    IQLFiltersCorrectness,
+    IQLFiltersParseability,
+    IQLFiltersPrecision,
+    IQLFiltersRecall,
+)
+from .selector import ViewSelectionAccuracy, ViewSelectionPrecision, ViewSelectionRecall
+from .sql import ExecutionAccuracy, SQLExactMatch

 __all__ = [
     "Metric",
     "MetricSet",
-    "ExactMatchSQL",
-    "ExactMatchIQL",
-    "ExactMatchFiltersIQL",
-    "ExactMatchAggregationIQL",
-    "ValidIQL",
+    "FilteringAccuracy",
+    "FilteringPrecision",
+    "FilteringRecall",
+    "IQLFiltersAccuracy",
+    "IQLFiltersPrecision",
+    "IQLFiltersRecall",
+    "IQLFiltersParseability",
+    "IQLFiltersCorrectness",
+    "SQLExactMatch",
     "ViewSelectionAccuracy",
-    "UnsupportedIQL",
+    "ViewSelectionPrecision",
+    "ViewSelectionRecall",
     "ExecutionAccuracy",
 ]
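Review note: this hunk exports `ViewSelectionPrecision` and `ViewSelectionRecall`, but their implementations are not visible in this part of the diff. As a rough reading aid, the sketch below shows one conventional way precision and recall could be computed for a view-selection step. The function name, the use of `None` for "no view", and the counting rules are all assumptions, not the code in this commit.

```python
from typing import Optional, Sequence, Tuple


def selection_precision_recall(
    references: Sequence[Optional[str]],
    predictions: Sequence[Optional[str]],
) -> Tuple[float, float]:
    """Precision and recall over per-example view names (hypothetical sketch).

    Assumed convention: None means "no view expected" or "no view selected".
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")

    tp = fp = fn = 0
    for ref, pred in zip(references, predictions):
        if pred is not None and pred == ref:
            tp += 1  # a view was selected and it is the expected one
            continue
        if pred is not None:
            fp += 1  # a view was selected, but it was wrong or none was expected
        if ref is not None:
            fn += 1  # an expected view was missed or replaced by another one

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


precision, recall = selection_precision_recall(
    ["heroes", "heroes", None, "powers"],
    ["heroes", "powers", "heroes", None],
)
print(precision, recall)
```

Reporting both values separates a selector that picks views too eagerly (low precision) from one that too often picks nothing or the wrong view (low recall), which a single accuracy figure cannot distinguish.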