Add SQL Metrics Implementation #59

Open
wants to merge 8 commits into
base: main

yisz
Contributor

@yisz yisz commented May 16, 2024

Pull Request Description

Summary

This pull request introduces a new SQL AST comparison metric to the continuous-eval repository. The new metric, SQLASTSimilarity, compares SQL queries using Abstract Syntax Tree (AST) similarity, leveraging the sqlglot library.

Changes

  • Added the SQLASTSimilarity class to the code_deterministic_metrics.py file.
  • Imported the diff and parse_one functions from the sqlglot library.
  • Imported the Keep class from the sqlglot.diff module.
  • Implemented the __call__ method in the SQLASTSimilarity class to parse SQL queries into ASTs and calculate similarity scores.
  • Implemented the _calculate_similarity method in the SQLASTSimilarity class to score two ASTs. It uses the diff function to get the differences between the trees, counts the total changes, and counts the total number of nodes in both trees. The similarity score is 1 - (total_changes / total_nodes).
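
The scoring described in the list above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the function names `similarity_score` and `sql_ast_similarity` are hypothetical, the node count is taken as the sum over both trees (one reading of the description), and the clamping and empty-tree handling are assumptions added for safety. Only `diff`, `parse_one`, and `Keep` come from sqlglot, as named in the description.

```python
def similarity_score(total_changes: int, total_nodes: int) -> float:
    """Return 1 - (total_changes / total_nodes), clamped to be non-negative."""
    if total_nodes <= 0:
        # Assumption: an empty comparison is treated as no similarity.
        return 0.0
    return max(0.0, 1.0 - total_changes / total_nodes)


def sql_ast_similarity(answer: str, ground_truth: str) -> float:
    """Parse both queries and score them by the share of unchanged AST nodes."""
    # Imported lazily so similarity_score stays usable without sqlglot installed.
    from sqlglot import diff, parse_one
    from sqlglot.diff import Keep

    source = parse_one(answer)
    target = parse_one(ground_truth)

    # diff returns a list of edit operations; Keep marks a node left unchanged.
    edits = diff(source, target)
    total_changes = sum(1 for edit in edits if not isinstance(edit, Keep))

    # Count the nodes of both trees by walking each AST.
    total_nodes = sum(1 for _ in source.walk()) + sum(1 for _ in target.walk())
    return similarity_score(total_changes, total_nodes)
```

Under this sketch, two identical queries should score 1.0 because every edit reported by the diff is a Keep. Queries that sqlglot cannot parse raise an error from parse_one, so a caller would need to decide how to score invalid input.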

Testing

  • Created a new test file, test_code_deterministic_metrics.py, with unit tests for the SQLASTSimilarity class.
  • Added test methods to validate the functionality of the SQLASTSimilarity class, including tests for exact match, different queries, similar queries, and invalid queries.
  • Ran the tests using pytest, and all tests passed successfully.

Link to Devin run

https://preview.devin.ai/devin/696032ba45654233968d6a04f2bc5df3

Request for Review

Please review the changes and provide feedback. If everything looks good, kindly approve the pull request for merging.

Thank you!

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to da8a897 in 3 minutes and 30 seconds

More details
  • Looked at 43 lines of code in 1 file.
  • Skipped 1 file when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_SbrHuyQL59T55GNV


# Compare the formatted answer with each formatted ground truth answer
for formatted_gt in formatted_ground_truths:
    # Simple string comparison for now, can be improved with more sophisticated methods
    match_score = float(formatted_answer == formatted_gt)
Contributor

Consider implementing a more sophisticated comparison method than simple string equality to handle cases where SQL queries might be functionally identical but differ in formatting or syntax. This could improve the robustness of the syntactic similarity evaluation.
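
One lightweight way to relax strict string equality, sketched here with the standard library only, is to normalize both queries before comparing them. This is an assumption-laden illustration rather than anything in the PR: `normalize_sql` and `queries_match` are hypothetical names, and the normalization only collapses whitespace, lowercases text outside single-quoted literals, and drops a trailing semicolon. It does not detect queries that are equivalent but structured differently.

```python
import re

# Matches single-quoted SQL string literals, including doubled '' escapes.
_LITERAL = re.compile(r"('(?:[^']|'')*')")


def normalize_sql(query: str) -> str:
    """Collapse whitespace and lowercase everything outside string literals."""
    parts = _LITERAL.split(query.strip().rstrip(";"))
    normalized = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indices are the captured literals; keep them untouched.
            normalized.append(part)
        else:
            normalized.append(re.sub(r"\s+", " ", part).lower())
    return "".join(normalized).strip()


def queries_match(answer: str, ground_truth: str) -> float:
    """Return 1.0 if the normalized queries are equal, else 0.0."""
    return float(normalize_sql(answer) == normalize_sql(ground_truth))
```

With this sketch, `SELECT  name FROM users;` and `select name from users` compare as equal, while a difference inside a quoted literal still counts as a mismatch.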

Contributor Author

@ellipsis-dev come up with a few more sophisticated ways to handle functionally identical cases

Contributor

@yisz, I have addressed your comments in pull request #60


Contributor

ellipsis-dev bot commented May 16, 2024

This is a cross repository pull request, but Ellipsis isn't installed in yisz/continuous-eval-locale. In order to have Ellipsis address comments in this PR, you'll need to install Ellipsis in that repository.

Contributor Author

yisz commented May 17, 2024

@pantonante check to see if Devin's work is good enough. It added tests / documentation as well.

It uses the sqlparse library to format and compare the SQL queries.
"""

def __call__(self, answer: str, ground_truth_answers: Union[List[str], str]):

Contributor

@pantonante pantonante left a comment

This implementation is too strict; I think we can do better.
