feat: Add SemanticScholarToolkits to integrate Semantic Scholar to camel #1493

Merged
merged 22 commits into from
Feb 6, 2025

Conversation

renxinxing123
Contributor

Description

This PR introduces a new toolkit called SemanticScholarToolkits to integrate Semantic Scholar into CAMEL. It supports searching for papers by paper ID, by paper title, and by keywords, retrieving recommended papers for a given paper ID, and looking up an author by author ID. However, although the Semantic Scholar API can also search datasets, that feature was non-responsive in my testing.

Motivation and Context

Integrating Semantic Scholar into CAMEL enhances its ability to access and process academic papers and resources. This integration will make CAMEL more versatile for research-related tasks by leveraging the rich academic resources and powerful search capabilities of Semantic Scholar. This change addresses the feature request in issue #1032.
[ ] I have raised an issue to propose this change (#1032)

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds core functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
[ ] Documentation (update in the documentation)
[ ] Example (update in the folder of example)

Implemented Tasks

[x] Implement search paper by paper ID
[x] Implement search paper by paper title
[x] Implement search papers by keywords
[x] Implement retrieve recommended papers by paper ID
[x] Implement search author by author ID

Checklist

[x] I have read the CONTRIBUTION guide. (required)
[ ] My change requires a change to the documentation.
[x] I have updated the tests accordingly. (required for a bug fix or a new feature)
[ ] I have updated the documentation accordingly.

@Wendong-Fan Wendong-Fan changed the title Add SemanticScholarToolkits to integrate Semantic Scholar to camel feat: Add SemanticScholarToolkits to integrate Semantic Scholar to camel Jan 23, 2025
@Wendong-Fan Wendong-Fan added this to the Sprint 21 milestone Jan 23, 2025
@Wendong-Fan Wendong-Fan linked an issue Jan 23, 2025 that may be closed by this pull request
2 tasks
import json

class SemanticScholarToolkit(BaseToolkit):
    """A toolkit for interacting with the Semantic Scholar API to fetch paper and author data."""
Collaborator

Suggested change
"""A toolkit for interacting with the Semantic Scholar API to fetch paper and author data."""
r"""A toolkit for interacting with the Semantic Scholar API to fetch paper and author data."""

"""A toolkit for interacting with the Semantic Scholar API to fetch paper and author data."""

def __init__(self):
    """Initializes the SemanticScholarToolkit."""
Collaborator

Suggested change
"""Initializes the SemanticScholarToolkit."""
r"""Initializes the SemanticScholarToolkit."""

papers = response.json().get("recommendedPapers", [])
papers.sort(key=lambda paper: paper["citationCount"], reverse=True)
with open('recommended_papers_sorted.json', 'w') as output:
    json.dump(papers, output)
Collaborator

Shall we make writing to a local JSON file optional?
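
One way to address this, sketched under assumptions: gate the file write behind a flag that defaults to off, and always return the sorted list. The function name `sort_recommendations` and the parameters `save_to_file` and `path` are hypothetical and not taken from the PR; only the `recommendedPapers` and `citationCount` keys and the output file name come from the snippet above.

```python
import json
from typing import Any, Dict, List


def sort_recommendations(
    payload: Dict[str, Any],
    save_to_file: bool = False,  # hypothetical flag; off by default
    path: str = "recommended_papers_sorted.json",
) -> List[Dict[str, Any]]:
    """Return recommended papers sorted by citation count, highest first."""
    papers = list(payload.get("recommendedPapers", []))
    # Treat a missing or null citation count as 0 so sorting never fails.
    papers.sort(key=lambda paper: paper.get("citationCount") or 0, reverse=True)

    # Only touch the filesystem when the caller explicitly asks for it.
    if save_to_file:
        with open(path, "w", encoding="utf-8") as output:
            json.dump(papers, output, indent=2)
    return papers
```

With this shape, callers that only need the data never create a file as a side effect, and the returned list is the same whether or not the file is written.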

Collaborator

@harryeqs harryeqs left a comment

Thanks @renxinxing123 ! Please run and fix the pre-commit test, and make sure that the code and docstring are formatted correctly. Could you also add the unit test and examples under the test and examples folder please? Thanks a lot!

@renxinxing123
Contributor Author

Many thanks for your comment, @harryeqs! I will fix the error.

@harryeqs
Collaborator

Thanks! It seems there are some remaining formatting problems. You can run the pre-commit checks locally with the following commands:

# Install camel from source
poetry install --with dev,docs -E all  # (Suggested for developers, needed to pass all tests)

# The following command installs a pre-commit hook into the local git repo,
# so every commit gets auto-formatted and linted.
pre-commit install

# Run camel's pre-commit before push
pre-commit run --all-files

For other contributing guidelines please refer to: https://github.com/camel-ai/camel/blob/master/CONTRIBUTING.md

@renxinxing123
Contributor Author

Thank you, @harryeqs! I followed your suggestion, reformatted the Semantic Scholar toolkits, and added the related test and example files. All files passed the pre-commit tests on my local machine, but it seems that several tests didn’t pass during the PR. However, the error messages in these tests are unrelated to the newly added files.

@harryeqs
Collaborator

Happy Chinese New Year! Thank you very much for the contribution, @renxinxing123. Sorry for getting back to you so late; I was working on other tasks over the past few days.
The only improvement I can think of is to make the JSON file writing optional (or remove it), since the data is already present in the returned dictionary. Everything else looks good to me. Thanks!

"""
url = f"{self.base_url}/paper/search"
query_params = {"query": paperTitle, "fields": fields}
response = requests.get(url, params=query_params)
Collaborator

It might be better if we implement error handling here

Contributor Author

Hi @AveryYay ! Thanks for your suggestion! I noticed that the code already includes error handling for the case where the response status code is not 200, and it returns an error message accordingly.

Could you clarify if you're suggesting a different type of error handling?

Collaborator

@AveryYay AveryYay Feb 3, 2025

Checking the status code works only if requests.get() returns successfully. Adding a try-except would prevent crashes when the request itself fails, for example because of connectivity problems. There could also be cases where the response body isn't valid JSON.
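
A minimal sketch of the handling described here, not the PR's actual code. It uses the standard library's `urllib` rather than `requests` so that it runs without third-party packages; the function name `fetch_json`, the `opener` parameter, and the shape of the error dictionaries are assumptions made for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict


def fetch_json(
    url: str,
    params: Dict[str, str],
    opener: Callable[..., Any] = urllib.request.urlopen,  # injectable for tests
    timeout: float = 10.0,
) -> Any:
    """Fetch a URL and decode JSON, returning an error dict on failure."""
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
    try:
        with opener(full_url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-success status code.
        return {"error": f"HTTP {exc.code}", "message": str(exc.reason)}
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return {"error": "Request failed", "message": str(exc)}

    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # The request succeeded but the body is not valid JSON.
        return {"error": "Invalid JSON", "message": str(exc)}
```

With `requests`, the same structure applies: connection problems surface as `requests.exceptions.RequestException`, and `response.json()` raises a `ValueError` subclass when the body is malformed.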

Contributor Author

Thank you for your explanation @AveryYay ! I've updated the SemanticScholarToolkit to improve error handling, adding support for request failures and invalid JSON responses. And the corresponding test file has been updated to correctly mock and validate error responses.
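
As an illustration of how such error paths can be tested without network access, the sketch below patches the HTTP call with `unittest.mock`. `get_paper` is a hypothetical stand-in written with `urllib`, not the toolkit's method, and the URL is assumed to be the public Semantic Scholar Graph API endpoint.

```python
import json
import urllib.error
import urllib.request
from unittest import mock


def get_paper(paper_id: str) -> dict:
    """Hypothetical stand-in for a toolkit method (not the PR's code)."""
    # Assumed endpoint; the PR builds its URLs from self.base_url.
    url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.URLError as exc:
        return {"error": "Request failed", "message": str(exc)}
    except json.JSONDecodeError as exc:
        return {"error": "Invalid JSON", "message": str(exc)}


def check_network_failure() -> dict:
    # side_effect makes the patched call raise instead of returning.
    with mock.patch(
        "urllib.request.urlopen",
        side_effect=urllib.error.URLError("offline"),
    ):
        return get_paper("abc")


def check_invalid_json() -> dict:
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        # The mocked context manager yields an object whose read()
        # returns bytes that are not valid JSON.
        fake_response = fake_urlopen.return_value.__enter__.return_value
        fake_response.read.return_value = b"not json"
        return get_paper("abc")
```

Both helpers return the error dictionary produced by `get_paper`, so a test can assert on the `error` key without ever contacting the API.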

@renxinxing123
Contributor Author

Happy Chinese New Year, @harryeqs! No problem at all; I hope you had a good time with your family and loved ones! Writing the JSON file is now optional and defaults to False. A user can enable it with a prompt such as 'Please search for information on author xx and save it to a JSON file'.

Member

@Wendong-Fan Wendong-Fan left a comment

Thanks, @renxinxing123! Overall, it looks great. The format could be improved further. I'll merge this PR and then create another enhancement PR for it. Next time, please create the branch directly in the camel repo for easier access by our core members. Thanks again for your contribution!

Enhancement PR based on the review comments: https://github.com/camel-ai/camel/pull/1562/files (feel free to check it out!)

Comment on lines +25 to +26
"""A toolkit for interacting with the Semantic Scholar
API to fetch paper and author data."""
Member

docstring format

Suggested change
"""A toolkit for interacting with the Semantic Scholar
API to fetch paper and author data."""
r"""A toolkit for interacting with the Semantic Scholar
API to fetch paper and author data.
"""

Comment on lines +35 to +36
fields: str = """title,abstract,authors,year,citationCount,
publicationTypes,publicationDate,openAccessPdf""",
Member

We could have a better way to handle the fields input.
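
A possible direction, as a sketch only: accept the fields as a list of names and join them when building the query, which avoids the newline and indentation that a triple-quoted default embeds in the string. `DEFAULT_FIELDS` and `build_fields_param` are hypothetical names; the field names themselves are the ones in the default above.

```python
from typing import List, Optional

# Field names copied from the default shown above.
DEFAULT_FIELDS: List[str] = [
    "title",
    "abstract",
    "authors",
    "year",
    "citationCount",
    "publicationTypes",
    "publicationDate",
    "openAccessPdf",
]


def build_fields_param(fields: Optional[List[str]] = None) -> str:
    """Join field names into the comma-separated form used in the query."""
    selected = DEFAULT_FIELDS if fields is None else fields
    # Strip stray whitespace and skip empty entries before joining.
    return ",".join(name.strip() for name in selected if name.strip())
```

The resulting string can then be passed as the `fields` query parameter, and callers can pass a shorter list when they only need a few attributes.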

Comment on lines +85 to +86
dict: The response data from the API or error information
if the request fails.
Member

docstring format

Suggested change
dict: The response data from the API or error information
if the request fails.
dict: The response data from the API or error information
if the request fails.

@Wendong-Fan Wendong-Fan merged commit b35145a into camel-ai:master Feb 6, 2025
1 of 6 checks passed
Member

@lightaime lightaime left a comment

Thanks @renxinxing123! The PR looks great. I just left a couple of comments about removing BaseMessage, since we are deprecating it.

Also do we need an API key for this?

@Wendong-Fan
Member

Hey @lightaime, it doesn't require an API key.


Successfully merging this pull request may close these issues.

[Feature Request] Integrate Semantic scholar
5 participants