arxiv.Search returns empty result #119

Closed
lilycyf opened this issue Jul 12, 2023 · 9 comments

lilycyf commented Jul 12, 2023

Description

I only get an empty result when running the following code, and I'm wondering why:

import arxiv

search = arxiv.Search(
  query = "quantum",
  max_results = 10,
  sort_by = arxiv.SortCriterion.SubmittedDate
)

for result in search.results():
  print(result.title)

However, I do get a result with the following code:

search = arxiv.Search(id_list=["1605.08386v1"])
paper = next(search.results())
print(paper.title)

Versions

  • python version: 3.9.7
  • arxiv.py version: 1.4.8
@lilycyf added the "bug" label (Deviations from documented behavior.) on Jul 12, 2023

lilycyf commented Jul 12, 2023

arxiv.py version: 1.4.7 works for me

@lukasschwab
Owner

@lilycyf thanks for the bug report. I can reproduce. Looking into it!

@lukasschwab
Owner

@lilycyf my first guess here is actually that the underlying arXiv API is misbehaving.

Yesterday the test suite passed (locally and in CI) for the commit that'd become 1.4.8. Today the test suite fails for the same commit:

~/Pr/arxiv.py(3d013ab) » make test
pytest
===================================== test session starts ======================================
platform darwin -- Python 3.10.4, pytest-7.3.1, pluggy-1.0.0 -- /Users/lukas/.pyenv/versions/3.10.4/bin/python3.10
cachedir: .pytest_cache
rootdir: /Users/lukas/Programming/arxiv.py
configfile: setup.cfg
collected 24 items

tests/test_api_bugs.py::TestClient::test_missing_title PASSED                            [  4%]
tests/test_client.py::TestClient::test_invalid_format_id PASSED                          [  8%]
tests/test_client.py::TestClient::test_invalid_id PASSED                                 [ 12%]
tests/test_client.py::TestClient::test_max_results FAILED                                [ 16%]
tests/test_client.py::TestClient::test_no_duplicates PASSED                              [ 20%]
tests/test_client.py::TestClient::test_nonexistent_id_in_list PASSED                     [ 25%]
tests/test_client.py::TestClient::test_offset PASSED                                     [ 29%]
tests/test_client.py::TestClient::test_query_page_count FAILED                           [ 33%]
tests/test_client.py::TestClient::test_retry PASSED                                      [ 37%]
tests/test_client.py::TestClient::test_search_results_offset PASSED                      [ 41%]
tests/test_client.py::TestClient::test_sleep_between_errors PASSED                       [ 45%]
tests/test_client.py::TestClient::test_sleep_elapsed PASSED                              [ 50%]
tests/test_client.py::TestClient::test_sleep_multiple_requests PASSED                    [ 54%]
tests/test_client.py::TestClient::test_sleep_standard PASSED                             [ 58%]
tests/test_client.py::TestClient::test_sleep_zero_delay PASSED                           [ 62%]
tests/test_download.py::TestDownload::test_download_from_query PASSED                    [ 66%]
tests/test_download.py::TestDownload::test_download_tarfile_from_query PASSED            [ 70%]
tests/test_download.py::TestDownload::test_download_with_custom_slugify_from_query PASSED [ 75%]
tests/test_result.py::TestResult::test_eq PASSED                                         [ 79%]
tests/test_result.py::TestResult::test_from_feed_entry FAILED                            [ 83%]
tests/test_result.py::TestResult::test_get_short_id PASSED                               [ 87%]
tests/test_result.py::TestResult::test_legacy_ids PASSED                                 [ 91%]
tests/test_result.py::TestResult::test_result_shape FAILED                               [ 95%]
tests/test_result.py::TestResult::test_to_datetime PASSED                                [100%]

The test suite fails in the same way if I run it for tagged version 1.4.7:

~/Pr/arxiv.py(3d013ab) » git checkout 1.4.7                                                 2 ↵
Previous HEAD position was 3d013ab Simplify `pdoc` build, eliminate nav badges (#115)
HEAD is now at 1df844f Indicate Python version in trove classifiers (#112)
~/Pr/arxiv.py(1df844f) » make test
pytest
===================================== test session starts ======================================
platform darwin -- Python 3.10.4, pytest-7.3.1, pluggy-1.0.0 -- /Users/lukas/.pyenv/versions/3.10.4/bin/python3.10
cachedir: .pytest_cache
rootdir: /Users/lukas/Programming/arxiv.py
configfile: setup.cfg
collected 24 items

tests/test_api_bugs.py::TestClient::test_missing_title PASSED                            [  4%]
tests/test_client.py::TestClient::test_invalid_format_id PASSED                          [  8%]
tests/test_client.py::TestClient::test_invalid_id PASSED                                 [ 12%]
tests/test_client.py::TestClient::test_max_results FAILED                                [ 16%]
tests/test_client.py::TestClient::test_no_duplicates PASSED                              [ 20%]
tests/test_client.py::TestClient::test_nonexistent_id_in_list PASSED                     [ 25%]
tests/test_client.py::TestClient::test_offset PASSED                                     [ 29%]
tests/test_client.py::TestClient::test_query_page_count FAILED                           [ 33%]
tests/test_client.py::TestClient::test_retry PASSED                                      [ 37%]
tests/test_client.py::TestClient::test_search_results_offset PASSED                      [ 41%]
tests/test_client.py::TestClient::test_sleep_between_errors PASSED                       [ 45%]
tests/test_client.py::TestClient::test_sleep_elapsed PASSED                              [ 50%]
tests/test_client.py::TestClient::test_sleep_multiple_requests PASSED                    [ 54%]
tests/test_client.py::TestClient::test_sleep_standard PASSED                             [ 58%]
tests/test_client.py::TestClient::test_sleep_zero_delay PASSED                           [ 62%]
tests/test_download.py::TestDownload::test_download_from_query PASSED                    [ 66%]
tests/test_download.py::TestDownload::test_download_tarfile_from_query PASSED            [ 70%]
tests/test_download.py::TestDownload::test_download_with_custom_slugify_from_query PASSED [ 75%]
tests/test_result.py::TestResult::test_eq PASSED                                         [ 79%]
tests/test_result.py::TestResult::test_from_feed_entry FAILED                            [ 83%]
tests/test_result.py::TestResult::test_get_short_id PASSED                               [ 87%]
tests/test_result.py::TestResult::test_legacy_ids PASSED                                 [ 91%]
tests/test_result.py::TestResult::test_result_shape FAILED                               [ 95%]
tests/test_result.py::TestResult::test_to_datetime PASSED                                [100%]

You wrote: "arxiv.py version: 1.4.7 works for me"

It does not work for me. Mind sharing some more details on how you tested this? Thanks!

@lukasschwab
Owner

@lilycyf while investigating, the test suite started passing again. The example code in your initial issue also works. Version 1.4.8 should work as well for you as 1.4.7.

Thanks again for reporting — seems this was a brief issue on arXiv's side.

@lukasschwab added the "api" label (Issues that correspond to arXiv API behavior rather than behavior introduced by this wrapper.) on Jul 12, 2023
@Animadversio

To add my observation: the same search code sometimes works reliably, then yields empty results for a while (5-10 minutes), and then recovers. This cycle repeated 3-4 times tonight on my side.

It seems the arXiv API server breaks down or stops answering requests during some periods.
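
If the empty results really are transient, one possible client-side workaround is to retry with a growing delay. The following is only a sketch: `fetch_with_retry`, its parameters, and the delays are hypothetical names and values chosen for illustration, not part of arxiv.py.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], List[T]],
    attempts: int = 5,
    initial_delay: float = 3.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call `fetch` until it returns a non-empty list or attempts run out.

    Hypothetical helper: it treats an empty list as a transient failure,
    which is only an assumption about the behavior described in this thread.
    """
    delay = initial_delay
    results: List[T] = []
    for attempt in range(attempts):
        results = fetch()
        if results:
            return results
        # Wait before the next attempt, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= backoff
    return results
```

With the `search` object from the original report, this could be called as `fetch_with_retry(lambda: list(search.results()))`. Note that a query with genuinely zero matches would also be retried, so the attempt count should stay small.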

@AllenWrong

I also have this issue. However, while debugging my code, I found that I cannot step into the function 'self._results' from the following line:

return itertools.islice(self._results(search, offset), limit)

Is this a problem?

@lukasschwab
Owner

@AllenWrong can you share a code snippet that reproduces the issue you're encountering? You shouldn't have to call self._results directly.

The integration tests are stable. These issues are most likely caused by temporary instability in the arXiv API service itself.
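
On the debugger observation above: one possible explanation, assuming `_results` is a generator function (not confirmed in this thread), is that Python does not run a generator's body when it is called, only when the returned iterator is consumed. A debugger therefore has nothing to step into on the `itertools.islice(...)` line itself. The sketch below uses a stand-in `_results`, not the library's implementation.

```python
import itertools
from typing import Iterator, List

calls: List[str] = []


def _results(start: int) -> Iterator[int]:
    # Stand-in generator (hypothetical): records when its body starts running.
    calls.append("entered")
    n = start
    while True:
        yield n
        n += 1


# Calling the generator function and wrapping it in islice runs none of its body.
limited = itertools.islice(_results(0), 3)
entered_before_iteration = list(calls)

# The body first executes here, when the iterator is consumed.
values = list(limited)
```

Under that assumption, a breakpoint set inside the function body is hit once the caller starts iterating, for example in a `for` loop over the results.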

@AllenWrong

@lukasschwab I solved this by changing query_url_format = "https://export.arxiv.org/api/query?{}" to query_url_format = "http://export.arxiv.org/api/query?{}". It is surprising.
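
To check whether the two schemes really behave differently at a given moment, the same query can be sent to both endpoints outside the wrapper. This is a rough sketch using only the standard library: the helper names are hypothetical, and the only assumptions about the API are its `search_query`, `start`, and `max_results` parameters and its Atom response format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"
QUERY_URL_FORMAT = "{scheme}://export.arxiv.org/api/query?{params}"


def build_query_url(scheme: str, search_query: str, max_results: int = 1) -> str:
    """Build an API query URL for the given scheme ("http" or "https")."""
    params = urlencode(
        {"search_query": search_query, "start": 0, "max_results": max_results}
    )
    return QUERY_URL_FORMAT.format(scheme=scheme, params=params)


def count_entries_in_feed(feed: bytes) -> int:
    """Count the Atom <entry> elements in a response body."""
    root = ET.fromstring(feed)
    return len(root.findall(ATOM_ENTRY))


def count_entries(url: str, timeout: float = 30.0) -> int:
    """Fetch the feed at `url` (performs a network request) and count entries."""
    with urlopen(url, timeout=timeout) as response:
        return count_entries_in_feed(response.read())
```

Comparing `count_entries(build_query_url("http", "quantum"))` with the `"https"` variant shows whether only one scheme is returning an empty feed. Allow a few seconds between the two requests to stay polite to the API.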

@lukasschwab
Owner

@AllenWrong yes, I observed HTTP/HTTPS behavior differences the last time this came up: #129

I think the arXiv folks might need to restart a server. I'll see if I can drop them a line.
