Malformed EUPMC query doesn't terminate, nor give error message #143

Open
rossmounce opened this issue Dec 6, 2016 · 5 comments

@rossmounce
Member

Perhaps related to #114: I noticed recently that a subtly malformed EUPMC query gets stuck at "Searching using eupmc API" without ever terminating or giving an error message. This could lead some users to wait 10 minutes, assuming a big query is loading.

The user error causing this is a missing colon after the field name. The query should contain (FIRST_PDATE:[2013-01-01 TO 2016-12-05]) but instead (FIRST_PDATE[2013-01-01 TO 2016-12-05]) is given.

It would be nice (although perhaps hard) if this could be detected and some kind of informative error message supplied.
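
For illustration only, here is a minimal sketch of what such a client-side check might look like. It is written in Python for brevity and is not getpapers code; the function name `find_missing_colons` and the regular expression are assumptions, and the pattern is a heuristic that simply flags an upper-case field name followed directly by `[`. It does not know the real list of EuPMC search fields.

```python
import re
from typing import List

# Heuristic: an upper-case field name (letters and underscores) that is
# immediately followed by "[" with no ":" in between.
MISSING_COLON = re.compile(r"\b([A-Z][A-Z_]*)\[")


def find_missing_colons(query: str) -> List[str]:
    """Return the field names in `query` that are followed by '[' without ':'."""
    return MISSING_COLON.findall(query)


bad = '"arbuscular mycorrhizae" (FIRST_PDATE[2013-01-01 TO 2016-12-05])'
good = '"arbuscular mycorrhizae" (FIRST_PDATE:[2013-01-01 TO 2016-12-05])'

for query in (bad, good):
    fields = find_missing_colons(query)
    if fields:
        print("error: missing ':' after field(s): " + ", ".join(fields))
    else:
        print("ok: no missing colons detected")
```

A check like this only catches this one class of mistake; anything else would still depend on how the server reports a bad query.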

getpapers -V
0.4.10

getpapers -q '"arbuscular mycorrhizae" (FIRST_PDATE[2013-01-01 TO 2016-12-05])' -o malformedquery
info: Searching using eupmc API
@blahah
Member

blahah commented Dec 6, 2016

I think we have to rely on EuPMC to tell us that a query is malformed. It's possible that they already do this and we aren't handling the response properly.

@tarrow
Contributor

tarrow commented Dec 6, 2016

How long did it run for?

Basically the problem we have/had is that (with a changing likelihood of it happening) EuPMC sometimes responds with an error page or simply no results when there are actually results to be delivered.

For this reason we now retry very aggressively until we get a successful response, though not forever. I would expect it to fail eventually, probably in under 10 minutes but perhaps longer.

I'd like to retry less aggressively but I was going to wait and see if we have a few more months of relative stability before I turn it down.
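
The retry-until-success-but-not-forever behaviour described above can be sketched roughly as follows. This is a hedged illustration in Python, not the getpapers implementation: `fetch_with_retries`, `max_retries`, `base_delay` and the `MalformedResponse` exception are hypothetical names, and the exponential backoff between attempts is an assumption, since the thread does not say how the retries are spaced.

```python
import time
from typing import Callable, Optional


class MalformedResponse(Exception):
    """Raised when no usable response arrives within the retry budget."""


def fetch_with_retries(fetch: Callable[[], Optional[dict]],
                       max_retries: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until it returns a dict, at most 1 + max_retries times."""
    for attempt in range(max_retries + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_retries:
            # Assumed policy: wait 1s, 2s, 4s, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    raise MalformedResponse(
        f"no usable response after {max_retries} retries; "
        "perhaps the query is wrong"
    )


# Demo with a fake endpoint that fails twice and then succeeds.
# No real network access and no real sleeping.
attempts = []


def flaky_fetch() -> Optional[dict]:
    attempts.append(1)
    return {"results": []} if len(attempts) == 3 else None


print(fetch_with_retries(flaky_fetch, sleep=lambda seconds: None))
```

A small retry budget makes a malformed query fail quickly, at the cost of giving up sooner when the service is genuinely flaky, which is the trade-off discussed in this comment.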

@rossmounce
Member Author

rossmounce commented Dec 6, 2016

My impression was that it would run indefinitely. But I am running it again now with time to see if it does in fact terminate at some point...

@rossmounce
Member Author

Apologies, it seems I was being impatient:

time getpapers -q '"arbuscular mycorrhizae" (FIRST_PDATE[2013-01-01 TO 2016-12-05])' -o malformedquery
info: Searching using eupmc API
warn: We had to retry the last request 50 times.
error: Malformed or empty response from EuropePMC. Try running again. Perhaps your query is wrong.

real	4m11.392s
user	0m1.376s
sys	0m0.104s

@tarrow
Contributor

tarrow commented Jun 23, 2017

I'll now knock this down to 5 retries.

The aggressive retrying is now causing a problem of its own, because EuPMC is now reporting an incorrect number of results even within a single page. For example, say it reports 10 results but only returns 9: we then retry forever trying to get that 'last' one that isn't there.

I reported this bug to them before but obviously it has reoccurred.
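
One way to avoid chasing a result that never arrives is to stop when the server returns nothing new, instead of looping until the reported count is reached. The sketch below is an assumption-laden illustration in Python, not how getpapers pages through results: `collect_results`, `fetch_page`, `reported_total` and `max_pages` are hypothetical names, and page-number pagination is used only to keep the example short.

```python
from typing import Callable, List


def collect_results(fetch_page: Callable[[int], List[str]],
                    reported_total: int,
                    max_pages: int = 1000) -> List[str]:
    """Gather results page by page.

    Stops when the reported total is reached, when a page comes back empty,
    or after max_pages pages, whichever happens first.
    """
    results: List[str] = []
    for page in range(max_pages):
        if len(results) >= reported_total:
            break
        batch = fetch_page(page)
        if not batch:
            # The server has nothing more to give, whatever the count said.
            break
        results.extend(batch)
    return results


# Demo: the server claims 10 results but only ever delivers 9.
pages = [[f"PMC{i}" for i in range(9)]]


def fake_fetch_page(page: int) -> List[str]:
    return pages[page] if page < len(pages) else []


found = collect_results(fake_fetch_page, reported_total=10)
missing = 10 - len(found)
if missing > 0:
    print(f"warn: expected 10 results but received {len(found)}")
```

Treating an empty page as the end is itself a judgment call, because earlier in this thread an empty response was sometimes a transient failure. In practice it would probably be combined with a small, bounded number of retries per page.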
