Author affiliations missing from Result.Authors #62

Open
lukasschwab opened this issue Apr 18, 2021 · 0 comments
Labels
wontfix Issues that will not be resolved.

Comments

@lukasschwab
Owner

lukasschwab commented Apr 18, 2021

Description

A clear and concise description of what the bug is.

Author affiliations are available in raw arXiv API feeds, but are not exposed by this package's Result objects.

Steps to reproduce

Steps to reproduce the behavior; ideally, include a code snippet.

Apparent for any result set.

  • There's no mention of affiliations in this package's documentation or in the source code.
  • (Result)._raw.arxiv_affiliation is often defined, but it's a single string: the affiliation of one author among several.
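
To illustrate what the raw feed carries, here is a minimal sketch that reads per-author affiliations straight from an Atom entry using only the standard library. The sample entry, author names, and institutes are made up for illustration, and `affiliations_by_author` is a hypothetical helper, not part of this package. It assumes affiliations appear as `arxiv:affiliation` children of each `author` element, in the `http://arxiv.org/schemas/atom` namespace.

```python
import xml.etree.ElementTree as ET

# Namespaces used by Atom feeds and by arXiv's extension elements.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Hypothetical entry: the names and institutes are invented for illustration.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <title>Hypothetical paper</title>
  <author>
    <name>Author One</name>
    <arxiv:affiliation>Institute A</arxiv:affiliation>
  </author>
  <author>
    <name>Author Two</name>
    <arxiv:affiliation>Institute B</arxiv:affiliation>
    <arxiv:affiliation>Institute C</arxiv:affiliation>
  </author>
  <author>
    <name>Author Three</name>
  </author>
</entry>
"""


def affiliations_by_author(entry_xml):
    """Return (author name, list of affiliations) pairs in feed order."""
    entry = ET.fromstring(entry_xml)
    pairs = []
    for author in entry.findall("atom:author", NS):
        name = author.findtext("atom:name", default="", namespaces=NS)
        # An author may have zero, one, or several affiliation elements.
        affiliations = [
            el.text or "" for el in author.findall("arxiv:affiliation", NS)
        ]
        pairs.append((name, affiliations))
    return pairs


for name, affiliations in affiliations_by_author(SAMPLE_ENTRY):
    print(name, affiliations)
```

Each author keeps their own list here, which is the association that gets lost when the entry is collapsed to one `arxiv_affiliation` string.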

Expected behavior

A clear and concise description of what you expected to happen.

Author affiliations should be exposed by the Result.Author class.

Versions

  • python version: *
  • arxiv.py version: >= 1.0.0

Additional context

Add any other context about the problem here.

This is a long-open issue in feedparser, perhaps open since 2015: kurtmckee/feedparser#24. There's a detailed breakdown of the interaction with arXiv results here: kurtmckee/feedparser#145 (comment). I suspect arXiv will release their JSON API, and this client library will be rewritten to use it, before this feedparser bug is resolved.

This client library could expose the single author affiliation extracted by feedparser, but this has negative impacts:

  • It may misleadingly suggest that a certain author or institution led the publication in question, which sucks from an ethical perspective.
  • Which affiliation is extracted may depend on the order of the authors, which arXiv may not guarantee. The extracted affiliation of a paper may vary.
  • The affiliation may not apply to all of the authors for a paper; exposing it is misleading.

If the single author affiliation is useful in your application, despite the noted downsides, access it with (Result)._raw.get('arxiv_affiliation').
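
A defensive way to wrap that access might look like the sketch below. `single_affiliation` is a hypothetical helper, not something this package provides, and the `SimpleNamespace` object is only a stand-in for a real Result so the snippet runs without a network call. It assumes `_raw` behaves like a dict, as the `.get(...)` call above implies.

```python
from types import SimpleNamespace


def single_affiliation(result):
    """Return the one affiliation string feedparser extracted, or None.

    The value describes a single author only; it should not be read as
    the affiliation of every author on the paper.
    """
    raw = getattr(result, "_raw", None)
    if raw is None:
        return None
    return raw.get("arxiv_affiliation")


# Stand-ins for Result objects (hypothetical data, for illustration only).
with_affiliation = SimpleNamespace(_raw={"arxiv_affiliation": "Institute A"})
without_affiliation = SimpleNamespace(_raw={})

print(single_affiliation(with_affiliation))
print(single_affiliation(without_affiliation))
```

Because `_raw` is a private attribute, this may break in a future release; treating a missing value as `None` keeps calling code from failing when the field is absent.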

@lukasschwab lukasschwab added the bug Deviations from documented behavior. label Apr 18, 2021
@lukasschwab lukasschwab self-assigned this Apr 18, 2021
@lukasschwab lukasschwab added wontfix Issues that will not be resolved. and removed bug Deviations from documented behavior. labels Apr 18, 2021
@lukasschwab lukasschwab removed their assignment May 2, 2021