BAIs get incorrectly added to literature records from author.xml #460
Labels: cold box (waiting for a 3rd party, or not possible at the moment), project: next, type: bug (something isn't working)
The code that extracts identifiers from author.xml files blindly trusts every identifier it finds and tries to add it to the author (causing a validation error later down the line if the identifier type is unknown). This is fine for identifiers like ORCID, but not for INSPIRE BAIs, which have been removed from literature records and are now supposed to be generated dynamically from the linked author record during serialization (see https://github.com/inspirehep/inspirehep/blob/4d514d4a046819ef984defc0435c413f3d90ce10/backend/inspirehep/records/marshmallow/literature/common/author.py#L62-L78).
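A minimal sketch of the kind of filtering the extraction step could apply, not the actual extraction code. It assumes each extracted identifier is a dict with "schema" and "value" keys and that BAIs carry the schema name "INSPIRE BAI"; the helper name filter_author_ids and the constant DYNAMIC_ID_SCHEMAS are hypothetical.

```python
# Hypothetical helper, for illustration only: the real author.xml
# extraction code may be structured differently.

# Assumption: identifier schemas that must never be stored on literature
# authors because they are derived from the linked author record.
DYNAMIC_ID_SCHEMAS = {"INSPIRE BAI"}


def filter_author_ids(ids):
    """Return the identifiers that are safe to store on a literature author.

    `ids` is assumed to be a list of dicts such as
    {"schema": "ORCID", "value": "0000-0000-0000-0000"}.
    """
    return [
        identifier
        for identifier in ids
        if identifier.get("schema") not in DYNAMIC_ID_SCHEMAS
    ]
```

With such a filter in place, an ORCID found in author.xml would still be added, while a BAI would be dropped before it reaches the literature record.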
The consequence is that we have hardcoded BAIs in literature records, which get out of sync with the linked author record when its BAI changes. Example: https://inspirehep.net/literature?sort=mostrecent&size=25&page=1&q=a%20Michele%20Selvaggi%20and%20a%20M.Selvaggi.1. These should all have BAI Michele.Selvaggi.1 instead of M.Selvaggi.1, but don't because of the hardcoding.
We should fix the bug and run the script in https://github.com/inspirehep/curation-scripts/blob/master/scripts/remove-bai-from-lit-authors/script.py again to fix existing records.
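For reference, the cleanup of an existing record could look roughly like the sketch below. This is not the content of the linked curation script; it assumes a literature record is a dict with an "authors" list whose entries may carry an "ids" list of {"schema", "value"} dicts, and the function name remove_bais_from_authors is hypothetical.

```python
# Hypothetical sketch, not the linked curation script.

def remove_bais_from_authors(record):
    """Strip hardcoded BAIs from every author of a literature record.

    Modifies `record` in place and returns True if anything was removed.
    """
    changed = False
    for author in record.get("authors", []):
        ids = author.get("ids")
        if not ids:
            continue
        kept = [i for i in ids if i.get("schema") != "INSPIRE BAI"]
        if len(kept) != len(ids):
            changed = True
            if kept:
                author["ids"] = kept
            else:
                # Assumption: an empty "ids" list is better dropped
                # entirely than stored as [].
                del author["ids"]
    return changed
```

Returning a flag makes it possible to save only the records that were actually modified.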