This script reads a list of DOIs from list.txt, fetches metadata from the CrossRef API, and checks whether those papers exist in the Poseidon archives (community-archive, aadr-archive, minotaur-archive). It then generates an HTML table (index.html) displaying:
✔ Paper title
✔ Publication year & exact date
✔ First author’s name
✔ Journal name
✔ Availability in Poseidon archives (✔ or ✘)
✔ A search bar for filtering by title
✔ Dropdown filters for the archives
Every time list.txt is updated and a commit is pushed, this script runs and updates index.html on GitHub Pages.
- Python Version: Python 3.x
- Required Libraries:
pip install requests Jinja2
- Files:
list.txt → List of DOIs (one per line)
base_script.py → The main script
index.html → The generated output file
Fetches metadata from CrossRef API.
Extracts title, year, journal, date, first author’s name.
Formats publication date into YYYY-MM-DD.
Prints progress updates like:
(1 / 100) Querying metadata for 10.1002/ajpa.23312
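The metadata step above can be sketched roughly as follows. This is a minimal illustration, not the code in base_script.py: the function names (`format_date`, `parse_message`, `fetch_metadata`) are hypothetical, the standard-library `urllib` stands in for `requests` so the sketch has no dependencies, and padding a missing month or day with `01` is an assumption about how partial dates are handled. The field names (`title`, `container-title`, `author`, `issued`, `date-parts`) follow the CrossRef works response.

```python
import json
import urllib.parse
import urllib.request


def format_date(date_parts):
    """Turn CrossRef date-parts such as [2017, 9, 5] into 'YYYY-MM-DD'."""
    if not date_parts or date_parts[0] is None:
        return ""
    # Assumption: a missing month or day falls back to 1.
    year, month, day = (list(date_parts) + [1, 1])[:3]
    return f"{year:04d}-{month:02d}-{day:02d}"


def parse_message(message):
    """Extract the table fields from the 'message' object of a CrossRef response."""
    titles = message.get("title") or [""]
    journals = message.get("container-title") or [""]
    first = (message.get("author") or [{}])[0]
    name = " ".join(p for p in (first.get("given"), first.get("family")) if p)
    parts = (message.get("issued", {}).get("date-parts") or [[]])[0]
    date = format_date(parts)
    return {
        "title": titles[0],
        "year": date[:4],
        "date": date,
        "first_author": name,
        "journal": journals[0],
    }


def fetch_metadata(doi, index, total):
    """Query CrossRef for one DOI and print a progress line first."""
    print(f"({index} / {total}) Querying metadata for {doi}")
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_message(json.load(response)["message"])
```

Keeping the parsing separate from the network call makes it possible to check the field extraction against a saved response without contacting CrossRef.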
Calls Poseidon API to check available DOIs for a given archive.
Extracts the DOI list from community-archive, aadr-archive, and minotaur-archive.
Prints status messages while fetching:
Fetching DOI data from community-archive...
Collects all available DOIs from all Poseidon archives.
Stores data in a dictionary mapping DOIs → available archives.
Cleans up the DOI format by removing extra spaces and the "https://doi.org/" prefix.
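A rough sketch of the clean-up and mapping described above, assuming the per-archive DOI lists have already been fetched. The names `clean_doi` and `build_availability` and the shape of the input dictionary are hypothetical; the actual script may organize this differently.

```python
def clean_doi(raw):
    """Strip surrounding whitespace and a leading https://doi.org/ prefix."""
    doi = raw.strip()
    prefix = "https://doi.org/"
    if doi.lower().startswith(prefix):
        doi = doi[len(prefix):]
    return doi.strip()


def build_availability(dois_by_archive):
    """Map each cleaned DOI to the list of archives that contain it."""
    availability = {}
    for archive, dois in dois_by_archive.items():
        for raw in dois:
            doi = clean_doi(raw)
            # Skip empty entries and avoid listing an archive twice.
            if doi and archive not in availability.setdefault(doi, []):
                availability[doi].append(archive)
    return availability
```

With this mapping, checking whether a paper from list.txt is in a given archive becomes a dictionary lookup followed by a membership test.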
Checks list.txt for duplicate DOIs.
If duplicates are found, it prints a warning:
WARNING: Duplicate DOIs found:
- 10.1002/ajpa.23312
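The duplicate check could look something like the sketch below. The function names are hypothetical, and it assumes that blank lines in list.txt are ignored and that DOIs are compared exactly as written after stripping whitespace.

```python
from collections import Counter


def find_duplicates(lines):
    """Return every DOI that appears more than once, in first-seen order."""
    dois = [line.strip() for line in lines if line.strip()]
    counts = Counter(dois)
    return [doi for doi, n in counts.items() if n > 1]


def warn_duplicates(lines):
    """Print a warning in the format shown above if duplicates exist."""
    duplicates = find_duplicates(lines)
    if duplicates:
        print("WARNING: Duplicate DOIs found:")
        for doi in duplicates:
            print(f"- {doi}")
    return duplicates
```

A typical call would pass the lines of the input file, for example `warn_duplicates(open("list.txt").readlines())`.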
Creates index.html using a Jinja2 template.
Adds search bar to filter by title.
Adds dropdown filters to show/hide papers based on Poseidon archive availability.
Formats clickable DOI links like this:
<a href="https://doi.org/10.1002/ajpa.23312">10.1002/ajpa.23312</a>
Prints progress while updating:
Updating index.html...
index.html successfully updated!
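The script builds the page with a Jinja2 template; the sketch below only illustrates the DOI link format shown above, using the standard library so it runs without dependencies. The helper name `doi_link` is hypothetical. Escaping matters here because some older DOIs contain characters such as `<` and `>`.

```python
from html import escape


def doi_link(doi):
    """Return an HTML anchor pointing at the doi.org resolver for this DOI."""
    url = "https://doi.org/" + doi
    # Escape both the attribute value and the visible text.
    return f'<a href="{escape(url, quote=True)}">{escape(doi)}</a>'
```

In a Jinja2 template, autoescaping would normally take care of the same concern.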
- Add DOIs to list.txt (one per line).
- Run the script:
python base_script.py
- Open index.html to see the results!
This is a fully automated workflow that updates the table and deploys it to GitHub Pages whenever list.txt changes.
A GitHub Actions workflow runs everything behind the scenes. No manual updates needed!