simoistgray/ResearchPaperPDFs

Research Paper PDFs

Overview

This project was built for a commission: the client wanted a program that scrapes volumes of research papers and extracts information from their PDFs, including authors, publication dates, abstracts, and institutions.

Features

  • Scrapes metadata (title, volume, issue, year) of research papers from JSTOR.
  • Extracts and stores abstracts from research papers.
  • Downloads PDF files associated with each paper.
  • Automatically handles different browsing sessions (e.g., Chrome, Edge) to dodge bot detection.
  • Saves gathered papers and abstracts in a pickle file.
  • Allows for resuming and updating the scraping process.
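
The save-and-resume behaviour described in the last two features could look roughly like the sketch below. It is not taken from the repository's code: the function names (`load_papers`, `save_papers`, `update_papers`), the file name `papers.pkl`, and the choice to key records by title are all assumptions made for illustration.

```python
import pickle
from pathlib import Path
from typing import Dict, Iterable


def load_papers(path: Path) -> Dict[str, dict]:
    """Return previously saved papers, or an empty dict on a first run."""
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    return {}


def save_papers(papers: Dict[str, dict], path: Path) -> None:
    """Write to a temporary file first, then swap it into place,
    so an interrupted run does not leave a half-written store."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("wb") as f:
        pickle.dump(papers, f)
    tmp.replace(path)


def update_papers(scraped: Iterable[dict], path: Path) -> int:
    """Add papers not yet stored (keyed by title, an assumed convention)
    and return how many were new."""
    papers = load_papers(path)
    added = 0
    for paper in scraped:
        if paper["title"] not in papers:
            papers[paper["title"]] = paper
            added += 1
    save_papers(papers, path)
    return added


if __name__ == "__main__":
    store = Path("papers.pkl")  # hypothetical file name
    new = update_papers(
        [{"title": "Example paper", "volume": "12", "issue": "3", "year": 1998}],
        store,
    )
    print(f"{new} new paper(s) saved, {len(load_papers(store))} in total")
```

Running the script a second time adds nothing new, which is the resume behaviour in miniature. Pickle files can execute arbitrary code when loaded, so only load files the scraper itself wrote.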

License

This project is open-source under the MIT License.

Author

Simon Gray
