A simple Python script that scrapes web pages for PDF files and downloads them to a local directory.
- Clone this repository.
- Install Python.
- Install Pip.
- Install the required packages by running `pip install -r requirements.txt` in your terminal.
- Set the web page URL and output folder location in the `main.py` file here:
# Define your URL
url = "https://yourWebsiteURL"
# By default, the script will download PDF files to the downloads folder.
# You can change the folder location by updating the folder_location variable.
# Example: folder_location = r'/Users/yourname/Documents'
folder_location = r'./downloads'
- Run the script with `python main.py`.
- PDF files will be downloaded to the folder you specified (`./downloads` by default).
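For reference, the core of a script like this looks roughly like the sketch below. This is a minimal sketch, not the exact contents of `main.py`, and it assumes `requests` and `BeautifulSoup` (from the `beautifulsoup4` package) are the dependencies listed in `requirements.txt`:

import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Same variables as in main.py
url = "https://yourWebsiteURL"
folder_location = r'./downloads'

# Create the output folder if it does not already exist
os.makedirs(folder_location, exist_ok=True)

# Fetch the page and collect every link that ends in .pdf
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

for link in soup.select("a[href$='.pdf']"):
    # Resolve relative links against the page URL
    pdf_url = urljoin(url, link["href"])
    filename = os.path.join(folder_location, os.path.basename(pdf_url))

    # Download each PDF and write it to the output folder
    with open(filename, "wb") as f:
        f.write(requests.get(pdf_url).content)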
Important
This tool is not intended to break copyright laws and is for personal use only. It merely automates the retrieval of publicly available data using standard web scraping techniques. The copyright of the data retrieved belongs to its respective owners, and I am not responsible for any illegal redistribution or misuse of data obtained using this tool.
Caution
Use of this tool is at your own risk. By using this tool, you agree that you are solely responsible for any legal issues that may arise from your use of this tool.
This project is released under the terms of The Unlicense, which allows you to use, modify, and distribute the code as you see fit.
- The Unlicense removes traditional copyright restrictions, giving you the freedom to use the code in any way you choose.
- For more details, see the LICENSE file in this repository.
Author: Scott Grivner
Email: scott.grivner@gmail.com
Website: scottgrivner.dev
Reference: Main Branch