This script automatically downloads public comments (and attachments) that were submitted in response to U.S. government Requests for Information (RFIs). The script downloads all comments and attachments for any docket on the U.S. government's regulations.gov website. This is a particularly useful tool for anyone who wants to analyze RFIs that receive hundreds or thousands of comments (and attachments). A similar project (excellently written) already exists, but my project adds two key functions:
- It keeps track of which comments/attachments you've already downloaded. So if you exceed your API limit (1,000 calls per hour) in the middle of downloading a docket, it will save your partial progress. All you need to do is wait an hour for your API calls to reset and then run the script again! It'll identify which comments you've already downloaded and skip over those, preserving your API calls for the comments that you haven't yet downloaded.
- It downloads all attachments associated with the RFI. The other project does obtain the links to each attachment but doesn't actually download them for you. This is an issue if (like me) you want to create a folder on your computer where you can sift through the comments and attachments in an organized way; you need to download the attachments, not just find the links.
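
The "skip what you've already downloaded" idea can be sketched roughly as follows. This is only an illustration, not the script's actual code: it assumes progress is recorded in a CSV file with a hypothetical `comment_id` column, and the function names are made up for this example.

```python
import csv
from pathlib import Path


def load_downloaded_ids(csv_path):
    """Return the set of comment IDs already recorded in the progress CSV.

    The "comment_id" column name is an assumption made for this sketch.
    """
    path = Path(csv_path)
    if not path.exists():
        # First run: nothing has been downloaded yet.
        return set()
    with path.open(newline="", encoding="utf-8") as f:
        return {row["comment_id"] for row in csv.DictReader(f)}


def pending_ids(all_ids, csv_path):
    """Keep only the IDs that still need an API call, preserving order."""
    done = load_downloaded_ids(csv_path)
    return [cid for cid in all_ids if cid not in done]
```

Because only the pending IDs are requested on a re-run, no API calls are spent on comments that are already on disk.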
So how exactly does this script work? It's pretty simple!
- You provide the script with a docket ID from regulations.gov (ex: BIS-2021-0036). The script will create a folder for that docket ID; everything that the script downloads will go in this folder.
- The script will identify every document and comment associated with the provided docket, and it will store the most important information in a handy CSV file.
- The script will create a sub-folder for each comment/document that had at least one attachment. The script will then automatically download each attachment and place it in the assigned sub-folder.
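
The folder layout described above could be built along these lines. This is a hedged sketch rather than the script's real implementation: the helper names `attachment_dir` and `save_attachment` are hypothetical, and the attachment bytes are assumed to have been fetched already.

```python
from pathlib import Path


def attachment_dir(download_root, docket_id, item_id):
    """Create (if needed) and return <download_root>/<docket_id>/<item_id>/."""
    folder = Path(download_root) / docket_id / item_id
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def save_attachment(download_root, docket_id, item_id, filename, content):
    """Write one attachment's bytes into the sub-folder for its comment/document."""
    # Path(filename).name drops any directory components from the file name,
    # so the file always lands inside the item's own sub-folder.
    target = attachment_dir(download_root, docket_id, item_id) / Path(filename).name
    target.write_bytes(content)
    return target
```

For example, `save_attachment("/path/to/folder/", "BIS-2021-0036", "BIS-2021-0036-0001", "letter.pdf", data)` would place the file under the docket's folder in a sub-folder named after the (illustrative) comment ID.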
To see what the output looks like, check out the two examples in the "Examples" folder.
- Ensure that you've installed the "pandas" package. If you need to install pandas, run the following command in your command-line:
pip install pandas
- Download the script:
regulations_comments_downloader.py
- Obtain an API key on the data.gov website.
- Replace two values in the script (near the top of the file):
  - Set API key: Replace `DEMO_KEY` with your personal API key.
  - Set download directory: Replace `/path/to/folder/` with the path to the directory where you'd like the script to download files. The script will take care of organizing the files!
- In your command-line, navigate to the folder where you downloaded/edited the script.
- Run the script, passing the docket ID that you want to download as an argument. Ex:
python3 regulations_comments_downloader.py "BIS-2021-0036"
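
For readers curious how a script can pick up the docket ID from a command like the one above, here is a minimal sketch using Python's standard `argparse` module. It is an assumption for illustration only; the actual script may read its argument differently (for example, directly from `sys.argv`).

```python
import argparse


def parse_args(argv=None):
    """Parse the single positional docket ID argument."""
    parser = argparse.ArgumentParser(
        description="Download comments and attachments for one regulations.gov docket."
    )
    parser.add_argument("docket_id", help="Docket ID, e.g. BIS-2021-0036")
    # argv=None makes argparse read the real command line (sys.argv[1:]).
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Downloading docket {args.docket_id}")
```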
- Getting a "Filename too long" error on Windows when cloning this Git repository? Follow these directions.
- New to Python? Here's how to run a Python script on Windows and Mac.
Note: as mentioned above, if you exceed your API limit (1,000 calls per hour) in the middle of downloading a docket, the script will save your partial progress. Wait an hour for your API calls to reset, then run the script again! It'll skip the comments you've already downloaded, preserving your API calls for the ones you haven't.
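
The stop-and-keep-progress behavior in this note can be pictured with a sketch like the one below. Everything in it is hypothetical: `RateLimitReached`, `fetch`, and `record` are stand-ins invented for this example, not names from the script or from the regulations.gov API.

```python
class RateLimitReached(Exception):
    """Hypothetical signal that the hourly API quota is used up."""


def download_until_limited(ids, fetch, record):
    """Fetch each ID in order, recording every result as soon as it arrives.

    Stops cleanly when `fetch` raises RateLimitReached, so everything recorded
    before that point is kept. Returns the number of items completed.
    """
    completed = 0
    for cid in ids:
        try:
            data = fetch(cid)
        except RateLimitReached:
            break  # quota exhausted: keep what we have and stop
        record(cid, data)  # persist immediately so progress is never lost
        completed += 1
    return completed
```

Recording each item right after it is fetched, rather than at the end of the run, is what makes the partial progress survive an interruption.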
MIT License. Copyright (c) 2021 Jacob A. Feldgoise