
Data scrape of annual Resumes of Congressional Activity from the US Congress


tamimcm416/congressional_data_scrape


Congressional Data Scrape: 98th through 117th Congresses

This project scrapes data from the annual Resumes of Congressional Activity, which are available in PDF format.


Data Sources

| Source | Description | Link |
| --- | --- | --- |
| US Senate Web Site | Resumes of Congressional Activity, Session Dates | US Senate Web Site |

Technical Details

Data was scraped from the PDFs using tabula, formatted in Excel using VBA, and tidied in JupyterLab using Python.

| Tool / Library | Version |
| --- | --- |
| Adobe Acrobat Pro | 2024.001.20629 |
| JupyterLab | 4.1.2 |
| Microsoft Office 365, Excel | 2403 |
| Microsoft Visual Basic for Applications | 7.1 |
| Python | 3.12.2 |
| tabula | 1.2.1 |
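
The notebooks in the code folder hold the actual tidying logic, which is not reproduced here. As a rough illustration only, the sketch below shows what "tidying" a scraped table can look like: a wide table with one column per chamber is reshaped into one record per measure and chamber. The column names, the `tidy` helper, and every number in `RAW` are hypothetical placeholders, not values or code from this project.

```python
import csv
import io

# Hypothetical extract: the layout and the numbers are placeholders, not real figures.
RAW = '''Measure,Senate,House
Days in session,"1,234",
Measures passed,567,890
'''


def tidy(raw_csv, congress, session):
    """Melt a wide measure-by-chamber table into one record per measure and chamber."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        measure = row.pop("Measure").strip()
        for chamber, cell in row.items():
            # Drop thousands separators; blank or non-numeric cells become None.
            text = (cell or "").replace(",", "").strip()
            records.append({
                "congress": congress,
                "session": session,
                "measure": measure,
                "chamber": chamber,
                "value": int(text) if text.isdigit() else None,
            })
    return records


for record in tidy(RAW, congress=98, session=1):
    print(record)
```

Keeping blank cells as `None` rather than zero is a deliberate choice in this sketch: it keeps "no value was scraped" distinct from "the value is zero", which matters when checking data integrity later.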

File Descriptions

| Name | Description |
| --- | --- |
| data | Folder containing original data files and scrubbed output |
| code | Folder containing Jupyter notebooks and VBA exports |
| documentation | Folder containing test results and data integrity issues |
| Data Scrape and Validation Presentation | PowerPoint recap of project and findings, saved as PDF |

Licenses

| Asset | License / Use Policy |
| --- | --- |
| Original Code | MIT License |
| Congressional Activity | Federal Open Data Policy |
