Commit

updated readme
roumail committed Oct 19, 2023
1 parent 8264f37 commit 49210aa
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -1,9 +1,11 @@
 # immoweb-scraper
-immoweb-scraper is a Python-based tool designed to scrape property listings from the Immoweb website. The idea is to use the scraped data as a way to experiment with modeling exercises and as a way to experiment with different scraping, data engineering workflows. This codebase uses Prefect to schedule regular scraping scraping tasks. The results of the scraping are added to an sqlite database. Eventually, we will add tasks to make modelling views of the collected data and make backups of the database.
+immoweb-scraper is a Python-based tool designed to scrape property listings from the Immoweb website. The idea is to use the scraped data as a way to experiment with modeling exercises and as a way to experiment with different scraping, data engineering workflows. The results of the scraping are added to an sqlite database. The dependencies for this project are managed via `poetry`.

-## Architecture Overview
+## Future plans

-We use a python package to separate components for separating concerns into database connections, scraping logic, URL generation, and browser setup. The dependencies for this project are managed via `poetry`.
+* Scheduler to run the scraping tasks and accumulate data slowly over time.
+* Dockerize the analysis
+* Investigate decoupled architecture of microservices


 ## Usage