Querido Diário has a Guide for Contributing that is relevant to all of its repositories. The guide provides general information about how to interact with the project, the code of conduct you adhere to when contributing, the list of ecosystem repositories and the first actions you can take. We recommend reading it before continuing.
Already read it? Then let's move on to the information specific to this repository:
The main challenge of this repository is to keep adding scrapers for websites that publish official gazettes, with the goal of covering all 5570 Brazilian municipalities. We use the City Expansion Board to track progress on this challenge. Consult it to find relevant tasks you can contribute to.
To help you develop, follow the guidelines on how to write a new scraper, available in Querido Diário's technical documentation.
Scrapers are developed using Python and the Scrapy framework. You can check how to install Python on your operating system and learn more about Scrapy in this tutorial. With Python installed on your computer, follow the development environment setup step by step:
- Fork this repository. Then, with a terminal open in a directory of your choice, clone your fork and enter the newly created directory, which is named after the repository.
git clone <repository_fork>
cd querido-diario
- Create a new virtual environment which will keep the project isolated from your system.
python3 -m venv .venv
- Activate the newly created virtual environment.
source .venv/bin/activate
- Install the required libraries.
pip install -r data_collection/requirements-dev.txt
- Install pre-commit, a tool that checks whether your code meets the project's standards when you commit.
pre-commit install
- Your development environment is ready! 🎉
Attention: These steps only need to be executed once, when you first set up the environment. After that, just activate the virtual environment (step 3) every time you use or contribute to the repository.
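Since activation is the one step repeated in every session, it can help to confirm that it took effect before running anything else. The sketch below relies on the `VIRTUAL_ENV` variable that the venv `activate` script exports; `venv_status` is a hypothetical helper name and is not part of the project.

```shell
# Sketch: report whether a virtual environment is active in the current shell.
# `venv_status` is a hypothetical helper, not something the repository provides.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    # The activate script exports VIRTUAL_ENV with the environment's path
    echo "active: ${VIRTUAL_ENV}"
    return 0
  fi
  echo "inactive: run 'source .venv/bin/activate' first"
  return 1
}
```

Calling `venv_status` after `source .venv/bin/activate` should print the path to the `.venv` directory. In a fresh terminal it prints the reminder and returns a non-zero status.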
The following instructions were tested on Windows 10.
- Install Microsoft Visual Build Tools. When starting the installation, you need to select C++ build tools in the Workloads tab, and also Windows 10 SDK and MSVC v142 - VS 2019 C++ x64/x86 build tools in the Individual components tab.
- Follow all the steps used on Linux, except for step 3. For that step, the command should be:
.venv/Scripts/activate.bat
Note: In Windows commands, the direction of the slash (/ or \) may vary depending on whether you use WSL.
The project uses Black as an automated tool to format and check code style, and isort to sort imports. CI will fail if your code is not formatted according to these tools.
If you followed the setup instructions and installed the pre-commit hooks, you may never need to run these tools manually, as they will be executed before each commit. However, if you want to run them on all files in the project, use the make format command, which calls these tools.
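To see what CI would complain about without rewriting any files, both tools can also be run in check mode. This is a minimal sketch, assuming Black and isort are installed in the active virtual environment. `check_style` is a hypothetical helper, and the arguments that make format actually passes are not shown in this guide.

```shell
# Sketch: run both style checks without modifying files.
# `check_style` is a hypothetical helper, not a command provided by the project.
check_style() {
  target="${1:-.}"   # directory to check; defaults to the current directory
  rc=0
  # --check makes Black report files it would reformat instead of rewriting them
  black --check "$target" || rc=1
  # --check-only makes isort report unsorted imports instead of rewriting them
  isort --check-only "$target" || rc=1
  return "$rc"
}
```

For example, `check_style data_collection` returns a non-zero status if either tool finds a file it would change.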
Maintainers must follow the guidelines in Querido Diário's Guide for Maintainers.