This is a Singer tap that produces JSON-formatted data from the GitHub API following the Singer spec.
This tap:
- Pulls raw data from the GitHub REST API
- Extracts the following resources from GitHub for a single repository:
- Outputs the schema for each resource
- Incrementally pulls data based on the input state
- Install
  We recommend using a virtualenv:
  > virtualenv -p python3 venv
  > source venv/bin/activate
  > pip install tap-github
- Create a GitHub access token
  Log in to your GitHub account, go to the Personal Access Tokens settings page, and generate a new token with at least the repo scope. Save this access token; you'll need it for the next step.
- Create the config file
  Create a JSON file containing the access token you just created and the path to the repository. The repo path is relative to https://github.com/. For example, the path for this repository is singer-io/tap-github.
  {"access_token": "your-access-token", "repository": "singer-io/tap-github"}
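The config file can also be generated from a script. The following is a minimal sketch, not part of the tap itself: `write_config` is a hypothetical helper name, and stripping a leading `https://github.com/` is an assumed convenience so that a full URL can be passed in by mistake without breaking the relative path the tap expects.

```python
import json
from pathlib import Path

GITHUB_PREFIX = "https://github.com/"


def write_config(path, access_token, repository):
    """Write a config file with the two keys shown above (hypothetical helper)."""
    # The repository must be relative to https://github.com/,
    # e.g. "singer-io/tap-github", so drop the prefix if it was included.
    if repository.startswith(GITHUB_PREFIX):
        repository = repository[len(GITHUB_PREFIX):]
    config = {
        "access_token": access_token,
        "repository": repository.strip("/"),
    }
    Path(path).write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `write_config("config.json", "your-access-token", "singer-io/tap-github")` writes a file equivalent to the one shown above.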
- Run the tap in discovery mode to get the properties.json file
  tap-github --config config.json --discover > properties.json
- In the properties.json file, select the streams to sync
  Each stream in the properties.json file has a "schema" entry. To select a stream to sync, add "selected": true to that stream's "schema" entry. For example, to sync the pull_requests stream:
  ...
  "tap_stream_id": "pull_requests",
  "schema": {
    "selected": true,
    "properties": {
      "updated_at": {
        "format": "date-time",
        "type": [
          "null",
          "string"
        ]
      }
  ...
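Editing properties.json by hand works, but the same change can be scripted. The sketch below assumes the file has a top-level "streams" list whose items carry "tap_stream_id" and "schema" keys; the "streams" key follows the usual Singer catalog layout and is not shown in the excerpt above, so check it against your own file. The function names are hypothetical.

```python
import json


def select_stream(catalog, stream_id):
    """Set "selected": true in the "schema" entry of one stream.

    Returns True if a stream with that tap_stream_id was found.
    """
    # Assumption: streams live under a top-level "streams" list.
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") == stream_id:
            stream.setdefault("schema", {})["selected"] = True
            return True
    return False


def select_stream_in_file(path, stream_id):
    """Load a properties file, select one stream, and write it back."""
    with open(path) as f:
        catalog = json.load(f)
    found = select_stream(catalog, stream_id)
    if found:
        # Only rewrite the file when something actually changed.
        with open(path, "w") as f:
            json.dump(catalog, f, indent=2)
    return found
```

Calling `select_stream_in_file("properties.json", "pull_requests")` would mark the pull_requests stream for syncing and leave every other stream untouched.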
- Run the application
  tap-github can be run with:
  tap-github --config config.json --properties properties.json
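Because the tap follows the Singer spec, its output is a stream of JSON messages, one per line, each with a "type" of SCHEMA, RECORD, or STATE. As a rough illustration of what a consumer of that output might do, the sketch below counts RECORD messages per stream and keeps the last STATE value seen. `summarize_messages` is a hypothetical name, and the content of the STATE value is treated as opaque because this document does not describe it.

```python
import json
from collections import Counter


def summarize_messages(lines):
    """Count RECORD messages per stream and return the last STATE value."""
    record_counts = Counter()
    last_state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        message_type = message.get("type")
        if message_type == "RECORD":
            record_counts[message["stream"]] += 1
        elif message_type == "STATE":
            # Later STATE messages supersede earlier ones.
            last_state = message["value"]
    return dict(record_counts), last_state
```

One way to use it is to pipe the tap's output into a small script that calls `summarize_messages(sys.stdin)`. In practice a Singer target normally plays this role; the function is only meant to show the shape of the messages.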
Copyright © 2018 Stitch