WARNING: on 2018-09-23 at 15:40 UTC we force-pushed a new history without the database in order to reduce the size of this repo. See #5.
ghuser.io's database scripts
This repository provides scripts to update the database for the
ghuser.io Reframe app. The database consists of JSON
files. The production data is stored on
AWS. The scripts expect it at ~/data.
This can be changed here.
The fetchBot calls these scripts. It runs daily on an EC2 instance.
API keys can be created here.
$ npm install
Start tracking a user
$ ./addUser.js USER
Stop tracking a user
$ ./rmUser.js USER "you asked us to remove your profile in https://github.com/ghuser-io/ghuser.io/issues/666"
Refresh and clean data for all tracked users
$ export GITHUB_CLIENT_ID=0123456789abcdef0123
$ export GITHUB_CLIENT_SECRET=0123456789abcdef0123456789abcdef01234567
$ export GITHUB_USERNAME=AurelienLourot
$ export GITHUB_PASSWORD=********
$ ./fetchAndCalculateAll.sh
GitHub API key found.
GitHub credentials found.
...
/home/ubuntu/data/users
921 users
largest: tarsius.json (22 KB)
total: 2057 KB
/home/ubuntu/data/contribs
largest: tarsius.json (113 KB)
total: 5 MB
/home/ubuntu/data/repos
41502 repos
25616 significant repos
largest: jlord/patchwork.json (708 KB)
total: 86 MB
/home/ubuntu/data/repoCommits
largest: CocoaPods/Specs.json (3928 KB)
total: 211 MB
/home/ubuntu/data/orgs.json: 1900 KB
/home/ubuntu/data/nonOrgs.json: 104 KB
/home/ubuntu/data/meta.json: 48 B
total: 306 MB
=> 341 KB/user
real 150m37.837s
user 10m51.248s
sys 0m50.716s
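The per-directory lines in the sample output above (file count, largest file, total size) can be reproduced with a few lines of Node.js. The following is only a minimal sketch and not the code the scripts actually use; the function names `listJsonFiles` and `summarizeDir` are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect every *.json file under `dir` with its size in bytes.
// (Hypothetical helper, not part of the real scripts.)
function listJsonFiles(dir) {
  const files = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...listJsonFiles(fullPath));
    } else if (entry.name.endsWith('.json')) {
      files.push({ file: fullPath, bytes: fs.statSync(fullPath).size });
    }
  }
  return files;
}

// Summarize a directory: number of JSON files, largest file, total size.
function summarizeDir(dir) {
  const files = listJsonFiles(dir);
  const totalBytes = files.reduce((sum, f) => sum + f.bytes, 0);
  const largest = files.reduce(
    (max, f) => (max === null || f.bytes > max.bytes ? f : max),
    null
  );
  return {
    count: files.length,
    // Path relative to `dir`, e.g. "owner/repo.json" for nested layouts.
    largest: largest ? path.relative(dir, largest.file) : null,
    largestBytes: largest ? largest.bytes : 0,
    totalBytes,
  };
}
```

Calling `summarizeDir` on a directory such as `~/data/repos` (with `~` expanded by the caller) would return the figures needed for one block of the report.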
Several scripts form a pipeline for updating the database. Here is the data flow:
[ ./addUser.js myUser ] [ ./rmUser.js myUser ]
│ │
v v
┌───────────────────┐
│ users/myuser.json │<───────────┐
└────────────────┬──┘ │─┐ │
└──────────────│────┘ │ │ ╔════════╗
└────┬───────│──────┘ │ ║ GitHub ║
│ │ │ ╚════╤═══╝
│ v │ │
│ [ ./fetchUserDetailsAndContribs.js myUser ]<──┤
│ │
├───────────────────────>[ ./fetchOrgs.js ]<──────┤
│ ^ │ │
│ │ │ │
│ v v │
│ ┌──────────────┐ ┌───────────┐ │
│ │ nonOrgs.json │ │ orgs.json │ │
│ └──────────────┘ └───┬───────┘ │
│ │ │
├──>[ ./fetchRepos.js ]<──────────────────────────┘
│ │ │
│ v │
│ ┌───────────────────────────┐ │
│ │ repo*/myOwner/myRepo.json │─┐ │
│ └───────────────────────────┘ │─┐ │
│ └───────────────────────────┘ │ │
│ └────┬──────────────────────┘ │
│ │ │
│ │ ┌───────────────┘
│ │ │
v v v
[ ./calculateContribsAndMeta.js ]
│ │
v v
┌──────────────────────┐ ┌───────────┐
│ contribs/myuser.json │─┐ │ meta.json │
└──────────────────────┘ │─┐ └───────────┘
└──────────────────────┘ │
└──────────────────────┘
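The diagram suggests a simple file layout: `./addUser.js myUser` ends up as `users/myuser.json`, so user file names appear to be lowercased, while repo files keep the owner's and repo's original case. The sketch below spells out that mapping. It is an assumption inferred from the diagram and the sample output (where `repo*` seems to stand for `repos` and `repoCommits`), and the helper names are hypothetical.

```javascript
const path = require('path');

// users/<login>.json, with the login lowercased (inferred from the diagram).
function userFile(dataDir, login) {
  return path.join(dataDir, 'users', `${login.toLowerCase()}.json`);
}

// contribs/<login>.json, using the same lowercasing assumption.
function contribsFile(dataDir, login) {
  return path.join(dataDir, 'contribs', `${login.toLowerCase()}.json`);
}

// repo*/<owner>/<repo>.json: assumed to expand to repos/ and repoCommits/.
function repoFiles(dataDir, fullName) {
  return ['repos', 'repoCommits'].map((dir) =>
    path.join(dataDir, dir, `${fullName}.json`)
  );
}
```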
NOTES:
- These scripts also delete unreferenced data.
- Instead of calling each of these scripts directly, you can call
./fetchAndCalculateAll.sh
which will orchestrate them.
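To illustrate the order implied by the data flow, here is a minimal Node.js sketch that runs the steps in sequence. It is not how `./fetchAndCalculateAll.sh` is implemented: the functions `buildPipeline` and `runPipeline` are hypothetical, the list of users is passed in by hand, and the relative order of `./fetchOrgs.js` and `./fetchRepos.js` is an assumption.

```javascript
const { execFileSync } = require('child_process');

// Build the ordered list of commands for one refresh run.
// `users` is a list of tracked logins supplied by the caller (an assumption;
// the real orchestrator presumably discovers them itself).
function buildPipeline(users) {
  return [
    // 1. Refresh each user's details and contributions from GitHub.
    ...users.map((user) => ['./fetchUserDetailsAndContribs.js', user]),
    // 2. Refresh orgs.json and nonOrgs.json.
    ['./fetchOrgs.js'],
    // 3. Refresh the repo files.
    ['./fetchRepos.js'],
    // 4. Derive contribs/*.json and meta.json from everything above.
    ['./calculateContribsAndMeta.js'],
  ];
}

// Run each step in order. execFileSync throws on a non-zero exit code,
// so a failing step stops the run. `run` can be swapped out for a dry run.
function runPipeline(users, run = execFileSync) {
  for (const [script, ...args] of buildPipeline(users)) {
    run(script, args, { stdio: 'inherit' });
  }
}
```

Passing a custom `run` function, for example one that only logs its arguments, gives a dry run that shows the planned commands without touching GitHub or the data directory.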
Thanks go to these wonderful people (emoji key):
| Aurelien Lourot 💬 💻 📖 👀 | Charles 💻 📖 🤔 | Romuald Brillout 🤔 |
| --- | --- | --- |
This project follows the all-contributors specification. Contributions of any kind welcome!