Skip to content

Refactor Github Scraper Python to TypeScript (GSoC 2024 Mid-Term Evaluation) #45

Refactor Github Scraper Python to TypeScript (GSoC 2024 Mid-Term Evaluation)

Refactor Github Scraper Python to TypeScript (GSoC 2024 Mid-Term Evaluation) #45

name: Scraper Dry Run & Validate Schema
on:
workflow_dispatch:
pull_request:
paths:
- scraper/**
- tests/**
- schemas/**
jobs:
test-run-github-scraper:
name: Test run GitHub Scraper
runs-on: ubuntu-latest
permissions:
issues: read
pull-requests: read
env:
DATA_REPO: ./
steps:
- uses: actions/checkout@v4
- name: setup python
uses: actions/setup-python@v3
with:
python-version: "3.10"
- name: Install dependencies
run: pip install -r scraper/requirements.txt
- name: Scrape data from GitHub
run: python scraper/src/github.py ${{ github.repository_owner }} data/github -l DEBUG
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- run: mkdir contributors
- name: Generate markdown files for new contributors
run: node scripts/generateNewContributors.js
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/upload-artifact@v4
with:
name: output
retention-days: 5
path: |
data
contributors
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9
run_install: false
- name: Setup Node.js environment
uses: actions/setup-node@v4
with:
node-version: 20
cache: "pnpm"
- name: Get pnpm store directory
shell: bash
run: |
echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
- uses: actions/cache@v4
name: Setup pnpm cache
with:
path: ${{ env.STORE_PATH }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Run Tests
run: pnpm test