Simple crawler

This nearly duplicated existing work. See https://github.com/kiang/landchg.tcd.gov.tw

Setup

pnpm install
pnpm exec playwright install chromium

Run Crawler

# Crawl the raw HTML first
# You might need to run this several times if the crawl hangs
# That's okay: progress is saved, so the same year/city is never crawled twice
pnpm exec playwright test tests/example.spec.ts --project chromium


# Then convert the raw html into JSON data
pnpm generate
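
The resume behaviour mentioned in the crawl comments (a year/city pair that has already been crawled is skipped on the next run) could be sketched roughly as below. This is only an illustration under stated assumptions, not the repository's actual code: the `Task` type, the `loadDone`, `markDone`, and `pending` helpers, and the JSON progress file layout are all hypothetical names introduced here.

```typescript
import * as fs from "node:fs";

// Hypothetical unit of work: one year/city combination to crawl.
type Task = { year: number; city: string };

// Key used to remember a finished task, e.g. "2020/city-a".
const keyOf = (t: Task): string => `${t.year}/${t.city}`;

// Read the set of finished keys; an absent file means nothing is done yet.
function loadDone(file: string): Set<string> {
  if (!fs.existsSync(file)) return new Set<string>();
  return new Set<string>(JSON.parse(fs.readFileSync(file, "utf8")) as string[]);
}

// Record a finished task and persist immediately, so a hang or crash
// later in the run does not lose this progress.
function markDone(file: string, done: Set<string>, t: Task): void {
  done.add(keyOf(t));
  fs.writeFileSync(file, JSON.stringify(Array.from(done)));
}

// Tasks that still need crawling on this run.
function pending(all: Task[], done: Set<string>): Task[] {
  return all.filter((t) => !done.has(keyOf(t)));
}
```

A crawl loop would call `pending(...)` once at startup, then `markDone(...)` after each year/city page has been saved. Rerunning after a hang then picks up only the remaining pairs.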