project-polymorph/web_downloader

downloader

This is part of the Chinese transgender digital archive project.

Scripts and results for searching and downloading webpages.

Search

  • puppeteer: search for webpages using Puppeteer.
  • serper: search for webpages using Serper.
  • googlecustom: search for webpages using the Google Custom Search JSON API.
  • google: search for webpages using the google Python library.
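
As an illustration of the googlecustom approach, here is a minimal sketch of how a request URL for the Google Custom Search JSON API could be assembled. It is not taken from this repository's scripts: the function name `build_search_url` and its arguments are hypothetical, and the API key and search engine ID are placeholders you would supply yourself.

```python
from urllib.parse import urlencode

# Public endpoint of the Google Custom Search JSON API.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1, num=10):
    """Build a request URL for one page of results (hypothetical helper).

    `num` is the page size; the API accepts values from 1 to 10.
    `start` is the 1-based index of the first result to return.
    """
    if not 1 <= num <= 10:
        raise ValueError("num must be between 1 and 10")
    params = {
        "key": api_key,   # placeholder: your API key
        "cx": engine_id,  # placeholder: your search engine ID
        "q": query,
        "start": start,
        "num": num,
    }
    return f"{ENDPOINT}?{urlencode(params)}"
```

Fetching that URL returns JSON whose result entries normally sit under an `items` list, each with a `link` field. Paging is done by increasing `start` in steps of `num`.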

Run ./gen_links to summarize all links into a YAML file.
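
The sketch below shows one way such a summary step could work: merging the URL lists from several search backends, de-duplicating them, and emitting YAML. It is an assumption for illustration only, not the actual logic of ./gen_links; the function names and the output layout (`url` and `sources` keys) are hypothetical. It uses only the standard library and relies on JSON string quoting being valid YAML.

```python
import json


def merge_links(results_by_source):
    """Merge per-source URL lists into {url: [sources...]}, dropping duplicates."""
    merged = {}
    for source, urls in results_by_source.items():
        for url in urls:
            url = url.strip()
            if not url:
                continue  # skip blank entries
            sources = merged.setdefault(url, [])
            if source not in sources:
                sources.append(source)
    return merged


def to_yaml(merged):
    """Render the merged mapping as a YAML list (hypothetical layout)."""
    if not merged:
        return "[]\n"
    lines = []
    for url, sources in merged.items():
        # JSON-style double-quoted strings are also valid YAML scalars.
        lines.append(f"- url: {json.dumps(url)}")
        quoted = ", ".join(json.dumps(s) for s in sources)
        lines.append(f"  sources: [{quoted}]")
    return "\n".join(lines) + "\n"
```

For example, `to_yaml(merge_links({"serper": [...], "google": [...]}))` produces one entry per unique URL, recording which backends found it.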

Download

See download.

Currently supports webpages and PDFs.
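
A downloader that handles both kinds of resource needs some way to tell them apart. The following is a minimal sketch of one possible rule, assuming the decision is based on the response's Content-Type header with the URL path as a fallback. The function `classify` is hypothetical and does not describe how the scripts in this repository actually decide.

```python
from urllib.parse import urlparse


def classify(url, content_type=None):
    """Return "pdf" or "webpage" for a resource (hypothetical helper).

    Prefers the Content-Type header when it is recognized; otherwise
    falls back to checking whether the URL path ends in ".pdf".
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime == "application/pdf":
            return "pdf"
        if mime in ("text/html", "application/xhtml+xml"):
            return "webpage"
    path = urlparse(url).path.lower()
    return "pdf" if path.endswith(".pdf") else "webpage"
```

Checking the header first matters because some servers deliver PDFs from URLs without a `.pdf` suffix, and some `.pdf` URLs resolve to an HTML viewer page.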

LICENSE

MIT