These are the Scrapy spiders for Gazette Machine. They run on Zyte, and the scraped gazette URLs are written to S3, from where Gazette Machine pulls them in.
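As a rough illustration, each spider yields the gazette URLs it finds, and the feed export writes them out. This is only a minimal sketch: the spider name, start URL, selector and item fields below are hypothetical, not taken from an existing spider in this repo.

```python
import scrapy


class ExampleGazetteSpider(scrapy.Spider):
    # Hypothetical example: name, start URL and selectors are illustrative only.
    name = "example-gazette"
    start_urls = ["https://example.gov/gazettes"]

    def parse(self, response):
        # Yield one item per gazette link; the configured feed export writes
        # these items out (in production, as CSV to S3).
        for href in response.css("a.gazette::attr(href)").getall():
            yield {
                "url": response.urljoin(href),
                "name": href.rsplit("/", 1)[-1],
            }
```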
To develop locally:
- clone this repo
- set up a virtualenv:
python3 -m venv env
- activate:
source env/bin/activate
- install dependencies:
pip install -r requirements.txt
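With the dependencies installed, a spider can be run locally for testing using Scrapy's standard commands. The spider name below is a placeholder; `scrapy list` shows the names actually defined in this repo:

scrapy list

scrapy crawl <spider-name> -o output.csv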
To deploy:
- Install the Scrapinghub command-line client with
pip install shub
- Run
shub deploy
- In Zyte, configure the spider's AWS and output settings (listed below), similar to the other spiders.
- In gazettemachine, update
settings.GM['SCRAPINGHUB_SPIDERS']
to include the new spider, if it should be run daily.
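For reference, a hedged sketch of what that gazettemachine setting might look like. The real structure lives in gazettemachine's settings module; this sketch assumes `SCRAPINGHUB_SPIDERS` is a plain list of spider names, and the names shown are made up.

```python
# In gazettemachine's settings module (a sketch only; assumes SCRAPINGHUB_SPIDERS
# is a list of spider names, so check the real structure in gazettemachine itself).
GM = {
    # ... other Gazette Machine settings ...
    "SCRAPINGHUB_SPIDERS": [
        "existing-gazette-spider",  # made-up name of an existing spider
        "new-gazette-spider",       # add the new spider here if it should run daily
    ],
}
```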
The Zyte settings to configure for each spider are:
- AWS_ACCESS_KEY_ID: from AWS
- AWS_SECRET_ACCESS_KEY: from AWS
- FEED_FORMAT: csv
- FEED_URI: s3://lawsafrica-gazettes-incoming/dropbox/<code>.csv
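For local testing, the same settings can be approximated in the project's Scrapy settings. This is only a sketch of the mapping (on Zyte the values are entered in the spider's settings UI), and reading the credentials from environment variables is an assumption made here, not something this repo necessarily does.

```python
import os

# Sketch of the equivalent Scrapy settings; on Zyte these are configured per
# spider in the UI rather than committed to the repo.
FEED_FORMAT = "csv"
# <code> is the per-spider placeholder in the S3 path, as shown above.
FEED_URI = "s3://lawsafrica-gazettes-incoming/dropbox/<code>.csv"

# Credentials come from AWS; reading them from the environment (rather than
# hard-coding them) is an assumption made for this sketch.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")
```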