JDSpider

git clone https://github.com/Dengqlbq/JDSpider.git

Edit the following files to match your environment:

ProjectStart/Test.py (redis configuration, keywords, page_count)
JDUrlsSpider/settings.py (redis configuration)
JDDetailSpider/settings.py (redis configuration, mysql configuration, DOWNLOAD_DELAY)
JDCommentSpider/settings.py (redis configuration, mysql configuration, DOWNLOAD_DELAY)
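
As a rough illustration of what these edits might look like, here is a minimal settings fragment. Only DOWNLOAD_DELAY is a setting named in this README; the REDIS_* names follow the scrapy-redis convention and the MYSQL_* names are purely hypothetical, so keep whatever names the project's own settings.py already uses.

```python
# Sketch of a settings.py fragment; every value below is a placeholder.

# Redis connection. REDIS_HOST / REDIS_PORT are the names used by
# scrapy-redis; this project may use different ones (assumption).
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379

# MySQL connection. These names are hypothetical; reuse the names
# already defined in the project's settings.py.
MYSQL_HOST = "127.0.0.1"
MYSQL_PORT = 3306
MYSQL_USER = "root"
MYSQL_PASSWORD = "change-me"
MYSQL_DB = "jd"

# Seconds Scrapy waits between consecutive requests to the same site.
DOWNLOAD_DELAY = 1.0
```

A larger DOWNLOAD_DELAY makes the crawl slower but puts less load on the target site.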

cd ProjectStart
python Test.py
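
The README does not describe what Test.py does internally. One plausible reading, given that it takes keywords and page_count plus a redis configuration, is that it builds search URLs and pushes them to Redis for JDUrlsSpider to consume. The sketch below works under that assumption; the URL pattern, the Redis key name, and both function names are hypothetical and not taken from the project.

```python
from urllib.parse import urlencode


def build_search_urls(keywords, page_count):
    """Return one search URL per (keyword, page) pair.

    The URL pattern is an assumption made for illustration only.
    """
    urls = []
    for keyword in keywords:
        for page in range(1, page_count + 1):
            query = urlencode({"keyword": keyword, "page": page})
            urls.append(f"https://search.jd.com/Search?{query}")
    return urls


def push_to_redis(urls, key="JDUrlsSpider:start_urls",
                  host="127.0.0.1", port=6379):
    """Push the URLs onto a Redis list (hypothetical key name)."""
    import redis  # third-party package: pip install redis

    client = redis.Redis(host=host, port=port)
    client.lpush(key, *urls)


# Example: two keywords, two pages each, gives four URLs.
seed_urls = build_search_urls(["laptop", "headphones"], 2)
```

Calling push_to_redis(seed_urls) would then queue the URLs, provided a Redis server is reachable at the given host and port.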

cd JDUrlsSpider
scrapy crawl JDUrlsSpider

cd JDDetailSpider
scrapy crawl JDDetailSpider
(This is a distributed crawler, so you can run more than one JDDetailSpider instance.)

cd JDCommentSpider
scrapy crawl JDCommentSpider
(This is a distributed crawler, so you can run more than one JDCommentSpider instance.)
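
Running several instances can be as simple as opening more terminals and repeating the command. For convenience, a small launcher along the following lines could start them from one script. This is only a sketch: the helper names are made up here, and it assumes the scrapy command is on your PATH and that each spider is started from its own project directory, as in the steps above.

```python
import subprocess


def worker_commands(spider_name, count):
    """Build `count` identical `scrapy crawl <spider_name>` commands."""
    return [["scrapy", "crawl", spider_name] for _ in range(count)]


def launch_workers(spider_name, count, cwd):
    """Start `count` crawler processes in directory `cwd`.

    Returns the Popen handles so the caller can wait on them.
    """
    return [
        subprocess.Popen(command, cwd=cwd)
        for command in worker_commands(spider_name, count)
    ]


# Usage (not executed here):
#   procs = launch_workers("JDCommentSpider", 3, cwd="JDCommentSpider")
#   for proc in procs:
#       proc.wait()
```

Each process is an independent crawler; they share work only through the Redis instance configured in settings.py.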

Note:

Before you run the project, make sure that you have created the MySQL tables it requires.
If you have not built a proxy_pool, disable the "ProxyMiddleware" in JDCommentSpider/settings.py
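
In Scrapy, a downloader middleware is disabled by mapping it to None in DOWNLOADER_MIDDLEWARES (commenting out its entry also works for a custom middleware). The fragment below is a sketch; the import path shown is an assumption, so use the path that already appears in the project's settings file.

```python
# JDCommentSpider/settings.py (fragment)
DOWNLOADER_MIDDLEWARES = {
    # Hypothetical import path; keep the one already in the file.
    # Mapping a middleware to None tells Scrapy not to load it.
    "JDCommentSpider.middlewares.ProxyMiddleware": None,
}
```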

Product detail and comment summary
Some comments

Full comment
