Commit a29e09d (1 parent: 375b6f6)
3 files changed: +24, -7003 lines

@@ -163,5 +163,8 @@ cython_debug/
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 .idea/

+# MyFiles
+books.json
+
 # End of https://www.toptal.com/developers/gitignore/api/python

 # Books-To-Scrape
-My approach to books.toscrape.com with Scrapy CrawlSpider
+My approach to books.toscrape.com with Scrapy CrawlSpider and ItemLoaders
+
+#### Requirements
+- python3
+- pip3
+- itemadapter==0.6.0
+- itemloaders==1.0.4
+- Scrapy==2.6.1
+
+#### Instructions
+Clone the repo and install the dependencies, which are listed in *requirements.txt*:
+```bash
+git clone https://github.com/egarcia2506/books-to-scrape
+cd books-to-scrape
+pip3 install -r requirements.txt
+```
+
+#### To run the Crawler and get the output in a JSON file
+```bash
+scrapy crawl book -o books.json
+```
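
After the crawl command above finishes, the resulting file can be sanity-checked with a few lines of standard-library Python. This is only a sketch: it assumes `books.json` holds a JSON array of item objects, which is what Scrapy's JSON feed export writes for a fresh output file. The helper name `summarize_items` is hypothetical and not part of the repository, and no particular item fields are assumed, since the diff does not show them.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_items(path):
    """Return (item_count, Counter of field names) for a JSON feed file.

    Hypothetical helper, not part of the repository. It assumes the file
    contains a single JSON array of objects.
    """
    items = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of items")
    # Count how many items carry each field, to spot fields that are
    # missing from some items.
    fields = Counter(key for item in items for key in item)
    return len(items), fields


if __name__ == "__main__":
    feed = Path("books.json")
    if feed.exists():
        count, fields = summarize_items(feed)
        print(f"{count} items")
        for name, seen in sorted(fields.items()):
            print(f"  {name}: present in {seen} items")
    else:
        print("books.json not found; run the crawl first")
```

A field that appears in fewer items than the total count is a hint that a selector in the spider did not match on some pages.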