Commit a29e09d

Readme update, ignore books.json
1 parent 375b6f6 commit a29e09d

3 files changed, +24 -7003 lines changed

.gitignore

+3
@@ -163,5 +163,8 @@ cython_debug/
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 .idea/
 
+# MyFiles
+books.json
+
 # End of https://www.toptal.com/developers/gitignore/api/python
 

README.md

+21 -1
@@ -1,2 +1,22 @@
 # Books-To-Scrape
-My approach to books.toscrape.com with Scrapy CrawlSpider
+My approach to books.toscrape.com with Scrapy CrawlSpider and ItemLoaders
+
+#### Requirements
+- python3
+- pip3
+- itemadapter==0.6.0
+- itemloaders==1.0.4
+- Scrapy==2.6.1
+
+#### Instructions
+Clone the repo and install dependencies, dependencies are available at *requirements.txt*
+```bash
+git clone https://github.com/egarcia2506/books-to-scrape
+cd books-to-scrape
+pip3 install -r requirements.txt
+```
+
+#### To run the Crawler and get the output in a JSON file
+```bash
+scrapy crawl book -o books.json
+```
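
Note on the run command: Scrapy's `-o` option appends to an existing output file, and appending to a JSON feed leaves it as invalid JSON (the `-O` option overwrites instead). Since `books.json` is now ignored and regenerated locally, a quick sanity check after each crawl may be useful. The sketch below is only an illustration: the helper name `load_books` is hypothetical and not part of this repository, and it assumes the feed is a single JSON array of item objects.

```python
import json
from pathlib import Path


def load_books(path="books.json"):
    """Load a Scrapy JSON feed and return its items as a list.

    Hypothetical helper (not part of the repository). Assumes the feed
    is one JSON array, which is what a single `-o books.json` run into
    a fresh file produces.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        items = json.loads(text)
    except json.JSONDecodeError as exc:
        # Typical cause: a second `-o` run appended another array
        # to a file that already existed.
        raise ValueError(f"{path} is not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError(f"{path} does not contain a JSON array")
    return items
```

Calling `load_books()` from the project root after a crawl returns the scraped items; a `ValueError` suggests deleting `books.json` (or running with `-O`) before crawling again.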
