This script extracts data from the website http://books.toscrape.com/ and generates several CSV files with the following information for each book (a minimal extraction sketch follows the list):
- product_page_url
- universal_product_code (upc)
- title
- price_including_tax
- price_excluding_tax
- number_available
- product_description
- category
- review_rating
- image_url
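Below is a minimal sketch of how these fields can be pulled from a single product page, assuming the requests and beautifulsoup4 libraries (which requirements.txt presumably provides) and the site's current markup. The example URL and selectors are illustrative, not taken from the script itself.

```python
# Illustrative sketch only, not the actual extract.py: fetch one product
# page and pull a few of the fields listed above. The URL and selectors
# are assumptions about books.toscrape.com's markup.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

product_page_url = "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
soup = BeautifulSoup(requests.get(product_page_url).text, "html.parser")

# The "Product Information" table holds the UPC and both prices.
info = {row.th.get_text(): row.td.get_text()
        for row in soup.select("table.table-striped tr")}

book = {
    "product_page_url": product_page_url,
    "universal_product_code": info["UPC"],
    "title": soup.h1.get_text(),
    "price_including_tax": info["Price (incl. tax)"],
    "price_excluding_tax": info["Price (excl. tax)"],
    # The first <img> on the page is assumed to be the cover; its src is relative.
    "image_url": urljoin(product_page_url, soup.img["src"]),
}
print(book)
```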
One CSV file per category will be created inside a folder named "csv_files".
Images will also be saved in a folder named "img", in a subfolder named after each book's category. A sketch of this layout is shown below.
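The following sketch shows one way such an output layout could be produced with Python's csv module; the function name and the exact file-naming scheme are assumptions for illustration, not necessarily what the script does.

```python
# Illustrative sketch of the output layout described above:
# csv_files/<category>.csv plus img/<category>/<upc>.jpg.
import csv
import os

FIELDS = [
    "product_page_url", "universal_product_code", "title",
    "price_including_tax", "price_excluding_tax", "number_available",
    "product_description", "category", "review_rating", "image_url",
]

def save_category(category, books, images):
    """books: list of dicts keyed by FIELDS; images: dict mapping UPC -> raw JPEG bytes."""
    os.makedirs("csv_files", exist_ok=True)
    os.makedirs(os.path.join("img", category), exist_ok=True)

    # One CSV per category, with a header row and one row per book.
    with open(os.path.join("csv_files", f"{category}.csv"),
              "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(books)

    # Covers go to img/<category>/<upc>.jpg.
    for upc, data in images.items():
        with open(os.path.join("img", category, f"{upc}.jpg"), "wb") as f:
            f.write(data)
```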
Clone the repository on your system, for example into ~/scraper.
Then navigate to that folder from the command line.
First, create and activate a virtual environment:
python3 -m venv env
source env/bin/activate
(env) should now be displayed at the left of your prompt.
To install all the required libraries, run:
pip install -r requirements.txt
This will install all the dependencies necessary to run the script.
Then run the script:
python3 extract.py
You may need to use python instead of python3 depending on your system.
And voilà! After about a minute you should have two directories: "img" and "csv_files".
"csv_files" contains 50 CSV files, one per category, each listing the books in that category. "img" contains one subfolder per category (50 in total), each holding the cover images, named "<universal_product_code>.jpg".