This Jupyter Notebook scrapes information about the best-selling books on Amazon. It retrieves book names, authors, ratings, the number of customers who rated each book, and prices. The scraped data is stored in a CSV file named amazon_products.csv.
Before running the notebook, ensure that you have the following third-party libraries installed:
- pandas
- numpy
- matplotlib (includes matplotlib.dates and matplotlib.ticker)
- seaborn
- beautifulsoup4 (provides BeautifulSoup, imported from bs4)
- requests
The notebook also uses re, time, datetime, and urllib. These are part of the Python standard library and do not need to be installed.
You can install the third-party libraries by running the following command:
pip install pandas numpy matplotlib seaborn beautifulsoup4 requests
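To confirm that the installation worked, you can check whether each package is importable. The following is a minimal sketch and not part of the notebook: the helper name missing_modules is hypothetical, and bs4 is the import name under which BeautifulSoup is installed.

```python
from importlib.util import find_spec

# Import names of the third-party packages the notebook relies on.
# Note: BeautifulSoup is imported from the "bs4" module.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "bs4", "requests"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If the script reports missing packages, install them with pip and run the check again.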
- Open the Jupyter Notebook Amazon_Books_Scraper.ipynb.
- Ensure that you have the necessary libraries installed.
- Execute the code cells in the notebook by pressing Shift+Enter or by clicking the "Run" button.
- The code will scrape the information from the Amazon website and save it in the amazon_products.csv file.
- The shape of the DataFrame will be displayed as the output.
- You can modify the no_pages variable to specify the number of pages to scrape. By default, it is set to 2, but you can increase or decrease it as needed.
- The scraped data includes book names, authors, ratings, number of customers who rated the book, and prices. If you need additional information, you can modify the code accordingly.
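The role of no_pages can be pictured as a loop over page numbers. The sketch below illustrates this under assumptions and is not the notebook's actual code: the URL template, the scrape_pages helper, its fetch parameter, and the fake_fetch stand-in are all hypothetical, and the one-second pause is an arbitrary polite default.

```python
import time

no_pages = 2  # number of pages to scrape (the default mentioned above)

# Hypothetical URL template; the notebook's real URL may differ.
URL_TEMPLATE = "https://example.com/bestsellers?page={page}"


def scrape_pages(no_pages, fetch, delay_seconds=1.0):
    """Call `fetch(url)` for pages 1..no_pages and collect all returned rows.

    `fetch` is any callable that takes a URL and returns a list of dicts
    (one per book). Passing it in keeps this sketch runnable offline.
    """
    rows = []
    for page in range(1, no_pages + 1):
        url = URL_TEMPLATE.format(page=page)
        rows.extend(fetch(url))
        if page < no_pages:
            time.sleep(delay_seconds)  # pause between consecutive requests
    return rows


def fake_fetch(url):
    """Stand-in for a real download-and-parse step; returns one dummy row."""
    return [{"source_url": url}]


rows = scrape_pages(no_pages, fake_fetch, delay_seconds=0)
print(len(rows))  # one dummy row per page
```

In a real run, fetch would download the page with requests and extract the book fields with BeautifulSoup. Increasing no_pages only changes how many times that step is repeated.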
The scraped data is saved in a CSV file named amazon_products.csv. You can find this file in the same directory as the Jupyter Notebook.
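To check the output file without reopening the notebook, you can inspect it with Python's standard csv module. This is a minimal sketch: the helper name csv_shape is hypothetical, and it assumes the first row of the file is a header. If the DataFrame index was written to the file as well, the column count includes that extra column.

```python
import csv
import os


def csv_shape(path):
    """Return (data_rows, columns) for a CSV file whose first row is a header."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty file -> no columns
        n_rows = sum(1 for _ in reader)  # count the remaining data rows
    return n_rows, len(header)


if __name__ == "__main__":
    # Only inspect the file if the notebook has already produced it.
    if os.path.exists("amazon_products.csv"):
        print(csv_shape("amazon_products.csv"))
```

The returned pair is comparable to the DataFrame shape that the notebook displays.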
This code is intended for educational purposes only. Please be respectful of website scraping policies and use this code responsibly.
Assalamu Alaikum Warahmato allah