This Python-based web scraping tool allows you to extract data from web pages and save it directly to a CSV file. The scraper is designed to be flexible, handling various HTML structures and capturing key elements like images, links, headings, and section context.
- 🌐 Fetch webpage content using `requests`
- 🍲 Parse HTML with `BeautifulSoup`
- 📊 Extract multiple data points:
  - Image sources
  - Hyperlinks
  - Headings
  - Section context
- 💾 Save data to CSV, preserving duplicates
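The features above can be illustrated with a minimal sketch. It is not the project's actual implementation: the function names `extract_rows` and `write_rows`, the column names `type`, `value`, and `section`, and the choice to treat the nearest preceding heading as the "section context" are all assumptions made for this example. The HTML string would typically come from something like `requests.get(url, timeout=10).text`.

```python
import csv
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical column layout; the real tool may use different columns.
FIELDNAMES = ["type", "value", "section"]


def extract_rows(html, base_url):
    """Return one dict per image, link, or heading, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    current_heading = ""  # assumed meaning of "section context"
    tags = ["h1", "h2", "h3", "h4", "h5", "h6", "img", "a"]
    for tag in soup.find_all(tags):
        if tag.name == "img":
            if tag.get("src"):
                # Resolve relative image paths against the page URL.
                value = urljoin(base_url, tag["src"])
                rows.append({"type": "image", "value": value,
                             "section": current_heading})
        elif tag.name == "a":
            if tag.get("href"):
                value = urljoin(base_url, tag["href"])
                rows.append({"type": "link", "value": value,
                             "section": current_heading})
        else:
            # Any of h1..h6: record it and make it the current section.
            current_heading = tag.get_text(strip=True)
            rows.append({"type": "heading", "value": current_heading,
                         "section": current_heading})
    return rows


def write_rows(rows, path):
    """Write rows to CSV as-is; repeated items are kept, not de-duplicated."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Because rows are appended in document order and never filtered, an image or link that appears twice on the page produces two CSV rows, which is one way to read "preserving duplicates".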
# Clone the repository
git clone https://github.com/AjmalDevala/WebScraper.git
# Navigate to the project directory
cd WebScraper
# Install required dependencies
pip install requests beautifulsoup4
# Customize the URL and output file
url = "https://www.example.com/target-page"
output_csv = "scraped_data.csv"
scraper = WebScraperToCSV(url, output_csv)
scraper.run()
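After a run, it can be useful to check what was written. The helper below is a small sketch using only the standard library; `preview_csv` is a hypothetical name, and the assumptions that the file is UTF-8 encoded and that its first row is a header are not confirmed by this README.

```python
import csv
from itertools import islice


def preview_csv(path, limit=5):
    """Return the first row (assumed to be a header) and up to `limit` rows."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(islice(reader, limit))
    return header, rows


if __name__ == "__main__":
    # "scraped_data.csv" matches the output_csv value used above.
    header, rows = preview_csv("scraped_data.csv")
    print(header)
    for row in rows:
        print(row)
```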
- `requests`
- `beautifulsoup4`
- `csv` (built-in)
Ajmal Devala
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
This project is MIT licensed.