A flexible Python web scraping tool that extracts structured data from web pages and saves it directly to a CSV file using requests and BeautifulSoup.

AjmalDevala/WebScraper

Web Scraper to CSV

📌 Project Overview

This Python-based web scraping tool allows you to extract data from web pages and save it directly to a CSV file. The scraper is designed to be flexible, handling various HTML structures and capturing key elements like images, links, headings, and section context.
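As a rough illustration of the kind of extraction described above, the sketch below parses an HTML string with BeautifulSoup and collects images, links, and headings in document order. It is not the repository's actual implementation: `extract_rows`, `SAMPLE_HTML`, and the `type`/`value`/`section` keys are hypothetical names, and "section context" is assumed here to mean the text of the nearest preceding heading.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <section>
    <h2>Products</h2>
    <a href="/item/1"><img src="/img/1.png" alt="Item 1"></a>
    <a href="/item/2">Item 2</a>
  </section>
</body></html>
"""

HEADING_TAGS = ["h1", "h2", "h3", "h4", "h5", "h6"]


def extract_rows(html):
    """Return one dict per image, link, or heading, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tag in soup.find_all(["img", "a"] + HEADING_TAGS):
        if tag.name == "img":
            kind, value = "image", tag.get("src", "")
        elif tag.name == "a":
            kind, value = "link", tag.get("href", "")
        else:
            kind, value = "heading", tag.get_text(strip=True)

        # Assumption: "section context" is the nearest preceding heading text.
        if kind == "heading":
            context = value
        else:
            previous_heading = tag.find_previous(HEADING_TAGS)
            context = previous_heading.get_text(strip=True) if previous_heading else ""

        rows.append({"type": kind, "value": value, "section": context})
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Each element becomes its own row, so repeated links or images are kept rather than merged.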

✨ Features

  • 🌐 Fetch webpage content using requests
  • 🍲 Parse HTML with BeautifulSoup
  • 📊 Extract multiple data points:
    • Image sources
    • Hyperlinks
    • Headings
    • Section context
  • 💾 Save data to CSV, preserving duplicates
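The last feature, saving to CSV while preserving duplicates, can be sketched with the standard-library `csv` module alone. This is a minimal example under assumptions, not the project's code: `write_rows` and the column names in `FIELDNAMES` are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical column layout; the real scraper may use different columns.
FIELDNAMES = ["type", "value", "section"]


def write_rows(rows, fileobj):
    """Write every row as-is, without de-duplicating, and return the row count."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return len(rows)


# Two identical link rows: both should appear in the output.
rows = [
    {"type": "link", "value": "/home", "section": "Nav"},
    {"type": "link", "value": "/home", "section": "Nav"},
    {"type": "image", "value": "/logo.png", "section": "Header"},
]

buffer = io.StringIO()
written = write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file instead of a buffer, the `csv` documentation recommends opening it with `newline=""`, for example `open("scraped_data.csv", "w", newline="", encoding="utf-8")`.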

🚀 Installation

# Clone the repository
git clone https://github.com/AjmalDevala/WebScraper.git

# Navigate to the project directory
cd WebScraper

# Install required dependencies
pip install requests beautifulsoup4

🛠 Usage

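# WebScraperToCSV is the scraper class from this repository; import it
# (or run this inside the script that defines it) before using it.

# Customize the URL and output file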
url = "https://www.example.com/target-page"
output_csv = "scraped_data.csv"

scraper = WebScraperToCSV(url, output_csv)
scraper.run()

📦 Dependencies

  • requests
  • beautifulsoup4
  • csv (Python standard library)

👤 Author

Ajmal Devala

🔗 Connect With Me

LinkedIn GitHub Email

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

📝 License

This project is MIT licensed.
