Universal Webscraper is a tool for extracting data points such as company websites, descriptions, founders, emails, and addresses from the web, given an entity name. It accepts a CSV file with a column named `Entity` listing the entities to look up, and retrieves the requested data points for each one.
- Jina AI for scraping
- Tavily AI for internet search
You can optionally switch to FireCrawl as needed.
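For orientation, here is a minimal sketch of how these two services are commonly called from Python, using the `tavily-python` client and the public Jina Reader endpoint. This is an illustration only; the repo's actual LangGraph wiring may differ, and the query string is a placeholder:

```python
import requests
from tavily import TavilyClient  # pip install tavily-python
import os

# Find candidate pages about an entity with Tavily web search.
client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
results = client.search("Example Corp official website")

# Fetch the top hit as LLM-friendly markdown through the Jina Reader
# endpoint: prefixing any URL with https://r.jina.ai/ returns cleaned text
# (works without an API key at a reduced rate limit).
top_url = results["results"][0]["url"]
page = requests.get(f"https://r.jina.ai/{top_url}", timeout=30)
print(page.text[:500])
```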
- Clone this Repository:
  ```bash
  git clone https://github.com/jayaraj/universal-scraper-langgraph.git
  cd universal-scraper-langgraph
  ```
- Install Poetry & Create Environment:
- Install Poetry if you haven’t already:
  ```bash
  pip install poetry
  ```
- Install dependencies and activate the virtual environment:
  ```bash
  poetry install --no-root
  poetry shell
  ```
- Create a .env File:
- Obtain your API keys for OpenAI, Tavily AI, and FireCrawl.
- Update the .env file with your API keys:
  ```env
  OPENAI_API_KEY="xxxxxxxxxxxxxxxxxxx"
  TAVILY_API_KEY="xxxxxxxxxxxxxxxxxxx"
  FIRECRAWL_API_KEY="xxxxxxxxxxxxxxxxxxx"
  ```
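Assuming the app reads these values with python-dotenv (check app.py for the actual mechanism), the keys become available through the process environment:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads KEY="value" pairs from .env into the environment
openai_key = os.getenv("OPENAI_API_KEY")
assert openai_key, "OPENAI_API_KEY missing - check your .env file"
```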
- Prepare Input File:
- Update input.csv with the entity names you want to search, one per row under the `Entity` column; see the sample below.
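  A minimal input.csv looks like this (the `Entity` header is required; the company names are just placeholders):

  ```csv
  Entity
  OpenAI
  Anthropic
  Hugging Face
  ```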
Run the scraper with the following commands:
- Default run:
  ```bash
  python app.py
  ```
- Specify an input file:
  ```bash
  python app.py -f ./input.csv
  ```
  or
  ```bash
  python app.py --file ./input.csv
  ```
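For reference, `-f/--file` behaves like a standard argparse flag with `./input.csv` as the default. A minimal sketch of that pattern follows; it is not necessarily the repo's exact code, and the description string is assumed:

```python
import argparse

# Typical argparse setup for an optional -f/--file flag with a default.
parser = argparse.ArgumentParser(description="Universal Webscraper")
parser.add_argument(
    "-f", "--file",
    default="./input.csv",  # matches the default run: python app.py
    help="Path to the input CSV containing an 'Entity' column",
)
args = parser.parse_args()
print(f"Reading entities from {args.file}")
```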