uses graph algorithm (BFS) to crawl and download an entire website

aditya6992/webcrawler

webcrawler

A very basic web crawler implemented with Beautiful Soup 4 and plain Python.

It extracts the URLs from a starting web page, then visits each of them in breadth-first order, extracting the URLs from every page it reaches, and so on. With slight modification it can be used to download an entire website.
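The breadth-first traversal described above can be sketched roughly as follows. This is a minimal illustration, not the code in crawler.py: it uses the standard library's html.parser in place of Beautiful Soup so that it runs without extra dependencies, and the names LinkParser, extract_links, fetch, crawl and the max_pages limit are all assumptions made for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return the absolute http(s) URLs found in html, in document order."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.links:
        # Resolve relative links and drop "#fragment" suffixes.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https"):
            links.append(absolute)
    return links


def fetch(url):
    """Download a page as text; any network error propagates to the caller."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Visit pages breadth-first from start_url and return them in visit order."""
    queue = deque([start_url])  # FIFO queue gives breadth-first order
    seen = {start_url}          # every URL ever queued, to avoid revisits
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling crawl("http://stanford.edu/", max_pages=20) would return up to 20 URLs in the order they were visited. The seen set is what keeps the crawl from looping when pages link back to each other, and saving each html string to disk inside the loop is one way the sketch could be adapted to download a site.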

usage:

pip install -r requirements.txt

python crawler.py http://stanford.edu/

Replace http://stanford.edu/ with any URL that you'd like to crawl.
