Disclaimer: This repository is developed and released for educational purposes. Use at your own risk.
This repository crawls the 100 most-visited websites and extracts unique URLs to build
a dataset of real-world URL examples. The script described below creates an out.txt
file in which each line contains a different URL.
This project uses Node.js. We recommend running the commands below with Node 18 or later.
- To install dependencies, run
npm install
- To execute the script, run
npm start
and the output will be written to the out.txt file.
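
The extract-and-deduplicate step described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names extractUrls and writeUrls, the regular expression used to find links, and the sample HTML are all assumptions made for the example. It works on an HTML string that is already in memory, so no network requests are made.

```javascript
// Hypothetical sketch; extractUrls and writeUrls are illustrative names,
// not functions taken from this repository.
const fs = require("node:fs");

// Collect unique absolute http(s) URLs from href attributes in an HTML string.
// Relative links are resolved against baseUrl.
function extractUrls(html, baseUrl) {
  const urls = new Set(); // a Set drops duplicates automatically
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(hrefPattern)) {
    try {
      const url = new URL(match[1], baseUrl);
      if (url.protocol === "http:" || url.protocol === "https:") {
        urls.add(url.href);
      }
    } catch {
      // Skip href values that cannot be parsed as URLs.
    }
  }
  return urls;
}

// Write one URL per line to the given file path.
function writeUrls(urls, path) {
  fs.writeFileSync(path, [...urls].join("\n") + "\n");
}

// Sample input (assumed for illustration only).
const sampleHtml = `
  <a href="https://example.com/a">A</a>
  <a href="/b">B</a>
  <a href="https://example.com/a">A again</a>
  <a href="mailto:someone@example.com">Mail</a>
`;

const unique = extractUrls(sampleHtml, "https://example.com");
console.log([...unique].join("\n"));
```

Calling writeUrls(unique, "out.txt") would then produce a file in the one-URL-per-line layout described above. A regular expression is used here only to keep the sketch dependency-free; a real crawler would more likely rely on an HTML parser.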