This notebook automatically extracts more than 90,000 web articles from ilfattoquodiano.it and larepubblica.it. The script extracts the content of each article together with its metadata and stores both in a database.
Tools used:
- xml.etree.ElementTree
- pandas
- BeautifulSoup
- urlopen (from urllib)
- requests
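
As a rough illustration of how the first tool in the list could be used, the sketch below parses a sitemap-style XML document with `xml.etree.ElementTree` and collects article URLs. It assumes the article links come from a standard sitemap; the sample XML, the `example.com` URLs, and the helper name `extract_article_urls` are hypothetical and not taken from the notebook.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap snippet. The real layout of the sites' XML is an assumption.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/2020/01/01/first-article/</loc>
    <lastmod>2020-01-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/2020/01/02/second-article/</loc>
    <lastmod>2020-01-02</lastmod>
  </url>
</urlset>
"""

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_article_urls(sitemap_xml):
    """Return one dict per <url> entry, holding its location and last-modified date."""
    root = ET.fromstring(sitemap_xml)
    records = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        records.append({"url": loc.strip(), "lastmod": lastmod})
    return records


records = extract_article_urls(SAMPLE_SITEMAP)
for record in records:
    print(record["url"], record["lastmod"])
```

A list of dictionaries like `records` can be passed directly to `pandas.DataFrame(records)` to build a table, and each URL could then be fetched with `requests` and parsed with BeautifulSoup to pull out the article text.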