Wikipedia Scraper

A library for scraping data from Wikipedia, useful for Natural Language Processing, text processing, and similar tasks. The library can also preprocess the scraped text: removing punctuation, numbers, and citations, converting text to lower case, and tokenizing.

Getting Started

How to scrape data from a Wikipedia article using this library:

from wiki_scraper import WikiScraper

scraper = WikiScraper('India')

text = scraper.get_data(remove_punctuations=False, remove_numbers=False, lower_case=False, remove_citations=False, tokenization=False)
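To illustrate what the preprocessing flags passed to get_data might do, here is a minimal standalone sketch using only the Python standard library. It is not the library's actual implementation: the function name clean_text, the order of the steps, and the exact rules (for example, treating citations as bracketed numbers such as [1], and tokenizing on whitespace) are assumptions made for illustration.

```python
import re
import string


def clean_text(text, remove_punctuations=False, remove_numbers=False,
               lower_case=False, remove_citations=False, tokenization=False):
    """Hypothetical helper mirroring the flag names used by get_data."""
    if remove_citations:
        # Assumption: citations look like bracketed numbers, e.g. [1] or [23].
        text = re.sub(r"\[\d+\]", "", text)
    if remove_numbers:
        # Strip every run of digits.
        text = re.sub(r"\d+", "", text)
    if remove_punctuations:
        # Delete all ASCII punctuation characters.
        text = text.translate(str.maketrans("", "", string.punctuation))
    if lower_case:
        text = text.lower()
    if tokenization:
        # Assumption: simple whitespace tokenization; returns a list of words.
        return text.split()
    return text


sample = "India[1] has 28 states, and 8 union territories."
tokens = clean_text(sample, remove_punctuations=True, remove_numbers=True,
                    lower_case=True, remove_citations=True, tokenization=True)
print(tokens)
```

Citations are removed first in this sketch so that the later punctuation and number steps do not leave stray fragments of a marker such as [1] behind. With every flag left as False, the text is returned unchanged.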