- Download a Wikipedia dump file enwiki-*.xml.bz2 (https://dumps.wikimedia.org/enwiki/)
- Convert the downloaded XML archive to JSON by running the xmlparse.py script
- The resulting JSON can then be used as input for the scripts in ./spark/* to get data such as the articles with the most comments, the most edits by month, edits by author, etc.
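
The conversion step above can be sketched roughly as follows. This is not the repository's xmlparse.py, whose contents are not shown here; it is a minimal illustration that assumes the standard MediaWiki export layout (`<page>` elements containing a `<title>` and one or more `<revision>` elements). The function names `iter_revisions` and `convert`, the chosen output fields, and the one-JSON-object-per-line output are all assumptions made for this example.

```python
import bz2
import json
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_revisions(stream):
    """Yield one dict per <revision>, parsing incrementally.

    Hypothetical helper: the field names below are chosen for this sketch
    and may differ from what xmlparse.py actually emits.
    """
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            # <title> closes before the page's revisions, so remember it.
            title = elem.text
        elif name == "revision":
            record = {"title": title, "timestamp": None,
                      "contributor": None, "comment": None}
            for child in elem:
                child_name = local(child.tag)
                if child_name in ("timestamp", "comment"):
                    record[child_name] = child.text
                elif child_name == "contributor":
                    # Registered users have <username>, anonymous ones <ip>.
                    for sub in child:
                        if local(sub.tag) in ("username", "ip"):
                            record["contributor"] = sub.text
            yield record
            elem.clear()  # drop the revision text already processed
        elif name == "page":
            elem.clear()


def convert(dump_path, json_path):
    """Write one JSON object per line and return the number of revisions."""
    count = 0
    with bz2.open(dump_path, "rb") as src, \
            open(json_path, "w", encoding="utf-8") as dst:
        for record in iter_revisions(src):
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

A call such as `convert("enwiki-sample.xml.bz2", "revisions.json")` (both file names are placeholders) would produce line-delimited JSON, a layout that Spark's JSON reader accepts by default. Whether the scripts in ./spark/* expect these exact fields is not confirmed by this README.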
armoko/WikipediaDataAnalyzer