BA_Project

Download a Wikipedia dump file, enwiki-*.xml.bz2, from https://dumps.wikimedia.org/enwiki/.

Convert the downloaded XML archive to JSON by running the xmlparse.py script. The resulting JSON can then be used as input for any of the scripts in the spark directory.
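
The contents of xmlparse.py are not reproduced here. As a rough illustration of what the XML-to-JSON step involves, the sketch below streams a bz2-compressed dump and writes one JSON object per line. The function names (`local_name`, `iter_pages`, `convert`) and the output fields (`title`, `id`, `text`) are assumptions made for this example and may not match what xmlparse.py actually produces.

```python
import bz2
import json
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield one dict per <page> element, parsing the stream incrementally."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        # Hypothetical output schema: title, page id, and revision text.
        page = {"title": None, "id": None, "text": None}
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                page["title"] = child.text
            elif name == "id" and page["id"] is None:
                # The first <id> in document order is the page id;
                # later ones belong to the revision and contributor.
                page["id"] = child.text
            elif name == "text":
                page["text"] = child.text or ""
        yield page
        elem.clear()  # drop the page's children once it has been emitted


def convert(dump_path, json_path):
    """Write one JSON object per line and return the number of pages."""
    count = 0
    with bz2.open(dump_path, "rb") as src, \
            open(json_path, "w", encoding="utf-8") as dst:
        for page in iter_pages(src):
            dst.write(json.dumps(page, ensure_ascii=False) + "\n")
            count += 1
    return count
```

A call such as `convert("enwiki-dump.xml.bz2", "enwiki.json")` (both paths are placeholders) would produce a line-delimited JSON file. That layout is chosen here because Spark's `spark.read.json` reads line-delimited JSON by default; whether the scripts in the spark directory expect this exact layout is an assumption.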