Encoding

Personal project to decode a badly encoded JSON file. It was later used to extract statistics from conversations with friends.

I received a .json file from Facebook in which every non-English character appeared as a run of escapes such as \u00ce\u00b1 instead of α.

Fancy general solution: Decoding.py
Take each word as binary data, decode it to hex format, and then decode it with the correct "utf-8" encoding.
Problem: For some reason this removes all formatting (tabs, line breaks, etc.).

First try solution: First_Try.py
Find the 'content' key. On that line, take the sentence, encode it to latin-1, and then decode it as utf-8. Then add the 'content: ' prefix back to keep the formatting.
Problem: This keeps the formatting but is specific to my case.

I also played around with converting the .py file into an .exe.

	\u00ce\u009a\u00ce\u00b1\u00ce\u00bb\u00ce\u00b7\u00ce\u00bc\u00ce\u00ad\u00cf\u0081\u00ce\u00b1 \u00f0\u009f\u0098\u0098

Is decoded to:

	Καλημέρα 😘
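
The round trip on the sample above can be sketched in a few lines of Python. This is a minimal illustration, not the code in First_Try.py or Decoding.py; the helper name `fix_mojibake` is made up for the example. It assumes each `\u00XX` escape stands for one raw UTF-8 byte, which is why encoding to latin-1 (one byte per character) and decoding as utf-8 recovers the text.

```python
import json

# The escaped sample from above, quoted so that it is a valid JSON string.
raw = (
    r'"\u00ce\u009a\u00ce\u00b1\u00ce\u00bb\u00ce\u00b7'
    r'\u00ce\u00bc\u00ce\u00ad\u00cf\u0081\u00ce\u00b1 '
    r'\u00f0\u009f\u0098\u0098"'
)


def fix_mojibake(text: str) -> str:
    """Reinterpret characters in U+0000..U+00FF as UTF-8 bytes (hypothetical helper)."""
    return text.encode("latin-1").decode("utf-8")


mojibake = json.loads(raw)      # every \u00XX escape becomes a single character
fixed = fix_mojibake(mojibake)  # latin-1 maps each character back to one byte
print(fixed)                    # Καλημέρα 😘
```

The last four escapes (`\u00f0\u009f\u0098\u0098`) are the four UTF-8 bytes of a single emoji, so the same step handles characters outside the Greek range as well.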

TO-DO:

Find a solution that is as general as Decoding.py and keeps the formatting like First_Try.py.
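
One possible direction for this TO-DO, sketched under assumptions: parse the file with the `json` module, repair every string in the parsed structure, and write it back with indentation. The function names (`fix_text`, `fix_value`, `fix_file`) and the two-space indent are hypothetical choices for illustration. The output is re-indented rather than byte-identical to the input, so this keeps the structure readable but does not preserve the original whitespace exactly.

```python
import json


def fix_text(text):
    """Repair one string; return it unchanged if it is not mis-encoded."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as latin-1 bytes, or not valid UTF-8: leave as is.
        return text


def fix_value(value):
    """Walk dicts and lists recursively and repair every string found."""
    if isinstance(value, str):
        return fix_text(value)
    if isinstance(value, list):
        return [fix_value(item) for item in value]
    if isinstance(value, dict):
        return {fix_text(key): fix_value(item) for key, item in value.items()}
    return value  # numbers, booleans and None pass through untouched


def fix_file(src, dst):
    """Read src, repair all strings, and write readable JSON to dst."""
    with open(src, encoding="utf-8") as handle:
        data = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        # ensure_ascii=False writes real characters instead of \u escapes.
        json.dump(fix_value(data), handle, ensure_ascii=False, indent=2)
```

Because it works on the parsed data instead of searching for the 'content' key, the sketch is not tied to one field. The `try`/`except` is a heuristic: plain ASCII text and already-correct strings are left alone, but a genuine latin-1 string that happens to form valid UTF-8 bytes would be altered.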
