Improvements and usability #6

Open
wants to merge 4 commits into
base: master
42 changes: 29 additions & 13 deletions README.md
@@ -1,4 +1,4 @@
# Chat_Analyzer<br><br>
# WhatsApp Chat Analyzer<br><br>

[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)<br>
[![GitHub license](https://img.shields.io/github/license/Naereen/StrapDown.js.svg)](https://github.com/subahanii/Whatsapp-Chat-Analyzer/blob/master/LICENSE)
@@ -7,13 +7,13 @@


## INTRODUCTION:
This script for whatsapp group/individual chat analyzer .<br>
This script is able to analyse all activity happened in whatsapp group and visualize all thing through matplotlib library.
This script analyses WhatsApp group and individual chats.<br>
It can analyse all the activity that has happened in a WhatsApp group or chat and visualize it in graphs using the matplotlib library.
<br><br><br>
## Application:
1- Count total chat.<br>
2- Count total chat person wise.<br>
3- showing top active member.<br>
3- Showing top active member.<br>
4- How many messages are deleted during conversation.<br>
5- Identify more conversation days.<br>
6- Identify current Admin and how many admin changed till now.<br>
@@ -28,18 +28,34 @@ This script is able to analyse all activity happened in whatsapp group and visua

<br><br><br>
## Prerequisites:
1- Python interprater with matplotlib,pandas,NumPy library*<br>
1- Python interpreter with matplotlib, pandas and NumPy library*<br>
2- Jupyter NoteBook (optional)<br>
3- Whatsapp chat data* (.txt file)--->Dataset<br>
## How to found chat data from Whatsapp:
1-open whatsapp group<br>
2-tap right-uper corner and goto more option<br>
3- tap on "Export Chat" <br>
4- tap on "Without Media"<br>
5- finally you will get .txt file well done.<br>
3- whatsApp chat data* (.txt file)--->Dataset<br>

## Environment setup
1) Clone repo to your machine.
```sh
git clone https://github.com/subahanii/Whatsapp-Chat-Analyzer
```
2) Install dependencies.
```sh
pip install -r requirements.txt
```
3) Set path of the whatsApp chat data* (.txt file) in "chatAnalyzer.py" file
```python
file_path = "C:\\Users\\asus\\Downloads\\WhatsApp Chat with Unemployed peeps.txt"
```
4) Finally you can run the program.<br>

## How to find chat data from whatsApp:
1- Open the WhatsApp group<br>
2- Tap the upper-right corner and go to the more options menu<br>
3- Tap on "Export Chat" <br>
4- Tap on "Without Media"<br>
5- Finally you will get the .txt file, well done.<br>

### Contributions :smiley:[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/subahanii/Whatsapp-Chat-Analyzer/issues)
Contributions always welcome .
Contributions are always welcome.

### Thank You.:pray:
##### If you like this please appreciate by giving start and Fork this repo.Thank You...! :clap: :clap: :clap:
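
Step 3 of the environment setup hard-codes a Windows path, and `chatAnalyzer.py` derives the plot title from it by splitting on backslashes and dropping the last four characters. A minimal sketch of the same idea using the standard library's `pathlib`, assuming the path uses Windows separators as in the README; `chat_title` is a hypothetical helper and not part of this pull request:

```python
from pathlib import PureWindowsPath


def chat_title(file_path):
    """Return the file name without its extension (hypothetical helper)."""
    # PureWindowsPath parses backslash-separated paths on any operating system
    # and also accepts forward slashes.
    return PureWindowsPath(file_path).stem


file_path = "C:\\Users\\asus\\Downloads\\WhatsApp Chat with Unemployed peeps.txt"
print(chat_title(file_path))
```

Using `.stem` avoids relying on the extension being exactly four characters long.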
147 changes: 73 additions & 74 deletions chatAnalyzer.py
@@ -1,112 +1,111 @@
"""
Author : Ghulam Mohiyuddin
What is this: This is a whatsapp group/indivisual chat analyser
What is this: This is a whatsapp group/indivisual chat analyser
Note: This is not a final, wait for more functionality
"""



import os
import re
import pandas as pd
import matplotlib.pyplot as plt
link="C:\\Users\\asus\\Downloads\\WhatsApp Chat with Unemployed peeps.txt"
tit=link.split("\\")
title=tit[-1]
title1=title[:len(title)-4:]
print(title1)
cht=open(link,encoding="utf8")
list_of_date_time_author_msg=[]
total_msg=0
total_msg_and_notification=0
list_of_notification=[]
total_valid_msg=0

file_path = "C:\\Users\\MEEZAN MALEK\\Desktop\\WhatsApp Chat with Takvim.txt"
array_of_path = file_path.split("\\")

# Selecting the last element of array as title and removing last 4 characters ".txt"
title = (array_of_path[-1])[: len(array_of_path[-1]) - 4]
print(title)

cht = open(file_path, encoding="utf8")
list_of_date_time_author_msg = []
total_msg = 0
total_msg_and_notification = 0
list_of_notification = []
total_valid_msg = 0


def startsWithDate(s):
    #ReGex finding date and time
    # ReGex finding date and time
    pattern = r"^([0-2][0-9]|(3)[0-1])(\/)(([0-9])|((0)[0-9])|((1)[0-2]))(\/)(\d{2}|\d{4}), ([0-9][0-9]|[0-9]):([0-9][0-9])"
    result = re.match(pattern, s)
    if result:
        return True
    return False

def findColon(s):#to know msg is valid or not
    n=len(s)
    c=0
    for i in range(n):
        if s[i]==":":
            c+=1
    return c #return no. of colon in a msg if 0 then this msg is not a valid msg






def findColon(s): # to know msg is valid or not
    n = len(s)
    c = 0
    for i in range(n):
        if s[i] == ":":
            c += 1
    return c # return no. of colons in a msg, if 0 then this msg is not a valid msg.


while 1:
    rd=cht.readline()
    if not rd:break

    total_msg_and_notification+=1
    if startsWithDate(rd):#to know msg is start with date or no not.

        splitLine=rd.split("-")
        dateTime=splitLine[0]

        date,time=dateTime.split(',')
        total_msg+=1

        if findColon(splitLine[1])>0:# to know this line is genuene msg or notification .
            total_valid_msg+=1
            authorMsg=splitLine[1].split(":")


            author= authorMsg[0][:15]+".."
            msg=authorMsg[1::]

            list_of_date_time_author_msg.append([date,time,author,msg])

        else:
            list_of_notification.append(splitLine[1])#collect all notification such as: some add someone,someone join this group via link etc.

    current_line = cht.readline()
    if not current_line:
        break

    total_msg_and_notification += 1
    if startsWithDate(current_line): # To check the msg starts with date or not.

        splitLine = current_line.split("-")
        dateTime = splitLine[0]

        date, time = dateTime.split(',')
        total_msg += 1



print("\n\nTotal msg-",total_msg,"\nTotal valid msg-",total_valid_msg,"\ntotal_msg_and_notification",total_msg_and_notification)
        if findColon(splitLine[1]) > 0: # To know this line is genuine msg or notification.
            total_valid_msg += 1
            authorMsg = splitLine[1].split(":")

df=pd.DataFrame(list_of_date_time_author_msg,columns=["Date","Time","GroupMember","Message"])
            author = authorMsg[0][:15] + ".."
            msg = authorMsg[1::]

l=dict(df['GroupMember'].value_counts())
xval=[]
yval=[]
for x,y in l.items():
            list_of_date_time_author_msg.append([date, time, author, msg])

        else:
            list_of_notification.append(splitLine[1])
            # collect all notification such as: someone added new member,someone join this group via file_path etc.

print("\n\nTotal msg =", total_msg, "\nTotal valid msg =", total_valid_msg, "\nTotal msg and notification =",
      total_msg_and_notification)

df = pd.DataFrame(list_of_date_time_author_msg, columns=["Date", "Time", "GroupMember", "Message"])

l = dict(df['GroupMember'].value_counts())
xval = []
yval = []
for x, y in l.items():
    xval.append(x)
    yval.append(y)


def showAll():
    plt.figure(figsize=(len(l) * 0.25, 10))
    plt.bar(xval, yval, width=0.8)

    plt.figure(figsize=(len(l)*0.25,10))
    plt.bar(xval,yval,width=0.8)

    plt.title("Group: "+title1)
    plt.title("Group: " + title)
    plt.xlabel("Group Members")
    plt.ylabel("Number of messages")
    plt.xticks(xval,rotation=90)
    plt.xticks(xval, rotation=90)


showAll()
c3=0
def autolabel(x,y):
c3 = 0


def autolabel(x, y):
    global c3
    for i in range(len(x)):
        plt.text(x[i], y[i] + 5, str(y[i]), ha='center', rotation=90, color='red')
        c3 += y[i]


        plt.text(x[i],y[i]+5,str(y[i]),ha='center',rotation=90,color='red')
        c3+=y[i]
autolabel(xval,yval)
plt.text(len(l)-len(l)//2,len(l), 'Total Active Members: '+str(len(l))+", Total Message-"+str(c3),color='red')
autolabel(xval, yval)
plt.text(len(l) - len(l) // 2, len(l), 'Total Active Members: ' + str(len(l)) + ", Total Message-" + str(c3),
         color='red')

plt.show()
#plt.savefig('test.png')
# plt.savefig('test.png')
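
The parsing loop in `chatAnalyzer.py` splits each line on every `-` and every `:`, so a message that itself contains a hyphen is cut at that hyphen, and one that contains a colon (a time such as 10:30, a URL) is stored as a list of pieces. A minimal sketch of one alternative, assuming exported lines have the shape `date, time - author: message`; `parse_line` and `LINE_START` are hypothetical names, and the regex is a simplified stand-in for the one in `startsWithDate`, not the PR's code:

```python
import re

# Simplified date/time prefix check (assumption: day/month/year, hour:minute).
LINE_START = re.compile(r"^\d{1,2}/\d{1,2}/(?:\d{4}|\d{2}), \d{1,2}:\d{2}")


def parse_line(line):
    """Return (date, time, author, message), or None if the line is not a message."""
    if not LINE_START.match(line):
        return None  # continuation of a multi-line message, or unrelated text
    # Split only at the first " - ", so hyphens inside the message survive.
    stamp, sep, rest = line.rstrip("\n").partition(" - ")
    if not sep:
        return None
    # Split only at the first ": ", so colons inside the message survive.
    author, sep, message = rest.partition(": ")
    if not sep:
        return None  # notification such as "Alice added Bob"
    date, time = stamp.split(", ", 1)
    return date, time, author, message


print(parse_line("12/05/20, 9:15 pm - Alice: see you at 10:30 - maybe"))
```

Because `str.partition` splits at the first occurrence only, the message text is kept as a single string.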
2 changes: 2 additions & 0 deletions requirements.txt
@@ -0,0 +1,2 @@
pandas
matplotlib
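
The README lists NumPy as a prerequisite while `requirements.txt` names only pandas and matplotlib; NumPy is normally pulled in as a dependency of pandas. A small sketch for checking which packages are actually installed before running the script, using the standard library's `importlib.metadata` (Python 3.8+); `missing_packages` is a hypothetical helper, not part of this pull request:

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print(missing_packages(["pandas", "matplotlib", "numpy"]))
```

An empty list means every listed package is available in the current environment.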