Lab3 - Word count in text

The aim of this lab is to compute statistics on word occurrences in a text (here, Romeo & Juliet by Shakespeare) using a dictionary.
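As a rough illustration of the dictionary-based counting described above, here is a minimal sketch in Python. The language, the function name `count_words`, the tokenization rule (lowercase runs of letters and apostrophes), and the sample sentence are all assumptions made for this example. They are not taken from the lab's actual code.

```python
import re

def count_words(text):
    """Return a dict mapping each lowercase word to its number of occurrences."""
    counts = {}
    # Assumed tokenization: runs of letters and apostrophes, case-insensitive.
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts

# Hypothetical sample; the lab itself uses the full text of the play.
sample = "Romeo, Romeo! Wherefore art thou Romeo?"
counts = count_words(sample)

# Sort (word, count) pairs from most to least frequent.
most_frequent = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
print(most_frequent[:3])
```

Sorting the dictionary items by count gives both ends of the ranking: the head of the list holds the most frequent words, the tail the least frequent ones.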

Many of the least frequent words have the same number of occurrences, so we did not plot a graph for them. Please refer to the printed dictionary instead.

Comment: The counts of the most frequent words are widely spread, while the counts of the least frequent words are concentrated around the same few values.

For the bonus question, the graph looks like a straight line. Zipf's law states that, given a large sample of words, the frequency of any word is inversely proportional to its rank in the frequency table. This means that if we plot word count against rank on a log-log graph for a large enough sample, the points should fall approximately on a straight line.
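One way to check the straight-line claim numerically is to fit a least-squares line to log(count) against log(rank). If frequency is exactly inversely proportional to rank, the slope of that line is -1. The sketch below is in Python and assumes the word counts are available as a dictionary. The function name `zipf_slope` and the synthetic `ideal` data are hypothetical and are not results from the lab.

```python
import math

def zipf_slope(counts):
    """Least-squares slope of log(count) versus log(rank).

    Requires at least two words so that the fit is defined.
    """
    # Rank 1 is the most frequent word.
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic data that follows Zipf's law exactly: count = 1000 / rank.
ideal = {f"w{rank}": 1000 / rank for rank in range(1, 51)}
slope = zipf_slope(ideal)
print(round(slope, 6))
```

On real text the fitted slope is only approximately -1. The fit is usually worst among the rarest words, where many words share the same count.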

Built With

Authors