emschorsch/nlp-midterm-project
README file for the midterm project for cs65
Steve Dini and Emanuel Schorsch
==============================================

Contents of this directory:

i. trie.py
   - Class implementation for the trie data structure. Supported external
     methods include insert, build, successor and predecessor counts.

ii. counts.py, written by Steve Dini
   - Basic implementation for word segmentation based on just successor and
     predecessor counts as explained in the Harris paper.

iii. varieties.py, written by Emanuel Schorsch
   - Contains the other implementations based on the Hafer paper.
     Implemented methods include:
     a) Reverse cutoff (k=14)
     b) Reverse cutoff (k=22)
     c) Duo cutoff (k1=2, k2=4)
     d) Sum cutoff (k=22)
     e) Duo Peaks
     f) Sum Peaks
     g) Negative Frequency

iv. dejean.py, written by Steve Dini
   - Contains an implementation of Dejean's algorithm, absent the contextual
     segmentation described as the last step.

v. stats.py
   - Helper module for getting values for cuts made, number of expected
     correct cuts, as well as the number of correct cuts actually made.
   - Also has support for computing precision and recall.
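
For readers unfamiliar with successor counts, the idea behind trie.py and stats.py can be sketched roughly as follows. This is an illustrative sketch only, not the code in this repository: the names TrieNode, Trie, successor_count, successor_counts and precision_recall are hypothetical, and the sketch assumes that end-of-word is not counted as a successor. Under the same assumptions, predecessor counts could be obtained by building a second trie over the reversed words.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a lexicon word ends at this node


class Trie:
    """Hypothetical minimal trie; the real trie.py may differ."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def build(self, words):
        for word in words:
            self.insert(word)

    def successor_count(self, prefix):
        """Number of distinct characters that follow `prefix` in the lexicon."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return 0  # prefix never occurs in the lexicon
        return len(node.children)


def successor_counts(trie, word):
    """Successor count after each proper, non-empty prefix of `word`."""
    return [trie.successor_count(word[:i]) for i in range(1, len(word))]


def precision_recall(predicted_cuts, gold_cuts):
    """Precision and recall of predicted cut positions against gold cuts."""
    predicted, gold = set(predicted_cuts), set(gold_cuts)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Small made-up lexicon, for illustration only.
trie = Trie()
trie.build(["read", "reads", "reading", "reader", "real", "red"])
counts = successor_counts(trie, "reads")
```

In this toy lexicon the count rises after "read", which is the kind of signal a cutoff or peak method would use to propose a cut at that position. The predicted cut positions can then be scored against hand-marked cuts with precision_recall.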