README file for the midterm project for cs65
Steve Dini and Emanuel Schorsch
==============================================
Contents of this directory:
i. trie.py
-Class implementation of the trie data structure. Public methods include
insert, build, and successor and predecessor counts.
ii. counts.py written by Steve Dini
-Basic implementation of word segmentation based only on successor and
predecessor counts, as described in the Harris paper.
iii. varieties.py written by Emanuel Schorsch
-Contains the remaining segmentation methods, based on the Hafer paper.
Implemented methods:
a) Reverse cutoff (k=14)
b) Reverse cutoff (k=22)
c) Duo cutoff (k1=2, k2=4)
d) Sum cutoff (k=22)
e) Duo Peaks
f) Sum Peaks
g) Negative Frequency
iv. dejean.py written by Steve Dini
-Contains an implementation of Dejean's algorithm, omitting the contextual
segmentation described as its last step.
v. stats.py
-Helper module that reports the number of cuts made, the number of expected
correct cuts, and the number of correct cuts actually made.
-Also supports computing precision and recall.
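
Illustrative sketch (not the project's actual code):
The snippet below is a rough sketch of how the pieces listed above might fit
together: a trie that reports successor counts, a simple cutoff rule that cuts
wherever the count reaches a threshold, and precision/recall computed from the
three values stats.py reports. Every name and signature here (Trie, build,
successor_count, cut_positions, precision_recall) and the threshold k are
assumptions made for illustration; they are not taken from trie.py, counts.py,
or stats.py, and the real modules may differ. The sketch also ignores
end-of-word markers and predecessor counts to stay short.

```python
from collections import defaultdict


class Trie:
    """Minimal trie. Hypothetical API, not the one in trie.py."""

    def __init__(self):
        self.children = defaultdict(Trie)

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children[ch]  # creates the child on first visit

    def successor_count(self, prefix):
        """Number of distinct letters that follow `prefix` in the corpus."""
        node = self
        for ch in prefix:
            if ch not in node.children:
                return 0
            node = node.children[ch]
        return len(node.children)


def build(words):
    """Build a trie from an iterable of words (assumed helper)."""
    trie = Trie()
    for word in words:
        trie.insert(word)
    return trie


def cut_positions(trie, word, k):
    """Cut after word[:i] whenever its successor count is at least k.

    The threshold rule is a simplified stand-in for a cutoff method.
    """
    return [i for i in range(1, len(word))
            if trie.successor_count(word[:i]) >= k]


def precision_recall(cuts_made, correct_cuts, expected_cuts):
    """Precision = correct / made, recall = correct / expected."""
    precision = correct_cuts / cuts_made if cuts_made else 0.0
    recall = correct_cuts / expected_cuts if expected_cuts else 0.0
    return precision, recall


# Toy corpus and gold segmentation, both invented for this example.
corpus = ["reads", "reader", "reading", "readable", "red", "rest"]
trie = build(corpus)
gold = [4]                                   # read|ing
made = cut_positions(trie, "reading", k=3)   # cuts proposed by the rule
correct = len(set(made) & set(gold))
print(made, precision_recall(len(made), correct, len(gold)))
```

A predecessor count can be obtained the same way by building a second trie
over the reversed words and querying it with the reversed suffix.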