Segmentation metrics overview

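| Metric | Needs ref | Boundary types | Near misses | Different sequences | Implementation |
| --- | --- | --- | --- | --- | --- |
| F1 | ✔️(1) | | | | |
| Pk | ✔️(1) | | ✔️ | | SegEval |
| WindowDiff | ✔️(1) | | ✔️ | | SegEval |
| Segmentation Similarity | ✔️(1) | ✔️ | ✔️ | | SegEval |
| Boundary Similarity | ✔️(1) | ✔️ | ✔️ | | SegEval |
| BLEU(-br) | ✔️(n) | ✔️ | ✔️ | ✔️ | SacreBLEU? |
| TER-br | ✔️(n) | ✔️ | ✔️ | ✔️ | TER |
| S-BLEU | ✔️(n) | ✔️ | ✔️ | ✔️ | SacreBLEU |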

Pk

Beeferman99statistical

Measures the probability that two units k steps apart are classified inconsistently, that is, placed in the same segment by one segmentation and in different segments by the other. k is set to half of the average reference segment size, and a probe of length k is moved across the text. At each position, the metric checks whether the two ends of the probe fall in the same or in different segments in the reference segmentation, and increments an error counter if the algorithm's segmentation disagrees. The resulting count is scaled to between 0 and 1 by dividing by the number of measurements taken (lower is better).

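$$P_k(\mathrm{ref}, \mathrm{hyp}) = \frac{1}{N-k} \sum_{i=1}^{N-k} \left[\, \delta_{\mathrm{ref}}(i, i+k) \neq \delta_{\mathrm{hyp}}(i, i+k) \,\right]$$

where N is the number of units, $\delta_x(i, j)$ is 1 if units i and j lie in the same segment of segmentation x and 0 otherwise, and the bracket evaluates to 1 when the condition holds and to 0 otherwise.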
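The probe procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the SegEval implementation: the function name `pk` and the input format (one segment label per unit) are assumptions made for the example.

```python
def pk(ref, hyp, k=None):
    """Pk for two segmentations given as per-unit segment labels.

    ref, hyp: equal-length sequences such as [0, 0, 0, 1, 1, 2]
    (hypothetical input format chosen for this sketch).
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must cover the same units")
    n = len(ref)
    if k is None:
        # Half of the average reference segment size, at least 1.
        n_segments = 1 + sum(ref[i] != ref[i + 1] for i in range(n - 1))
        k = max(1, round(n / n_segments / 2))
    if n <= k:
        raise ValueError("text is too short for a probe of length k")
    errors = 0
    for i in range(n - k):
        same_in_ref = ref[i] == ref[i + k]
        same_in_hyp = hyp[i] == hyp[i + k]
        # Count a disagreement between reference and hypothesis.
        errors += same_in_ref != same_in_hyp
    return errors / (n - k)


ref = [0, 0, 0, 1, 1, 1, 2, 2, 2]
hyp = [0, 0, 0, 0, 1, 1, 2, 2, 2]  # first boundary placed one unit late
print(pk(ref, hyp, k=2))  # 2 of 7 probe positions disagree
```

With the first boundary shifted by one unit, only the probe positions that straddle the misplaced boundary are penalized, which is how Pk gives partial credit to near misses.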

WindowDiff

Pevzner02critique

Slides a window of length k over the text. For each position i of the window, compares the number of reference segmentation boundaries that fall in this interval, b(ref,i,i+k), with the number of boundaries that are assigned by the algorithm, b(hyp,i,i+k). The algorithm is penalized whenever b(ref,i,i+k) != b(hyp,i,i+k), and the penalties are divided by the number of window positions.

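$$\mathrm{WindowDiff}(\mathrm{ref}, \mathrm{hyp}) = \frac{1}{N-k} \sum_{i=1}^{N-k} \left[\, b(\mathrm{ref}, i, i+k) \neq b(\mathrm{hyp}, i, i+k) \,\right]$$

where N is the number of units, b(x, i, i+k) is the number of boundaries of segmentation x between positions i and i+k, and the bracket evaluates to 1 when the condition holds and to 0 otherwise.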
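The window comparison described above can be sketched as follows. As before, this is only an illustration under assumed names (`count_boundaries`, `window_diff`) and an assumed input format of one segment label per unit; SegEval's own interface differs.

```python
def count_boundaries(labels, i, j):
    """Number of segment boundaries between unit i and unit j."""
    return sum(labels[t] != labels[t + 1] for t in range(i, j))


def window_diff(ref, hyp, k):
    """WindowDiff for two segmentations given as per-unit segment labels."""
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must cover the same units")
    n = len(ref)
    if n <= k:
        raise ValueError("text is too short for a window of length k")
    errors = sum(
        count_boundaries(ref, i, i + k) != count_boundaries(hyp, i, i + k)
        for i in range(n - k)
    )
    return errors / (n - k)


ref = [0, 0, 0, 1, 1, 1, 2, 2, 2]
hyp = [0, 0, 0, 0, 1, 1, 2, 2, 2]  # first boundary placed one unit late
print(window_diff(ref, hyp, k=2))  # 2 of 7 windows have differing counts
```

Because the comparison is on boundary counts rather than on a same/different decision, a window in which the hypothesis has two boundaries and the reference has one is also penalized.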

Segmentation Similarity

Fournier12segmentation

Proportion of boundaries that are not transformed (added/deleted, substituted) when comparing them using edit distance (transposition allowed up to n steps).

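$$S(s_1, s_2) = \frac{\mathrm{pb} - d(s_1, s_2, n)}{\mathrm{pb}}$$

where pb is the number of potential boundaries (t(m-1) for t boundary types over a text of m units) and d(s_1, s_2, n) is the boundary edit distance between the two segmentations, with transpositions allowed up to n steps.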

Boundary Similarity

Fournier13evaluating

New weights and new normalization for boundary edit distance. Assuming that boundary edit distance produces sets of edit operations where A is the set of additions/deletions, T the set of n-wise transpositions, S the set of substitutions, and M the set of matching boundary pairs, boundary similarity can be defined as:

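$$B = 1 - \frac{|A| + w_t(T) + w_s(S)}{|A| + |T| + |S| + |M|}$$

where w_t weights each transposition by the distance it spans and w_s weights each substitution by the distance between the two boundary types.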
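To make the normalization concrete, the following sketch computes boundary similarity from the output of a boundary edit distance that is assumed to have been run already. The function name and its arguments are hypothetical: each transposition and substitution is passed in as a precomputed weight between 0 and 1, and the edit distance itself is not implemented here.

```python
def boundary_similarity(n_add_del, transposition_weights,
                        substitution_weights, n_matches):
    """Boundary similarity from precomputed edit operations.

    n_add_del:             |A|, number of additions/deletions
    transposition_weights: one weight in [0, 1] per transposition in T
    substitution_weights:  one weight in [0, 1] per substitution in S
    n_matches:             |M|, number of matching boundary pairs
    """
    total = (n_add_del + len(transposition_weights)
             + len(substitution_weights) + n_matches)
    if total == 0:
        # No boundaries on either side: treated as full agreement
        # (a choice made for this sketch).
        return 1.0
    penalty = (n_add_del + sum(transposition_weights)
               + sum(substitution_weights))
    return 1 - penalty / total


# One missed boundary, one near miss weighted 0.5, two exact matches.
print(boundary_similarity(1, [0.5], [], 2))  # 1 - 1.5 / 4 = 0.625
```

Unlike Segmentation Similarity, the denominator counts only boundaries that were actually placed or edited, so long texts with few boundaries do not inflate the score.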

BLEU(-br)

karakanta2042

BLEU computed with the data containing breaks as special symbols. Each break symbol counts as an extra token that contributes to the score.
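The effect of treating breaks as tokens can be seen on the clipped n-gram precision that BLEU is built from. The sketch below is not a full BLEU implementation (no brevity penalty, no geometric mean over n-gram orders), and the break symbols `<eol>` and `<eob>` are assumed here for illustration; for real scoring a tool such as SacreBLEU should be used.

```python
from collections import Counter

BREAKS = {"<eol>", "<eob>"}  # assumed break symbols


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hyp_tokens, ref_tokens, n):
    """Clipped n-gram precision of hyp against a single reference."""
    hyp_counts = ngrams(hyp_tokens, n)
    ref_counts = ngrams(ref_tokens, n)
    matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    total = sum(hyp_counts.values())
    return matched / total if total else 0.0


def strip_breaks(tokens):
    """Remove break symbols, leaving only the words."""
    return [t for t in tokens if t not in BREAKS]


ref = "I saw the cat <eol> on the mat <eob>".split()
hyp = "I saw the <eol> cat on the mat <eob>".split()

# Same words, different break position.
print(clipped_precision(strip_breaks(hyp), strip_breaks(ref), 2))  # 1.0
print(clipped_precision(hyp, ref, 2))  # 0.625, 5 of 8 bigrams match
```

The words are identical in both sentences, so the score without breaks is perfect; once the break symbols are kept as tokens, the misplaced `<eol>` breaks three bigrams and lowers the precision.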

TER-br

karakanta2042

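TER computed after masking every word of the sentence, so that only the break symbols remain distinct and the score reflects edits to break positions alone.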
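The masking step can be sketched as follows. Two simplifications are assumed: the break symbols are written `<eol>` and `<eob>`, and the edit distance is plain word-level Levenshtein without the shift operation of real TER, so the value can overestimate what a TER tool would report.

```python
BREAKS = {"<eol>", "<eob>"}  # assumed break symbols


def mask_words(sentence, mask="<w>"):
    """Replace every token that is not a break symbol with a mask token."""
    return [tok if tok in BREAKS else mask for tok in sentence.split()]


def edit_distance(a, b):
    """Levenshtein distance over tokens (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def ter_br(hyp, ref):
    """Edits between masked hyp and ref, divided by masked ref length."""
    h, r = mask_words(hyp), mask_words(ref)
    if not r:
        raise ValueError("reference must not be empty")
    return edit_distance(h, r) / len(r)


ref = "I saw the cat <eol> on the mat <eob>"
hyp = "I saw the <eol> cat on the mat <eob>"
print(ter_br(hyp, ref))  # 2 edits over 9 reference tokens
```

Since every word becomes the same mask token, translation differences no longer contribute and only the misplaced break is counted.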

S mode BLEU (S-BLEU)

Matusov19customizing

Subtitle BLEU. Calculates BLEU on subtitles instead of sentences, so that any target words that appear in the wrong subtitle count as errors. Assumes that the subtitles in the target and the reference are aligned one to one.

Conformity to the subtitle constraint of length (CPL_conf)

Subtitles should not exceed a specific length. A subtitle block conforms if it has at most 2 lines of at most n characters each, where n is 42 according to the TED subtitling guidelines. CPL_conf is the percentage of subtitles in the corpus that conform to this length constraint.
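A minimal sketch of this check is shown below. The function name `cpl_conf` and the input format (one string per subtitle block, with lines separated by `\n`) are assumptions for the example, as is counting spaces as characters.

```python
def cpl_conf(subtitles, max_chars=42, max_lines=2):
    """Percentage of subtitle blocks that respect the length constraint.

    subtitles: list of strings, one per subtitle block, with lines
    separated by a newline (hypothetical input format).
    """
    if not subtitles:
        return 0.0
    conforming = 0
    for block in subtitles:
        lines = block.split("\n")
        if len(lines) <= max_lines and all(len(line) <= max_chars for line in lines):
            conforming += 1
    return 100 * conforming / len(subtitles)


subs = [
    "Short line",              # conforms
    "x" * 43,                  # one line, too long
    "a\nb\nc",                 # three lines
    "x" * 42 + "\n" + "y" * 42,  # two lines at the limit, conforms
]
print(cpl_conf(subs))  # 50.0
```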

Adapting standard metrics via alignment

MWER (Matusov et al., 2006)