Development of API-based errant_compare #50

Open: wants to merge 9 commits into base: main
76 changes: 76 additions & 0 deletions README.md
@@ -1,3 +1,79 @@
# API-based ERRANT Compare

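This fork adds a Python API for `errant_compare`, so a hypothesis can be scored against one or more references from within Python, either from raw text or from precomputed edits.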

- From raw text
```py
import errant
orig_raw = ['This are gramamtical sentence .']
cor_raw = ['This is grammatical sentence .']
# refs_raw: List[List[str]] with shape (num_annotations, num_sents)
# This contains two annotations for one sentence
refs_raw = [
    ['This is a grammatical sentence .'],
    ['These are grammatical sentences .']
]
output: errant.compare.ERRANTCompareOutput = errant.compare_from_raw(
    orig=orig_raw,
    cor=cor_raw,
    refs=refs_raw,
    beta=0.5,  # beta for the F-score
    cat=3,  # can be 1, 2, or 3
    single=False,
    multi=False,
    mode='cs',  # can be 'cs', 'ds', 'dt', or 'cse'
    filt=[]  # error types to filter (empty list: no filtering)
)
print('Overall score')
print(output.overall) # You can also use output.overall.tp (or .fp .fn .precision .recall .f)
print('Error type based scores')
print(output.etype)
print('Selected reference-id for each input')
print(output.best_ref_ids)

'''Output
Overall score
[TP=2, FP=0, FN=1, Precision=1.0, Recall=0.6666666666666666, F_0.5=0.9090909090909091]
Error type based scores
{
'R:VERB:SVA': [TP=1, FP=0, FN=0, Precision=1.0, Recall=1.0, F_0.5=1.0],
'M:DET': [TP=0, FP=0, FN=1, Precision=1.0, Recall=0.0, F_0.5=0.0],
'R:SPELL': [TP=1, FP=0, FN=0, Precision=1.0, Recall=1.0, F_0.5=1.0]
}
Selected reference-id for each input
[0]
'''
```

- From edits
```py
import errant
import pprint
annotator = errant.load('en')
orig_raw = ['This are gramamtical sentence .']
cor_raw = ['This is grammatical sentence .']
# refs_raw: List[List[str]] with shape (num_annotations, num_sents)
# This contains two annotations for one sentence
refs_raw = [
    ['This is a grammatical sentence .'],
    ['These are grammatical sentences .']
]
orig = [annotator.parse(o) for o in orig_raw]
cor = [annotator.parse(c) for c in cor_raw]
refs = [[annotator.parse(r) for r in ref] for ref in refs_raw]
hyp_edits = [annotator.annotate(o, c) for o, c in zip(orig, cor)]
ref_edits = [[annotator.annotate(o, r) for o, r in zip(orig, ref)] for ref in refs]
output: errant.compare.ERRANTCompareOutput = errant.compare_from_edits(
    hyp_edits=hyp_edits,
    ref_edits=ref_edits,
    beta=0.5,
    cat=3,
    single=False,
    multi=False,
    mode='cs',
    filt=[]
)
```
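
The scores printed in the raw-text example can be checked by hand from the TP/FP/FN counts. The snippet below is a minimal sketch of that arithmetic, not the fork's implementation. It assumes the convention visible in the sample output (and used by upstream ERRANT) that precision defaults to 1.0 when there are no false positives and recall defaults to 1.0 when there are no false negatives. `f_beta_scores` is a hypothetical helper name and is not part of the `errant` API.

```python
from typing import Tuple


def f_beta_scores(tp: int, fp: int, fn: int, beta: float = 0.5) -> Tuple[float, float, float]:
    """Return (precision, recall, F_beta) from raw counts.

    Assumption: precision is 1.0 when fp == 0 and recall is 1.0 when fn == 0,
    matching the 'M:DET' row of the sample output above.
    """
    precision = tp / (tp + fp) if fp else 1.0
    recall = tp / (tp + fn) if fn else 1.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, f


# Overall counts from the sample output: TP=2, FP=0, FN=1
p, r, f = f_beta_scores(tp=2, fp=0, fn=1, beta=0.5)
print(round(p, 4), round(r, 4), round(f, 4))  # 1.0 0.6667 0.9091

# 'M:DET' counts from the sample output: TP=0, FP=0, FN=1
print(f_beta_scores(tp=0, fp=0, fn=1, beta=0.5))  # (1.0, 0.0, 0.0)
```

With `beta=0.5`, precision is weighted more heavily than recall, which is why a single missed edit (FN=1) lowers the overall F-score only to about 0.909.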

# ERRANT v3.0.0

This repository contains the grammatical ERRor ANnotation Toolkit (ERRANT) described in:
3 changes: 2 additions & 1 deletion errant/__init__.py
@@ -1,13 +1,14 @@
from importlib import import_module
import spacy
from errant.annotator import Annotator
from errant.compare import compare_from_raw, compare_from_edits

# ERRANT version
__version__ = '3.0.0'

# Load an ERRANT Annotator object for a given language
def load(lang, nlp=None):
    # Make sure the language is supported
    supported = {"en"}
    if lang not in supported:
        raise Exception(f"{lang} is an unsupported or unknown language")