update abbreviation handling #3
… assume the abbreviation is at the end of the string
if word not in not_acronyms:
if word in self.abbrDict.keys():
full = self.abbrDict[word]
splitWord = re.split(r'[-\d+]', word) # split word if word contains '-' or numbers
Should we perform this treatment before the abbreviation handling (in the text pre-processing steps)?
You are right, fixed.
if word in self.abbrDict.keys():
full = self.abbrDict[word]
splitWord = re.split(r'[-\d+]', word) # split word if word contains '-' or numbers
checkAbbr = word if len(splitWord) == 1 else splitWord[-1]
see above
fixed also.
"""

#
terms1 = [
You might want to add "before", "while", and "during".
I have added "before" and "during"; I am not sure about "while".
"year", "years"
]

# pattern = [
Are we keeping this commented portion for future development?
removed.
ent = Span(doc, startToken, endToken + 1, label="Temporal")
newEnts.append(ent)

## Following is used to add a custom attribute to indicate Temporal
same as previous comment
yes, I would keep these commented lines.
Pull Request Description
What issue does this change request address? (Use "#" before the issue to link it, e.g., #42.)
This PR updates abbreviation handling for words in which a number or '-' is combined with an abbreviation, assuming the abbreviation appears only at the end of the combination.
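For reviewers, the behavior described here can be sketched roughly as follows. This is a standalone illustration, not the code in the PR: the function name `expand_trailing_abbreviation` and the sample dictionary are hypothetical, and returning only the expansion of the trailing piece is an assumption, since the diff does not show what happens after `checkAbbr` is computed. The split pattern is copied from the diff. Note that inside a character class `+` is a literal, so the pattern splits on '-', any single digit, or '+'.

```python
import re

# Hypothetical stand-in for the project's abbreviation dictionary (self.abbrDict).
SAMPLE_ABBR_DICT = {"pmp": "pump", "vlv": "valve"}

# Same pattern as in the diff: splits on '-', a single digit, or a literal '+'.
SPLIT_PATTERN = re.compile(r'[-\d+]')


def expand_trailing_abbreviation(word, abbr_dict=SAMPLE_ABBR_DICT):
    """Return the expansion for `word`, or None if no abbreviation matches.

    Assumption: when the word contains '-' or digits, only the piece after
    the last separator is treated as a candidate abbreviation.
    """
    # Exact match first, as in the original lookup.
    if word in abbr_dict:
        return abbr_dict[word]
    pieces = SPLIT_PATTERN.split(word)
    # No separator found: the whole word is the candidate.
    candidate = word if len(pieces) == 1 else pieces[-1]
    return abbr_dict.get(candidate)


print(expand_trailing_abbreviation("1-pmp"))  # pump
print(expand_trailing_abbreviation("pmp-1"))  # None
```

Under this sketch, "pmp-1" is not expanded because the piece after the last separator is an empty string, which matches the stated assumption that the abbreviation sits at the end of the combination.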
What are the significant changes in functionality due to this change request?
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.