Consider adding extractFromDocument() to TurkishSentenceExtractor #102
Comments
Maybe providing only this would be enough for now?
OK, like this?
In the same class or in a utility class.
So the usage will be like:
or directly
So I am not sure now. Maybe your initial suggestion was not bad: adding a new method like extractFromDocument that documents that it treats line breaks as paragraph endings. Your call bro.
Thanks. This is not a pressing issue anyway. I will make one of these and see if it sticks.
There is a possibility of using objects like Document, Paragraph etc., but that is another issue. So your initial suggestion is fine, I think. Maybe having separate method names like extractWords, extractSentences and extractParagraphs could be better than overloading.
Agreed, soon there will be a need for such structures. But when they come, overloading those methods may suffice. I will go with my initial suggestion then.
…ractor #102 Remove SentenceExtractor interface, make Span class public (perhaps later move it to core). Add some documentation.
Currently there are two methods
Those methods do not treat line breaks as sentence boundaries.
A common use is to extract sentences from a complete document represented as a single String that contains line breaks. So this method would do the following:
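A rough sketch of that idea, under assumptions: split the document on line breaks, treat each non-empty line as a paragraph, and run a paragraph-level extractor on each one. The class name `DocumentSentenceSketch` and the helper `naiveFromParagraph` are hypothetical; the helper is only a punctuation-based placeholder standing in for the real extractor, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the library itself.
class DocumentSentenceSketch {

    // Placeholder paragraph-level extractor (an assumption for illustration):
    // splits after '.', '!' or '?' when followed by whitespace.
    static List<String> naiveFromParagraph(String paragraph) {
        List<String> sentences = new ArrayList<>();
        for (String piece : paragraph.split("(?<=[.!?])\\s+")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    // Treats every line break as a paragraph ending, then delegates each
    // non-empty paragraph to the supplied paragraph-level extractor.
    static List<String> extractFromDocument(
            String document, Function<String, List<String>> fromParagraph) {
        List<String> result = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, \r, ...), Java 8+.
        for (String line : document.split("\\R+")) {
            String paragraph = line.trim();
            if (paragraph.isEmpty()) {
                continue; // skip blank or whitespace-only lines
            }
            result.addAll(fromParagraph.apply(paragraph));
        }
        return result;
    }

    public static void main(String[] args) {
        String doc = "First sentence. Second one?\n"
                + "A title without punctuation\n\n"
                + "Last line.";
        for (String s : extractFromDocument(doc, DocumentSentenceSketch::naiveFromParagraph)) {
            System.out.println(s);
        }
    }
}
```

Passing the paragraph-level extractor in as a function keeps the line-break handling separate from the sentence boundary logic, so a line with no closing punctuation (a title, for example) still ends up as its own sentence.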
Also, the other method name can be changed to extractFromParagraph() if this is added.
@mdakin wdyt?