Stop words removal #5
Comments
Hello Gerald, "go-porterstemmer" does NOT remove stop words. It only implements the Porter stemming algorithm in Go. But you could code it yourself pretty easily. First, you want a list of all the English stop words. There are various lists of them on the Internet. Here are some lists:
Then just extract that list of stop words and get it into your Go code. Before you send a word to "go-porterstemmer", check whether it is in your list of stop words, and skip it if it is.
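A minimal sketch of that approach, under some assumptions: the stop-word set below is a tiny hand-picked sample rather than one of the published lists, and `isStopWord`, `filterAndStem`, and the `stem` parameter are hypothetical names introduced here for illustration. The stemmer is passed in as a plain `func(string) string` so the example runs on the standard library alone; in real use you would pass the library's stemming function instead.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a tiny illustrative sample, NOT a complete English stop-word list.
// A set (map with empty-struct values) gives constant-time lookups.
var stopWords = map[string]struct{}{
	"a": {}, "an": {}, "and": {}, "is": {}, "of": {}, "the": {}, "to": {},
}

// isStopWord reports whether word is in the sample list, ignoring case.
func isStopWord(word string) bool {
	_, ok := stopWords[strings.ToLower(word)]
	return ok
}

// filterAndStem drops stop words and passes every remaining word to stem.
// stem is a placeholder for the real stemmer call (an assumption of this sketch).
func filterAndStem(words []string, stem func(string) string) []string {
	out := make([]string, 0, len(words))
	for _, w := range words {
		if isStopWord(w) {
			continue // skip stop words before they ever reach the stemmer
		}
		out = append(out, stem(w))
	}
	return out
}

func main() {
	words := strings.Fields("The quick fox is running to the river")
	// strings.ToLower stands in for the stemmer so the sketch has no dependencies.
	fmt.Println(filterAndStem(words, strings.ToLower))
}
```

Filtering before stemming matters here: a stop-word list holds unstemmed words, so checking the stemmed form against it could miss matches.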
Hello Charles, thanks for your reply. Already implemented :) Does the Porter stemmer handle Unicode, e.g. don\u2019t? And does it handle negation markers like don’t, doesn’t, won’t, can’t? Thanks,
2014-08-02 7:02 GMT+02:00 Charles:
Hi,
I am currently testing it with Go 1.3.
Does go-porterstemmer also remove stop words, or can you suggest a library that does?
Gerald