ghost commented Feb 1, 2017

So I saw an issue about TextBlob using a lemmatizer but not a stemmer.
Wrote a function for the latter.

sloria (Owner) left a comment

Thanks for the PR! This is a good feature. In addition to addressing my comments, can you also add tests for this change?

textblob/blob.py Outdated
#added 'stemmer' on lines of lemmatizer
#based on nltk
def stem(self, isporter=True):
#param isporter: True when using Porter stemmer, else Snowball

Please document this method with a docstring, using the Sphinx syntax (see other methods for examples).
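
For illustration, a Sphinx-style docstring for this method might look like the sketch below. The `Word` class here is a simplified stand-in rather than TextBlob's actual class, the method body is a placeholder that does no real stemming, and the docstring wording is an assumption.

```python
class Word(str):
    """Simplified stand-in for TextBlob's ``Word`` (hypothetical, for illustration)."""

    def stem(self, isporter=True):
        """Return the stem of this word.

        :param isporter: If ``True``, use the Porter stemmer; otherwise use
            the Snowball stemmer.
        :type isporter: bool
        :returns: The stemmed form of the word.
        :rtype: str
        """
        # Placeholder body: the real method would delegate to an NLTK stemmer.
        return str(self)
```

Sphinx reads the `:param:`, `:type:`, `:returns:` and `:rtype:` fields when building the API docs, so the inline `#param` comment becomes unnecessary.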

lemmatizer = nltk.stem.WordNetLemmatizer()
return lemmatizer.lemmatize(self.string, pos)

#added 'stemmer' on lines of lemmatizer

These comments are unnecessary; remove them.

textblob/blob.py Outdated

#added 'stemmer' on lines of lemmatizer
#based on nltk
def stem(self, isporter=True):

Are there other stemmers available in NLTK? If so, I'm not sure a boolean param will suffice here. Perhaps the user could pass an instance of a stemmer, e.g.

blob.stem(stemmer=PorterStemmer())
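
A minimal sketch of that interface, assuming any object with a `stem(word)` method counts as a stemmer. `SuffixStemmer` and this `Word` class are hypothetical stand-ins so the example runs without NLTK; with NLTK installed, an instance such as `PorterStemmer()` could presumably be passed the same way.

```python
class SuffixStemmer:
    """Toy stemmer (hypothetical): strips one fixed suffix. Not an NLTK class."""

    def __init__(self, suffix="s"):
        self.suffix = suffix

    def stem(self, word):
        # Strip the suffix only if the word ends with it and something remains.
        if self.suffix and word.endswith(self.suffix) and len(word) > len(self.suffix):
            return word[: -len(self.suffix)]
        return word


class Word(str):
    """Simplified stand-in for TextBlob's ``Word`` class."""

    def stem(self, stemmer=None):
        """Stem the word with the given stemmer instance.

        :param stemmer: Any object exposing ``stem(word)``. Defaults to the toy
            ``SuffixStemmer`` here; a real version might default to Porter.
        """
        if stemmer is None:
            stemmer = SuffixStemmer()
        return stemmer.stem(str(self))


print(Word("cats").stem())                         # prints "cat"
print(Word("walking").stem(SuffixStemmer("ing")))  # prints "walk"
```

Taking an instance instead of a boolean means a new stemmer can be supported without changing the method signature.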

sloria (Owner) left a comment

Looking much better! Just need to fix the test.

Also, there are a lot of unnecessary commits here--would you mind squashing your commits down? If you don't know how or don't have the time, don't worry too much about it.

w = tb.Word("wolves")
assert_equal(w.stem(), "wolv")
w = tb.Word("went")
assert_equal(w.stem("v"), "went")

This assertion is failing because of the invalid "v" argument.

ghost (Author) commented Feb 9, 2017 via email

sloria (Owner) commented Feb 9, 2017

The tokenization changes are unrelated to this PR; please remove them.

sloria (Owner) commented Feb 13, 2017

The added test is failing; can you please look into it?

ghost (Author) commented Feb 13, 2017

@sloria I have fixed the errors in the stemmer; the rest seem to be errors in TextBlob itself. Please look into it.

sloria (Owner) commented Feb 14, 2017

Yes, the errors are unrelated to this PR. Thanks for making those changes. This looks good to go.

sloria merged commit ffa9c6d into sloria:dev on Feb 14, 2017