Skip to content

Conversation

@jonmcoe
Copy link
Contributor

@jonmcoe jonmcoe commented Oct 18, 2015

Few notes:

1) Two tests failed for me both before, and after, my changes

This is probably something with my setup, but not sure what. I am using I am using python 2.7.10 and followed the instructions in https://github.com/sloria/TextBlob/blob/dev/CONTRIBUTING.rst with the only addition being the download_corpora. I don't think I experienced these failures when working on #96 , but I am getting them now even when I checkout far into the past (8904336 for example) so that's why I feel confident it is something in my setup. Any ideas?

======================================================================
FAIL: test_correct (tests.test_blob.SentenceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jon/TextBlob/tests/test_blob.py", line 217, in test_correct
    assert_equal(blob.correct(), tb.Sentence("I have bad spelling."))
AssertionError: Sentence("") != Sentence("I have bad spelling.")

======================================================================
FAIL: test_correct (tests.test_blob.TextBlobTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jon/TextBlob/tests/test_blob.py", line 778, in test_correct
    assert_equal(blob.correct(), tb.TextBlob("I have bad spelling."))
AssertionError: TextBlob("") != TextBlob("I have bad spelling.")

----------------------------------------------------------------------
Ran 222 tests in 43.071s

FAILED (failures=2)

2) Why aren't plural_categories['uninflected']/['uncountable'] and singular_uninflected/uncountable the exact same lists?

I looked into the differences, the only difference is that there are two singular_uninflected not in its counterpart. I'm guessing this is an artifact of http://www.bermi.org/inflector/ vs http://www.csse.monash.edu.au/~damian/papers/HTML/Plurals.html

>>> set(plural_categories['uninflected']) - set(singular_uninflected)
set([])
>>> set(singular_uninflected) - set(plural_categories['uninflected']) 
set(['swiss', 'georgia'])
>>> set(plural_categories['uncountable']) - set(singular_uncountable)
set([])
>>> set(singular_uncountable) - set(plural_categories['uncountable'])
set([])

3) high-jinks pluralizes wrong

>>> pluralize('high-jinks')
'high-jinkss'

I'm guessing this is due to the hyphen, and something about the treatment of rules in the for loop around inflect.py:277. I didn't really dig yet, and it seemed a little bit heavy of an issue to just lump into this pull request anyway.


Anyway, love your library. Been happily using the classifiers and sentiment analysis for a little while now. Just found these few issues with pluralization/singularization and am happy to contribute.

@jonmcoe
Copy link
Contributor Author

jonmcoe commented Oct 18, 2015

so those tests also failed on travis, and for me when I tried on master. eager to hear your thoughts

@jonmcoe
Copy link
Contributor Author

jonmcoe commented Oct 18, 2015

i figured out the unittest failures. they are due to nltk version 3.1. runs fine when i downgrade 3.0.5.
https://github.com/nltk/nltk/blob/develop/ChangeLog

sorry to dump all of this here. can write up an issue if you'd like me to. don't know enough about about nltk and the spelling functions to fix it, but i could set it up to pin to nltk==3.0.5 if you think that's the best move

.gitignore Outdated
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm -0 on adding things like this to project gitignores. This should really be in a global gitignore. See https://help.github.com/articles/ignoring-files/#create-a-global-gitignore

@sloria
Copy link
Owner

sloria commented Oct 18, 2015

Thanks @jonmcoe for the patch!

I'll try to look more closely at the test failure within the next day or two. I would rather not pin to older versions of NLTK.

Re: your questions about inflection--TextBlob uses pattern's implementation from https://github.com/clips/pattern/blob/master/pattern/text/en/inflect.py with just a few modifications to support Python 3. I recommend directing your questions on that repo, and possibly sending this patch there if you're feeling generous. =).

Just a few changes to make above. Once those and the failing tests are addressed, this should be good to merge.

@jonmcoe
Copy link
Contributor Author

jonmcoe commented Oct 18, 2015

sounds good! thanks for explaining

@sloria
Copy link
Owner

sloria commented Oct 31, 2015

The failing tests have been addressed on dev. This is good to merge. Thanks again @jonmcoe!

sloria added a commit that referenced this pull request Oct 31, 2015
@sloria sloria merged commit 4923458 into sloria:dev Oct 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants