@reorx reorx commented Feb 2, 2013

Python 2 supports both `__str__` and `__unicode__` for converting objects to strings;
the convention is that `__str__` returns `<type 'str'>` and `__unicode__` returns `<type 'unicode'>`.

That is my understanding of the difference between the two methods.
So I think it would be better to return `str` explicitly from `__str__`,
and add `__unicode__` for callers that want unicode.

I ran into a problem when formatting strings with ObjectId:

```python
>>> '%s %s' % ('\xe5\x90\x8d\xe5\xad\x97', ObjectId())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)
```

This happens because Python found a unicode value among the format arguments (the string form of the ObjectId), so it promoted the whole result to unicode and implicitly decoded every byte-string argument with the default ASCII codec. The first argument is already a UTF-8 encoded byte string, so that implicit decode fails.
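
The implicit step can be made explicit. The following is a minimal Python 3 sketch, not code from the pull request: `implicit_ascii_decode` is a hypothetical helper that mimics what Python 2 does on its own when a byte string meets a unicode value during `%` formatting.

```python
# Sketch only: reproduces in Python 3 the decode that Python 2 performs
# implicitly when a byte string is combined with a unicode value.
utf8_bytes = b'\xe5\x90\x8d\xe5\xad\x97'  # the UTF-8 bytes from the example


def implicit_ascii_decode(raw):
    """Hypothetical helper: decode with ASCII, Python 2's default codec."""
    return raw.decode('ascii')


try:
    implicit_ascii_decode(utf8_bytes)
except UnicodeDecodeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)

# Decoding with the codec the bytes were actually written in succeeds.
decoded = utf8_bytes.decode('utf-8')
print(decoded)
```

Byte 0xe5 is outside the ASCII range, so the ASCII decode fails at position 0, which matches the traceback above.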
@behackett
Member

Hi. Thanks for sending this patch. Unfortunately it won't work in Python 3:

```python
Python 3.3.0 (default, Nov 24 2012, 22:25:56)
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from bson.objectid import ObjectId
>>> oid = ObjectId()
>>> oid
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./bson/objectid.py", line 253, in __repr__
    return "ObjectId('%s')" % (str(self),)
TypeError: __str__ returned non-string (type bytes)
```
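
The same failure can be reproduced without bson. This is a small Python 3 sketch under the assumption, suggested by the traceback, that the patched `__str__` returned bytes; `BytesStr` is a hypothetical class used only for illustration.

```python
class BytesStr(object):
    """Hypothetical class whose __str__ returns bytes instead of str."""

    def __str__(self):
        return b'abc'


try:
    str(BytesStr())
except TypeError as exc:
    # Python 3 requires __str__ to return a str instance.
    message = str(exc)
else:
    message = None

print(message)
```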

I think a workable solution is to not add `__unicode__` and instead do this:

```python
def __str__(self):
    if PY3:
        return binascii.hexlify(self.__id).decode()
    return binascii.hexlify(self.__id)
```
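
For anyone who wants to try the idea in isolation, here is a self-contained sketch. `HexId` is a hypothetical stand-in for ObjectId, and the `PY3` flag is assumed to be derived from `sys.version_info`; neither is taken from the bson source.

```python
import binascii
import sys

# Assumption: PY3 is a simple version check.
PY3 = sys.version_info[0] >= 3


class HexId(object):
    """Hypothetical stand-in for ObjectId that stores raw id bytes."""

    def __init__(self, raw):
        self.__id = raw

    def __str__(self):
        # hexlify returns bytes; Python 3 needs them decoded to str,
        # while on Python 2 the byte string already is the native str.
        if PY3:
            return binascii.hexlify(self.__id).decode()
        return binascii.hexlify(self.__id)

    def __repr__(self):
        return "HexId('%s')" % (str(self),)


oid = HexId(b'\x50\x0d\x1f')
print(str(oid))
print(repr(oid))
```

Either way `__str__` returns the interpreter's native `str` type, so formatting with byte strings on Python 2 no longer promotes the result to unicode.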

@reorx
Author

reorx commented Feb 3, 2013

@behackett Hmm, you are right, I forgot to consider Python 3. Should I close the current pull request and send a new one, or will you finish the change yourselves?

@behackett
Member

You can just do another commit to fix it and update the pull request. Thanks again for working on this.

@behackett
Member

I fixed this in the following commit. Thanks for reporting the issue.

c8f6d4a

@behackett behackett closed this Feb 9, 2013