Commit a73d3cb

PYTHON-841 FAQ entry for key order and subdocument matching.
1 parent 5d8194d commit a73d3cb

1 file changed

+104
-0
lines changed

doc/faq.rst

Lines changed: 104 additions & 0 deletions
@@ -98,6 +98,110 @@ For `Twisted <http://twistedmatrix.com/>`_, see `TxMongo
<http://github.com/fiorix/mongo-async-python-driver>`_. Compared to PyMongo,
TxMongo is less stable, lacks features, and is less actively maintained.

Key order in subdocuments -- why does my query work in the shell but not PyMongo?
---------------------------------------------------------------------------------

.. testsetup:: key-order

  from bson.son import SON
  from pymongo.mongo_client import MongoClient

  collection = MongoClient().test.collection
  collection.drop()
  collection.insert({'_id': 1.0,
                     'subdocument': SON([('b', 1.0), ('a', 1.0)])})

The key-value pairs in a BSON document can have any order (except that ``_id``
is always first). The mongo shell preserves key order when reading and writing
data. Observe that "b" comes before "a" when we create the document and when it
is displayed:

.. code-block:: javascript

  > // mongo shell.
  > db.collection.insert( { "_id" : 1, "subdocument" : { "b" : 1, "a" : 1 } } )
  WriteResult({ "nInserted" : 1 })
  > db.collection.find()
  { "_id" : 1, "subdocument" : { "b" : 1, "a" : 1 } }

PyMongo represents BSON documents as Python dicts by default, and the order
of keys in dicts is not defined. That is, a dict declared with the "a" key
first is the same, to Python, as one with "b" first:

.. doctest:: key-order

  >>> print {'a': 1.0, 'b': 1.0}
  {'a': 1.0, 'b': 1.0}
  >>> print {'b': 1.0, 'a': 1.0}
  {'a': 1.0, 'b': 1.0}

Therefore, Python dicts are not guaranteed to show keys in the order they are
stored in BSON. Here, "a" is shown before "b":

.. doctest:: key-order

  >>> print collection.find_one()
  {u'_id': 1.0, u'subdocument': {u'a': 1.0, u'b': 1.0}}

To preserve order when reading BSON, use the :class:`~bson.son.SON` class,
which is a dict that remembers its key order. Now, documents and subdocuments
in query results are represented with :class:`~bson.son.SON` objects:

.. doctest:: key-order

  >>> from bson.son import SON
  >>> print collection.find_one(as_class=SON)
  SON([(u'_id', 1.0), (u'subdocument', SON([(u'b', 1.0), (u'a', 1.0)]))])

The subdocument's actual storage layout is now visible: "b" is before "a".

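To make the storage layout concrete, the sketch below hand-encodes a subdocument of double values with the standard ``struct`` module, following the element layout given in the BSON specification (a length prefix, one type byte and NUL-terminated key per element, and a trailing NUL). The ``encode_doubles`` helper is hypothetical and handles only string keys with float values; it is an illustration, not how PyMongo encodes documents.

```python
import struct


def encode_doubles(pairs):
    """Encode (key, float) pairs as a BSON document, in the order given.

    Hypothetical helper for illustration only: it supports just the
    BSON double type (type byte 0x01).
    """
    body = b"".join(
        b"\x01" + key.encode("utf-8") + b"\x00" + struct.pack("<d", value)
        for key, value in pairs
    )
    # The int32 length prefix counts itself (4 bytes) and the final NUL (1).
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


b_first = encode_doubles([("b", 1.0), ("a", 1.0)])
a_first = encode_doubles([("a", 1.0), ("b", 1.0)])

print(len(b_first) == len(a_first))  # True: same pairs, same size
print(b_first == a_first)            # False: the elements are in a different order
```

Both documents hold the same pairs, yet their bytes differ, which is how the server can tell a "b"-first subdocument from an "a"-first one.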
Because a dict's key order is not defined, you cannot predict how it will be
serialized **to** BSON. But MongoDB considers subdocuments equal only if their
keys have the same order. So if you use a dict to query on a subdocument it may
not match:

.. doctest:: key-order

  >>> collection.find_one({'subdocument': {'a': 1.0, 'b': 1.0}}) is None
  True

Swapping the key order in your query makes no difference:

.. doctest:: key-order

  >>> collection.find_one({'subdocument': {'b': 1.0, 'a': 1.0}}) is None
  True

... because, as we saw above, Python considers the two dicts the same.

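The difference between order-insensitive and order-sensitive comparison can be tried without a server. This sketch uses ``collections.OrderedDict`` from the standard library purely as a stand-in for an order-aware mapping; that substitution is an assumption for illustration, and nothing here shows how :class:`~bson.son.SON` itself compares. Note also that on Python 3.7 and later a plain dict remembers insertion order, but ``==`` between dicts still ignores it.

```python
from collections import OrderedDict

# Plain dicts: equality ignores the order in which keys were written.
plain_ab = {'a': 1.0, 'b': 1.0}
plain_ba = {'b': 1.0, 'a': 1.0}

# OrderedDicts (stand-in for an order-aware mapping): order is part of equality.
ordered_ab = OrderedDict([('a', 1.0), ('b', 1.0)])
ordered_ba = OrderedDict([('b', 1.0), ('a', 1.0)])

print(plain_ab == plain_ba)      # True
print(ordered_ab == ordered_ba)  # False
print(ordered_ab == plain_ba)    # True: comparing against a plain dict ignores order
```

MongoDB's whole-subdocument comparison behaves like the second case, while a query built from plain dicts behaves like the first.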
There are two solutions. First, you can match the subdocument field-by-field:

.. doctest:: key-order

  >>> collection.find_one({'subdocument.a': 1.0,
  ...                      'subdocument.b': 1.0})
  {u'_id': 1.0, u'subdocument': {u'a': 1.0, u'b': 1.0}}

The query matches any subdocument with an "a" of 1.0 and a "b" of 1.0,
regardless of the order you specify them in Python or the order they are stored
in BSON. Additionally, this query now matches subdocuments with additional
keys besides "a" and "b", whereas the previous query required an exact match.

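The two matching behaviours described above can be modelled in a few lines. This is a toy model under stated assumptions, not MongoDB's matcher: subdocuments are represented as lists of ``(key, value)`` pairs so that their order is explicit, and ``matches_exact`` and ``matches_fields`` are hypothetical names. Real dotted-path queries also handle arrays and nested paths, which this sketch ignores.

```python
def matches_exact(stored, query):
    """Whole-subdocument match: same keys, same values, same order."""
    return list(stored) == list(query)


def matches_fields(stored, conditions):
    """Field-by-field match: each named key must hold the given value."""
    lookup = dict(stored)
    return all(
        key in lookup and lookup[key] == value
        for key, value in conditions.items()
    )


stored = [('b', 1.0), ('a', 1.0)]
stored_extra = stored + [('c', 2.0)]

print(matches_exact(stored, [('a', 1.0), ('b', 1.0)]))        # False: order differs
print(matches_exact(stored, [('b', 1.0), ('a', 1.0)]))        # True
print(matches_fields(stored, {'a': 1.0, 'b': 1.0}))           # True: order is irrelevant
print(matches_fields(stored_extra, {'a': 1.0, 'b': 1.0}))     # True: extra keys allowed
print(matches_exact(stored_extra, [('b', 1.0), ('a', 1.0)]))  # False: extra key present
```

The trade-off is the one stated in the text: field-by-field matching is insensitive to order but also looser, since it accepts subdocuments with extra keys.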
The second solution is to use a :class:`~bson.son.SON` to specify the key order:

.. doctest:: key-order

  >>> query = {'subdocument': SON([('b', 1.0), ('a', 1.0)])}
  >>> collection.find_one(query)
  {u'_id': 1.0, u'subdocument': {u'a': 1.0, u'b': 1.0}}

The key order you use when you create a :class:`~bson.son.SON` is preserved
when it is serialized to BSON and used as a query. Thus you can create a
subdocument that exactly matches the subdocument in the collection.

.. seealso:: `MongoDB Manual entry on subdocument matching
   <http://docs.mongodb.org/manual/tutorial/query-documents/#embedded-documents>`_.

What does *CursorNotFound* cursor id not valid at server mean?
--------------------------------------------------------------
Cursors in MongoDB can timeout on the server if they've been open for
