Skip to content

Mongoengine is very slow on large documents compared to native pymongo usage #1230

Open
@baruchoxman

Description

@baruchoxman

(See also this StackOverflow question)

I have the following mongoengine model:

class MyModel(Document):
    date = DateTimeField(required = True)
    data_dict_1 = DictField(required = False)
    data_dict_2 = DictField(required = True)

In some cases the document in the DB can be very large (around 5-10MB), and the data_dict fields contain complex nested documents (dict of lists of dicts, etc...).

I have encountered two (possibly related) issues:

  1. When I run native pymongo find_one() query, it returns within a second. When I run MyModel.objects.first() it takes 5-10 seconds.
  2. When I query a single large document from the DB, and then access its field, it takes 10-20 seconds just to do the following:
    m = MyModel.objects.first()
    val = m.data_dict_1.get(some_key)

The data in the object does not contain any references to any other objects, so it is not an issue of objects dereferencing.
I suspect it is related to some inefficiency of the internal data representation of mongoengine, which affects the document object construction as well as fields access.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions