- 
                Notifications
    You must be signed in to change notification settings 
- Fork 107
Open
Labels
featureNew feature or requestNew feature or request
Description
Is your feature request related to a problem? Please describe.
lmdb is a memory-mapped database format that provides random access to individual training events faster than SQLite.
The main benefits of lmdb: Faster query speeds, lower storage footprint, serialized data allows for an elegant implementation of storing arbitrary DataRepresentations (see #781).
 
 
For larger datasets, and particularly large events, SQLite can become prohibitive as a dataset format.
Describe the solution you'd like
- A LMDBWriterthat outputs events inlmdbformat. Should accept a serialization method (msgpack, for example). Should optionally accept a DataRepresentation - if given, representations are pre-computed and serialized usingdillor similar pickle methods (see Graph construction before training #781). A field in the database should contain relevant information regarding the serializer, such that the file is a self-contained object that users can read without prior knowledge of the serializer used.
- A LMDBDatasetthat is compatible with the lmdb database format. Should automatically check for which serializer was used, so the user doesn't have to guess. Should be able to retrieve pre-computed data representations.
Metadata
Metadata
Assignees
Labels
featureNew feature or requestNew feature or request