
Proposal to Add Interface to Create and Retrieve CompositeKeys #180

@harish876

Description

Background

Currently, secondary indexing is offloaded to an external datastore by exporting all chain data to an external source. This introduces several moving parts, including a utility like Python Cache.

Although the chain is in-memory, every client issues a GetAllBlocks request, which retrieves the entire blockchain state. While modifying this structure could improve overall state retrieval, that is not our current focus.

Goal

Instead, our goal is to improve read latencies for queries based on secondary attribute lookups at the storage layer, which is the most common use case for applications. This applies to applications that use the storage engine as a document store, i.e., storing values as JSON objects, or as a simple key-value store. A few applications in this release that do so are:

  1. ResLens - Uses ResDB as a simple KV store
  2. ResCanvas - Uses ResDB as a document store
  3. Consensus - Uses ResDB as a document store

Problem Statement

Currently, if an application needs to perform a lookup based on a secondary attribute, the following steps must be taken.

  • Call the GetAllValues endpoint to fetch every record.
  • Manually apply filtering logic in memory at the application layer.
  • Alternatively, export all the data to an external datastore, keep it constantly in sync, and apply the filtering logic there.
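The O(n) client-side flow described above can be sketched as follows; the `Record` shape and the `FilterByOwner` helper are illustrative stand-ins, not the actual ResilientDB client API.

```cpp
#include <string>
#include <vector>

// Hypothetical record shape; field names are illustrative, not the real
// ResilientDB wire format.
struct Record {
  std::string key;
  std::string owner;  // a secondary attribute we want to query by
};

// Sketch of today's client-side flow: GetAllValues() pulls every record,
// then the application filters in memory -- an O(n) scan per query.
std::vector<Record> FilterByOwner(const std::vector<Record>& all_values,
                                  const std::string& owner) {
  std::vector<Record> matches;
  for (const auto& r : all_values) {
    if (r.owner == owner) matches.push_back(r);
  }
  return matches;
}
```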

Proposed Solution

Composite keys are a well-established way to add indexing support on non-primary attributes. They allow indexing by a single field or by multiple fields, and let indexes be tailored to different workloads. They are widely used for this purpose, for example in MySQL's MyRocks storage engine.

They are also used by the Hyperledger Fabric blockchain as a lightweight indexing mechanism atop a key-value store (see "Composite keys in Hyperledger").

We can leverage LevelDB or RocksDB's BlobDB; the latter reduces write amplification for large values, a common issue for applications that use ResilientDB as a document store.
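As a rough illustration of the idea (the actual key format is a design detail of this proposal), a composite key can be built by joining an index name and attribute values with a separator, so that all entries of one index are contiguous in the sorted keyspace of LevelDB/RocksDB. The separator below is an assumption, loosely following Hyperledger Fabric's convention.

```cpp
#include <string>
#include <vector>

// Assumed separator that sorts below printable characters, so keys sharing
// the same index name and attribute prefix stay contiguous. Not the real
// ResilientDB encoding.
constexpr char kSep = '\x01';

// Builds an index entry key such as index~attr1~attr2 (with kSep as "~").
std::string MakeCompositeKey(const std::string& index_name,
                             const std::vector<std::string>& attributes) {
  std::string key = index_name;
  for (const auto& attr : attributes) {
    key += kSep;
    key += attr;
  }
  return key;
}
```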

Technical Details

Interface Details

Design Details

Scope and Assumptions

  1. Transaction Management: Atomic APIs to create and update composite keys will be provided to the end user. The onus is on the user to call these APIs from the application layer or an SDK, which wraps the two operations in a transaction. Transaction management is therefore an application-level construct.

The storage layer does not guarantee index consistency if composite key creation is not wrapped in a transaction. Applications are expected to use SDK-level transaction helpers to ensure atomicity.

  2. Usage: Instead of providing an enterprise-style feature that creates an index over a whole dataset, users create on-demand indexes; how and when to create composite keys is entirely up to the user. These low-level constructs enable:
  • Gas fee application at the transaction level.
  • Index / composite key creation as a registered operation on the ledger.
  3. Update Queries: Currently, a SET query is a PUT, where a SET on an existing key updates the key-value pair with the new value. We provide an UpdateCompositeKey API, which replaces the old composite key with one built from the new value. This operation involves the following steps:
    a. Check for the existing composite key. If the key is absent, then return.
    b. If the old composite key exists, then delete the old key.
    c. Construct the composite key with the new value and insert it.
    d. Wrap steps b and c in a transaction, so that both operations constitute a single atomic operation.

Alternatives Considered

  1. We implemented a multi-tenant solution with custom modifications to the LSM engine to attach one global secondary index. Details can be found below:
  • Implementation
  • Presentation
  • This implementation was rejected because multi-tenancy is incompatible with the current blockchain design.
  2. Our current solution syncs the blockchain state to an external data source such as MongoDB and applies filtering queries there. While this design works, we observed significant maintenance overhead, and new application developers need help setting up two different databases and maintaining two different ways of querying a single state of data.

Failure Modes & Consistency

  1. What happens if a Put succeeds but CreateCompositeKey fails?
    Transaction management for all operation-specific APIs is pushed to the application, so each operation carries the overhead of one network call. An application-level construct such as an SDK is therefore responsible for ensuring that if the PUT fails, composite key creation is not performed, and that a failed CreateCompositeKey rolls the PUT back.
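The SDK-side responsibility described above can be sketched as a compensating-write helper; all names here are illustrative, and the real SDK would issue network calls rather than touch a local map.

```cpp
#include <functional>
#include <map>
#include <string>

// In-memory stand-in for the storage layer.
using KvStore = std::map<std::string, std::string>;

// If CreateCompositeKey fails (simulated by the callback returning false),
// the Put is compensated with a delete so the pair fails together.
bool PutWithIndex(KvStore& store, const std::string& key,
                  const std::string& value,
                  const std::string& composite_key,
                  const std::function<bool()>& create_composite_key_ok) {
  store[key] = value;                 // Put
  if (!create_composite_key_ok()) {   // CreateCompositeKey failed
    store.erase(key);                 // compensating delete
    return false;
  }
  store[composite_key] = key;         // index entry -> primary key
  return true;
}
```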

Performance Considerations

  1. Current Approach

    • Secondary lookups require a full O(n) scan across all records.
    • In practice, this also includes the cost of syncing data to MongoDB and issuing queries there, further increasing latency.
  2. Composite Key Approach

    • Lookup cost is O(log n) to position the iterator via prefix search.
    • Retrieval cost is O(M), where M is the number of matching results.
    • Overall complexity: O(log n + M), a significant improvement over O(n) for large datasets.
  3. Why does Prefix Search work in KV stores? LSM-based stores keep keys in sorted order, so all keys sharing a prefix are contiguous; an iterator can seek to the prefix once and then scan forward.
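A minimal sketch of the seek-then-scan pattern, using `std::map` as a stand-in for the LSM tree's sorted key order (all names are illustrative): `lower_bound` performs the O(log n) seek, and the loop collects the M contiguous matches.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the primary keys of all index entries whose key starts with
// `prefix`. std::map keeps keys sorted, mimicking the LSM keyspace.
std::vector<std::string> PrefixScan(
    const std::map<std::string, std::string>& store,
    const std::string& prefix) {
  std::vector<std::string> primary_keys;
  for (auto it = store.lower_bound(prefix);   // O(log n) seek to the prefix
       it != store.end() &&
       it->first.compare(0, prefix.size(), prefix) == 0;  // still matching
       ++it) {
    primary_keys.push_back(it->second);  // O(M) sequential retrieval
  }
  return primary_keys;
}
```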

Role of Bloom Filters in Prefix Searches

Bloom filters complement composite key lookups by reducing unnecessary I/O during prefix queries:

  1. File Pruning for Seeks

    • For a prefix search, the initial Seek(prefix) must locate the first matching key across multiple SST files.
    • Bloom filters allow LevelDB/RocksDB to skip SST files that cannot possibly contain keys with the given prefix, avoiding wasted binary searches.
  2. Faster Lookup Startup

    • The O(log n) seek is accelerated since only candidate files are probed.
    • This is especially valuable in large datasets with many SST files spread across levels.
  3. Reduced I/O Amplification

    • Queries with narrow prefixes benefit the most: Bloom filters prevent LevelDB / RocksDB from scanning irrelevant files.
    • Once the iterator is positioned, sequential retrieval of matches (O(M)) proceeds as normal and does not use Bloom filters.

Summary:
Bloom filters do not optimize the sequential scan phase of a prefix search, but they significantly speed up the initial seek by pruning SST files. This ensures that prefix lookups retain O(log n + M) complexity rather than approaching O(n) in practice.
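To make the pruning concrete, here is a toy Bloom filter over key prefixes. The size, hash scheme, and the idea of one filter per simulated SST file are illustrative only and do not reflect RocksDB's actual filter implementation; the key property shown is that a negative answer is definitive, so that file can be skipped during the seek.

```cpp
#include <bitset>
#include <functional>
#include <string>

// Toy per-file Bloom filter over key prefixes.
class PrefixBloom {
 public:
  void Add(const std::string& prefix) {
    bits_.set(Hash(prefix, 1) % kBits);
    bits_.set(Hash(prefix, 2) % kBits);
  }
  // false => the file definitely has no key with this prefix (skip it);
  // true  => "maybe", so the file must still be probed.
  bool MayContain(const std::string& prefix) const {
    return bits_.test(Hash(prefix, 1) % kBits) &&
           bits_.test(Hash(prefix, 2) % kBits);
  }

 private:
  static constexpr size_t kBits = 1024;
  // Two cheap derived hashes; quality is irrelevant for this sketch.
  static size_t Hash(const std::string& s, size_t seed) {
    return std::hash<std::string>{}(s) * (seed * 2654435761u + 1);
  }
  std::bitset<kBits> bits_;
};
```

Because Bloom filters never produce false negatives, skipping a file on a negative answer is always safe; false positives only cost an extra probe.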

Migration / Backwards Compatibility

  • Data Compatibility:
    Existing data remains untouched. Composite keys are stored as additional key-value entries, so no schema or data migration is required.

  • API Compatibility:
    Existing APIs (Put, Get, GetAllValues) continue to work as before. Composite key APIs (CreateCompositeKey, GetByCompositeKey, UpdateCompositeKey) are additive and optional.

  • Application Migration:
    Applications can adopt composite keys incrementally. They may continue to rely on client-side filtering or external databases, and gradually replace those with composite key lookups as needed.

  • Optional Adoption:
    No changes are required for current users. Only applications that explicitly invoke the new APIs will see the benefits.

Operational Concerns

  • As discussed, a transaction-level gas fee is an actively discussed feature requirement; users are currently not levied a fee on any existing or newly proposed operation. With our current design, however, gas fees can be applied later: the components of index management have been extracted into individual operations, so a fee can be charged per operation.

Preliminary Results

Environment Details

Running /home/ubuntu/.cache/bazel/_bazel_ubuntu/4daa26ab31ab249b7607bfe58eb14371/execroot/com_resdb_nexres/bazel-out/k8-fastbuild/bin/benchmark/storage/composite_key_benchmark
Run on (8 X 3600.53 MHz CPUs)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 1024 KiB (x4)
  L3 Unified 36608 KiB (x1)
Load Average: 2.07, 0.97, 0.48
***WARNING*** Library was built as DEBUG. Timings may be affected.
| Records | Filter Type | Current Solution | Composite Key Solution | Perceived Speedup |
|---------|-------------|------------------|------------------------|-------------------|
| 1,000   | String      | 43.7 ms          | 6.34 ms                | 6.9×              |
| 10,000  | String      | 472 ms           | 73.1 ms                | 6.5×              |
| 100,000 | String      | 5159 ms          | 745 ms                 | 6.9×              |
| 1,000   | Integer     | 43.3 ms          | 6.36 ms                | 6.8×              |
| 10,000  | Integer     | 471 ms           | 69.7 ms                | 6.8×              |
| 100,000 | Integer     | 5132 ms          | 732 ms                 | 7.0×              |

Advantages of our Solution

  • Improved Read Latency: Enables faster lookups by indexing secondary attributes directly in the storage layer.

  • Reduces Application Complexity: Eliminates the need for client-side filtering logic or full data scans.

  • No External Sync Required: Removes dependency on external databases and continuous export/sync pipelines.

  • Lightweight Implementation: Composite keys require minimal overhead and can be implemented without significant architectural changes.

  • Supports Richer Queries: Enables filtering and retrieval by a single field, multiple fields, or field combinations.

  • Built on Proven Techniques: Uses battle-tested patterns from systems like MyRocks (MySQL) and Hyperledger Fabric.

  • Document Store Friendly: Optimized for use cases where values are stored as JSON or large blobs, especially with RocksDB’s BlobDB.

  • Scalable Design: Can handle high write and read throughput while preserving query efficiency.

  • Expands Use Cases: Unlocks new classes of applications like dashboards, real-time analytics, and search-backed services.

  • Easy Adoption: Current applications do not need to change their code. All these features are add-ons and can be adopted without deleting, modifying, or migrating data.

Disadvantages of our Solution

  • Increased Write Amplification: Any secondary index needs extra space; this is the classic space-versus-time trade-off. We can limit the number of composite keys an application may create, and the composite keys themselves take up very little space: each entry stores only a reference to the primary key, behaving like a non-clustered index.

  • Increased App Complexity: We do not create composite keys automatically; we ask the developer to create them explicitly, which works just like inserting another key-value pair. The write thereby becomes non-atomic, but the application can handle this by wrapping the data insert and the index creation in a transaction.

Future Directions / Ideas

  • Currently, the in-memory blockchain state is maintained in a C++ unordered_map container. Using an in-memory database to manage blockchain state would allow more robust querying and maintenance of the in-memory chain state.
