This project implements a client-server architecture for an LSM-Tree (Log-Structured Merge-Tree).
-
include/
: Header filesbloom_filter.h
: Bloom filter implementation for efficient lookupsclient.h
: Client class definitionconstants.h
: System-wide constants and DSL definitionsfence_pointers.h
: Fence pointers for run indexinglsm_adapter.h
: LSM-Tree adapter interfacelsm_tree.h
: Core LSM-Tree implementationrun.h
: Run management and operationsserver.h
: Server class definitionskip_list.h
: Skip list implementation for memory bufferthread_pool.h
: Thread pool implementation
-
src/
: Source filesbloom_filter.cpp
: Bloom filter implementationclient.cpp
: Client implementationdata_generator.cpp
: Test data generation utilitiesdata_generator_256mb.cpp
: Large dataset generatorfence_pointers.cpp
: Fence pointer implementationlsm_adapter.cpp
: LSM-Tree adapter implementationlsm_tree.cpp
: Core LSM-Tree functionalitymain_client.cpp
: Client entry pointmain_server.cpp
: Server entry pointrun.cpp
: Run operations implementationserver.cpp
: Server implementation with command processingskip_list.cpp
: Skip list implementationthread_pool.cpp
: Thread pool implementationalmost_full_buffer_generator.cpp
: Buffer testing utilitygenerate_test_data.cpp
: Test data generation tools
To build all components of the project, run:
make
This will create the following executables in the bin/
directory:
server
: The LSM-Tree serverclient
: A client to interact with the servergenerate_test_data
: Utility for generating test data with different sizes and distributionsdata_generator
: Utility for generating 10GB test datadata_generator_256mb
: Utility for generating 256MB test dataalmost_full_buffer_generator
: Utility for testing buffer management
To start the server with the default port (9090):
make run-server
Or specify a custom port:
./bin/server 9091
To run a client and connect to a local server:
make run-client
Or specify a custom host and port:
./bin/client 192.168.1.100 9091
The project includes several utilities for generating test data:
- Generate test data with specific size and distribution:
make generate-data
- Generate a comprehensive test dataset:
make generate-test-data-all
This creates multiple files with different sizes (100MB, 256MB, 512MB, 1024MB) and distributions (uniform, skewed).
- Generate large datasets:
make generate-10gb # Generates 10GB test data
make generate-256mb # Generates 256MB test data
- Generate data for buffer testing:
make generate-almost-full
Run performance tests across all dimensions:
make performance-test
Run performance test for a specific dimension:
make performance-test-dim DIMENSION=data_size
The LSM-Tree supports the following commands:
Insert or update a key-value pair in the tree.
p [key] [value]
Example: p 10 7
– Stores key 10 with value 7
Retrieve a value for a given key.
g [key]
Example: g 10
– Gets the value associated with key 10
Retrieve all key-value pairs within a range (inclusive start, exclusive end).
r [start] [end]
Example: r 10 20
– Gets all key-value pairs with keys from 10 to 19
Remove a key-value pair from the tree.
d [key]
Example: d 10
– Deletes the entry with key 10
Load key-value pairs from a binary file.
l "[filepath]"
Example: l "/path/to/file.bin"
– Loads pairs from the specified file
Print statistics about the current state of the tree.
s
Display help information about available commands.
h
Disconnect from the server.
q
-
Server: Listens on a specified port for client connections
- Supports up to 64 concurrent clients
- Uses a thread pool to process client commands in parallel
- Command processing with strict validation
-
Client: Connects to the server and provides a command interface
- DSL for interacting with the LSM-Tree
- Error handling and command validation
- Command-response pattern
-
Thread Pool: Enables parallel processing of client commands
- Worker threads take tasks from a queue
- Asynchronous task completion with futures