diff --git a/.github/workflows/doc.yml b/.github/workflows/doc.yml index fe396ac1..3a5b857f 100644 --- a/.github/workflows/doc.yml +++ b/.github/workflows/doc.yml @@ -21,6 +21,7 @@ jobs: cd guide curl -L https://github.com/rust-lang/mdBook/releases/download/v0.4.28/mdbook-v0.4.28-x86_64-unknown-linux-gnu.tar.gz | tar xzf - echo $PWD >> $GITHUB_PATH + cargo install mdbook-toc ./mdbook build - name: Deploy uses: JamesIves/github-pages-deploy-action@v4 diff --git a/.gitignore b/.gitignore index 2815f2b3..a2ac2318 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,8 @@ -/target +target /.idea Cargo.lock db_path -bindings/python/target -guide/book \ No newline at end of file +guide/book +__pycache__ + +.DS_Store diff --git a/guide/book.toml b/guide/book.toml index 98486e8f..947c6617 100644 --- a/guide/book.toml +++ b/guide/book.toml @@ -7,5 +7,10 @@ title = "The Tonbo Guide" [output.html] git-repository-url = "https://github.com/tonbo-io/tonbo" +mathjax-support = true [output.html.playground] runnable = false + +[preprocessor.toc] +command = "mdbook-toc" +renderer = ["html"] diff --git a/guide/src/SUMMARY.md b/guide/src/SUMMARY.md index 90921bcf..eff925db 100644 --- a/guide/src/SUMMARY.md +++ b/guide/src/SUMMARY.md @@ -2,10 +2,22 @@ [Introduction](./introduction.md) -- [Getting started](./start.md) +- [Getting Started](./start.md) +- [Usage]() + - [Tonbo](./usage/tonbo.md) + - [Python Binding](./usage/python.md) + - [JavaScript Binding](./usage/wasm.md) + - [Configuration](./usage/conf.md) + - [Advance](./usage/advance.md) + - [FAQ](./usage/faq.md) - [Examples](./examples/index.md) - [Using Tonbo](./examples/declare.md) - [Integrate with Datafusio](./examples/datafusion.md) - [Using under Wasm](./examples/wasm.md) -- [Contribution](./contribution/index.md) +- [Contribution]() - [Building](./contribution/build.md) + - [Submitting PR](./contribution/pr.md) +- [TonboLite](./tonbolite/index.md) + - [Getting Started](./tonbolite/start.md) + - [Building and Testing](./tonbolite/build.md) + - [Usage](./tonbolite/usage.md) diff --git a/guide/src/contribution/build.md b/guide/src/contribution/build.md index 6fef9612..9aacf199 100644 --- a/guide/src/contribution/build.md +++ b/guide/src/contribution/build.md @@ -1,3 +1,93 @@ -# Building Tonbo +# Building and Testing + -TODO +To get started using tonbo you should make sure you have [Rust](https://www.rust-lang.org/tools/install) installed on your system. If you haven't alreadly done yet, try following the instructions [here](https://www.rust-lang.org/tools/install). + +## Building and Testing for Rust + +### Building and Testing with Non-WASM + +To use local disk as storage backend, you should import [tokio](https://github.com/tokio-rs/tokio) crate and enable "tokio" feature (enabled by default) + +```bash +cargo build +``` + +If you build Tonbo successfully, you can run the tests with: + +```bash +cargo test +``` + +### Building and Testing with WASM + +If you want to build tonbo under wasm, you should add wasm32-unknown-unknown target first. + +```bash +# add wasm32-unknown-unknown target +rustup target add wasm32-unknown-unknown +# build under wasm +cargo build --target wasm32-unknown-unknown --no-default-features --features wasm +``` + +Before running the tests, make sure you have installed [wasm-pack](https://github.com/rustwasm/wasm-pack) and run `wasm-pack build` to build the wasm module. 
If you build successfully, you can run the tests with: + +```bash +wasm-pack test --chrome --headless --test wasm --no-default-features --features aws,bytes,opfs +``` + + +## Building and Testing for Python + +### Building +We use the [pyo3](https://github.com/PyO3/pyo3) to generate a native Python module and use [maturin](https://github.com/PyO3/maturin) to build Rust-based Python packages. + +First, follow the commands below to build a new Python virtualenv, and install maturin into the virtualenv using Python's package manager, pip: + +```bash +# setup virtualenv +python -m venv .env +# activate venv +source .env/bin/activate + +# install maturin +pip install maturin +# build bindings +maturin develop + +``` + +Whenever Rust code changes run: + +```bash +maturin develop +``` + +### Testing + +If you want to run tests, you need to build with "test" options: + +```base +maturin develop -E test +``` + +After building successfully, you can run the tests with: + +```bash +# run tests except benchmarks(This need duckdb to be installed) +pytest --ignore=tests/bench -v . + +# run all tests +pip install duckdb +python -m pytest +``` + +## Building and Testing for JavaScript +To build tonbo for JavaScript, you should install [wasm-pack](https://github.com/rustwasm/wasm-pack). If you haven't already done so, try following the instructions [here](https://rustwasm.github.io/wasm-pack/installer/). + +```bash +# add wasm32-unknown-unknown target +rustup target add wasm32-unknown-unknown +# build under wasm +wasm-pack build --target web +``` diff --git a/guide/src/contribution/pr.md b/guide/src/contribution/pr.md new file mode 100644 index 00000000..41e66041 --- /dev/null +++ b/guide/src/contribution/pr.md @@ -0,0 +1,36 @@ +# Submitting a Pull Request + +Thanks for your contribution! The Tonbo project welcomes contribution of various types -- new features, bug fixes and reports, typo fixes, etc. If you want to contribute to the Tonbo project, you will need to pass necessary checks. If you have any question, feel free to start a new discussion or issue, or ask in the Tonbo [Discord](https://discord.gg/j27XVFVmJM). + +## Running Tests and Checks +This is a Rust project, so [rustup](https://rustup.rs/) and [cargo](https://doc.rust-lang.org/cargo/) are the best place to start. + +- `cargo check` to analyze the current package and report errors. +- `cargo +nightly fmt` to format the current code. +- `cargo build` to compile the current package. +- `cargo clippy` to catch common mistakes and improve code. +- `cargo test` to run unit tests. +- `cargo bench` to run benchmark tests. + + +> **Note**: If you have any changes to *bindings/python*, please make sure to run checks and tests before submitting your PR. If you don not know how to build and run tests, please refer to the [Building Tonbo for Python](./build.md#building-tonbo-for-python) section. + +## Pull Request title +As described in [here](https://gist.github.com/joshbuchea/6f47e86d2510bce28f8e7f42ae84c716), a valid PR title should begin with one of the following prefixes: +- feat: new feature for the user, not a new feature for build script +- fix: bug fix for the user, not a fix to a build script +- doc: changes to the documentation +- style: formatting, missing semi colons, etc; no production code change +- refactor: refactoring production code, eg. 
renaming a variable +- test: adding missing tests, refactoring tests; no production code change +- chore: updating grunt tasks etc; no production code change + +Here is an example of a valid PR title: +``` +feat: add float type +^--^ ^------------^ +| | +| +-> Summary in present tense. +| ++-------> Type: chore, docs, feat, fix, refactor, style, or test. +``` diff --git a/guide/src/contribution/testing.md b/guide/src/contribution/testing.md new file mode 100644 index 00000000..8c1325bd --- /dev/null +++ b/guide/src/contribution/testing.md @@ -0,0 +1,6 @@ +# Testing + + +## Testing Tonbo in Rust + +## Testing Tonbo in WASM diff --git a/guide/src/start.md b/guide/src/start.md index 561553ef..29b29475 100644 --- a/guide/src/start.md +++ b/guide/src/start.md @@ -1,21 +1,32 @@ +# Getting started + + ## Installation -To get started using tonbo you should make sure you have Rust installed on your system. If you haven't alreadly done yet, try following the instructions [here](https://www.rust-lang.org/tools/install). +### Prerequisite +To get started using tonbo you should make sure you have [Rust](https://www.rust-lang.org/tools/install) installed on your system. If you haven't already done yet, try following the instructions [here](https://www.rust-lang.org/tools/install). +### Installation -## Adding dependencies +To use local disk as storage backend, you should import [tokio](https://github.com/tokio-rs/tokio) crate and enable "tokio" feature (enabled by default) in the *Cargo.toml* file. ```toml -fusio = { git = "https://github.com/tonbo-io/fusio.git", rev = "216eb446fb0a0c6e5e85bfac51a6f6ed8e5ed606", package = "fusio", version = "0.3.3", features = [ - "dyn", - "fs", -] } tokio = { version = "1", features = ["full"] } tonbo = { git = "https://github.com/tonbo-io/tonbo" } ``` -## Defining Schema +If you want to use tonbo in browser(use OPFS as storage backend), you should disable "*tokio*" feature and enable "*wasm*" feature(As "*tokio*" is enabled by default, you should also disable `default-features`). If you want to use S3 as backend, you also should enable "*wasm-http*" feature. + +```toml +tonbo = { git = "https://github.com/tonbo-io/tonbo", default-features = false, features = [ + "wasm", + "wasm-http", +] } +``` +## Using Tonbo + +### Defining Schema -You can use `Record` macro to define schema of column family just like ORM. Tonbo will generate all relevant files for you at compile time. +Tonbo provides ORM-like macro for ease of use, you can use `Record` macro to define schema of column family. Tonbo will generate all relevant code for you at compile time. ```rust use tonbo::Record; @@ -26,7 +37,6 @@ pub struct User { name: String, email: Option, age: u8, - bytes: Bytes, } ``` @@ -41,7 +51,7 @@ Now, Tonbo support these types: - String type: `String` - Bytes: `bytes::Bytes` -## Create DB +### Creating database After define you schema, you can create `DB` with a customized `DbOption` @@ -65,23 +75,48 @@ async fn main() { } ``` -`UserSchema` is a struct that tonbo generates for you in the compile time, so you do not need to import it. +`UserSchema` is a struct that tonbo generates for you in the compile time, so you do not need to import it. One thing you need to pay attention to is: you should **make sure the path exists** before creating `DBOption`. + +> **Note:** If you use tonbo in WASM, you should use `Path::from_opfs_path` rather than `Path::from_filesystem_path`. 
+> + +### Operations on Database -## Read/Write data +After creating `DB`, you can execute `insert`, `remove`, `get` and other operations now. But remember that you will get a **`UserRef` instance** that implements `RecordRef` trait rather than the `User`, if you get record from tonbo. This is a struct that tonbo generates for you in the compile time. It may look like: + +```rust +#[derive(Debug, PartialEq, Eq, Clone, Copy)] +pub struct UserRef<'r> { + pub name: &'r str, + pub email: Option<&'r str>, + pub age: Option, +} +impl RecordRef for UserRef<'_> { + // ...... +} +``` + +### Insert + +`DB::insert` receives a `Record` instance which is the instance of struct you defined with `#[derive(Record)]`. + +```rust +db.insert(User { /* ... */ }).await.unwrap(); +``` + +### Remove +`DB::remove` receives a `Key` which is the type of `#[record(primary_key)]`. This method will remove the record that specified by the given `Key`. + +```rust +db.remove("Alice".into()).await.unwrap(); +``` -After create `DB`, you can execute `insert`, `remove`, `get` now. But remember that you will get a `UserRef` object rather than the `User`, if you get record from tonbo. This is a struct that tonbo generates for you in the compile time. +### Get +`DB::get` receives a `Key` and process the record with a closure that receives a `TransactionEntry`. You can use `TransactionEntry::get` to get the record which is the type of `RecordRef`. ```rust -db.insert(User { - name: "Alice".into(), - email: Some("alice@gmail.com".into()), - age: 22, -}) -.await -.unwrap(); - -let age = db - .get(&"Alice".into(), |entry| { +let age = db.get(&"Alice".into(), + |entry| { // entry.get() will get a `UserRef` let user = entry.get(); println!("{:#?}", user); @@ -89,78 +124,121 @@ let age = db }) .await .unwrap(); -assert!(age.is_some()); -assert_eq!(age, Some(22)); ``` +### Scan +Like `DB::get`, `DB::scan` receives a closure that process `TransactionEntry`. The difference is that `DB::scan` receives a range of `Key`s and process all data that satisfied with the closure. -## Using transaction +```rust +let lower = "Alice".into(); +let upper = "Bob".into(); +let stream = db + .scan( + (Bound::Included(&lower), Bound::Excluded(&upper)), + |entry| { + let record_ref = entry.get(); + + record_ref.age + }, + ) + .await; +let mut stream = std::pin::pin!(stream); +while let Some(data) = stream.next().await.transpose().unwrap() { + // ... +} +``` + +#### Using transaction Tonbo supports transaction. You can also push down filter, limit and projection operators in query. ```rust +// create transaction let txn = db.transaction().await; -// get from primary key let name = "Alice".into(); -// get the zero-copy reference of record without any allocations. +txn.insert(User { /* ... */ }); let user = txn.get(&name, Projection::All).await.unwrap(); let upper = "Blob".into(); // range scan of user let mut scan = txn .scan((Bound::Included(&name), Bound::Excluded(&upper))) - // tonbo supports pushing down projection - .projection(vec![1]) - // push down limitation - .limit(1) .take() .await .unwrap(); while let Some(entry) = scan.next().await.transpose().unwrap() { - assert_eq!( - entry.value(), - Some(UserRef { - name: "Alice", - email: Some("alice@gmail.com"), - age: None, - }) - ); + let data = entry.value(); // type of UserRef + // ...... } ``` -## Using S3 backends +### Persistence +As Tonbo uses LSM(Log-Structured-Merge Tree) as the underlying data structure, some data are in the memory(mem). 
If you want to persist these data, you can use the `flush` method. -Tonbo supports various storage backends, such as OPFS, S3, and maybe more in the future. You can use `DbOption::level_path` to specify which backend to use. +If WAL is enabled, the data will be persisted to disk automatically. But as tonbo has buffer for WAL by default, you need to call `flush_wal` method if you want to ensure that all the data will be recovered. If you don not want to use buffer for WAL, you can disable it by setting `wal_buffer_size` to 0. -For local storage, you can use `FsOptions::Local` as the parameter. And you can use `FsOptions::S3` for S3 storage. After create `DB`, you can then operator it like normal. +```rust +let options = DbOption::new( + Path::from_filesystem_path("./db_path/users").unwrap(), + &UserSchema, +).wal_buffer_size(0); +``` +If you don't want to use WAL, you can disable it by setting the `DbOption::disable_wal`. But please ensure that losing data is acceptable for you. ```rust -use fusio::{path::Path, remotes::aws::AwsCredential}; -use fusio_dispatch::FsOptions; -use tonbo::{executor::tokio::TokioExecutor, DbOption, DB}; +let options = DbOption::new( + Path::from_filesystem_path("./db_path/users").unwrap(), + &UserSchema, +).disable_wal(true); +``` -#[tokio::main] -async fn main() { - let fs_option = FsOptions::S3 { - bucket: "wasm-data".to_string(), - credential: Some(AwsCredential { - key_id: "key_id".to_string(), - secret_key: "secret_key".to_string(), - token: None, - }), - endpoint: None, - sign_payload: None, - checksum: None, - region: Some("region".to_string()), - }; - - let options = DbOption::new(Path::from_filesystem_path("s3_path").unwrap(), &UserSchema) - .level_path(2, "l2", fs_option); +> **Note**: If you disable WAL, there is nothing to do with `flush_wal`. You need to call `flush` method to persist the memory data. +> +> If you enable WAL and set `wal_buffer_size` to 0, you do not need to call `flush_wal` method, since WAL will be flushed to disk before writing. - let db = DB::::new(options, TokioExecutor::current(), UserSchema) - .await - .unwrap(); -} +### Using in S3 + +If you want to use Tonbo in S3, you can configure `DbOption` to specify which part of the data to store in S3 and which part to store in local disk. Here is an example: + +```rust +let s3_option = FsOptions::S3 { + bucket: "bucket".to_string(), + credential: Some(AwsCredential { + key_id: "key_id".to_string(), + secret_key: "secret_key".to_string(), + token: None, + }), + endpoint: None, + sign_payload: None, + checksum: None, + region: Some("region".to_string()), +}; +let options = DbOption::new( + Path::from_filesystem_path("./db_path/users").unwrap(), + &UserSchema, +).level_path(2, "l2", s3_option.clone()) +).level_path(3, "l3", s3_option); ``` + +In this example, the data of level 2 and level 3 will be stored in S3 and the rest of the data will be stored in local disk. If there are data in level 2 and level 3, you can find them in S3 like this: + +```bash +s3://bucket/l2/ +├── xxx.parquet +├── ...... +s3://bucket/l3/ +├── xxx.parquet +├── ...... +``` + +For more configuration options, please refer to the [Configuration](./usage/conf.md) section. + +## What next? 
+- To learn more about tonbo in Rust or in WASM, you can refer to [Tonbo API](./usage/tonbo.md) +- To use tonbo in python, you can refer to [Python API](./usage/python.md) +- To learn more about tonbo in brower, you can refer to [WASM API](./usage/wasm.md) +- To learn more configuration about tonbo, you can refer to [Configuration](./usage/conf.md) +- There are some data structures for runtime schema, you can use them to [expole tonbo](./usage/advance.md). You can also refer to our [python](https://github.com/tonbo-io/tonbo/tree/main/bindings/python), [wasm](https://github.com/tonbo-io/tonbo/tree/main/bindings/js) bindings and [Tonbolite(a SQLite extension)](https://github.com/tonbo-io/tonbolite) +- To learn more about tonbo by examples, you can refer to [examples](https://github.com/tonbo-io/tonbo/tree/main/examples) diff --git a/guide/src/tonbolite/build.md b/guide/src/tonbolite/build.md new file mode 100644 index 00000000..f1794679 --- /dev/null +++ b/guide/src/tonbolite/build.md @@ -0,0 +1,44 @@ +# Building TonboLite + +### Build as Extension +To build TonboLite as an extension, you should enable loadable_extension features +```sh +cargo build --release --features loadable_extension +``` +Once building successfully, you will get a file named libsqlite_tonbo.dylib(`.dll` on windows, `.so` on most other unixes) in *target/release/* +### Build on Rust + +```sh +cargo build +``` + +### Build on Wasm + +To use TonboLite in wasm, it takes a few steps to build. +1. Add wasm32-unknown-unknown target +```sh +rustup target add wasm32-unknown-unknown +``` +2. Override toolchain with nightly +```sh +rustup override set nightly +``` +3. Build with [wasm-pack](https://github.com/rustwasm/wasm-pack) +```sh +wasm-pack build --target web --no-default-features --features wasm +``` + +Once you build successfully, you will get a *pkg* folder containing compiled js and wasm files. Copy it to your project and then you can start to use it. +```js +const tonbo = await import("./pkg/sqlite_tonbo.js"); +await tonbo.default(); + +// start to use TonboLite ... +``` + + +
+
+TonboLite should be used in a [secure context](https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts) and [cross-origin isolated](https://developer.mozilla.org/en-US/docs/Web/API/Window/crossOriginIsolated) environment, since it uses [`SharedArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) to share memory. Please refer to [this article](https://web.dev/articles/coop-coep) for a detailed explanation.
+
diff --git a/guide/src/tonbolite/index.md b/guide/src/tonbolite/index.md new file mode 100644 index 00000000..b1c3e6f6 --- /dev/null +++ b/guide/src/tonbolite/index.md @@ -0,0 +1,3 @@ +# TonboLite + +TonboLite is a WASM compatible SQLite extension that allows users to create tables which supports analytical processing directly in SQLite. Its storage engine is powered by our open-source embedded key-value database, [Tonbo](https://github.com/tonbo-io/tonbo). diff --git a/guide/src/tonbolite/start.md b/guide/src/tonbolite/start.md new file mode 100644 index 00000000..752d0b8e --- /dev/null +++ b/guide/src/tonbolite/start.md @@ -0,0 +1,110 @@ +# Getting Started + + +## Installation + +### Prerequisite +To get started using tonbo you should make sure you have [Rust](https://www.rust-lang.org/tools/install) installed on your system. If you haven't alreadly done yet, try following the instructions [here](https://www.rust-lang.org/tools/install). + +### Building + +To build TonboLite as an extension, you should enable loadable_extension features + +```sh +cargo build --release --features loadable_extension +``` + +Once building successfully, you will get a file named libsqlite_tonbo.dylib(`.dll` on windows, `.so` on most other unixes) in *target/release/* + +```bash +target/release/ +├── build +├── deps +├── incremental +├── libsqlite_tonbo.d +├── libsqlite_tonbo.dylib +└── libsqlite_tonbo.rlib +``` + +## Loading TonboLite + +SQLite provide [`.load`](https://www.sqlite.org/cli.html#loading_extensions) command to load a SQLite extension. So, you can load TonboLite extension by running the following command: + +```bash +.load target/release/libsqlite_tonbo +``` + +## Creating Table + +After loading TonboLite extension successfully, you can [SQLite Virtual Table](https://www.sqlite.org/vtab.html) syntax to create a table: + +```sql +CREATE VIRTUAL TABLE temp.tonbo USING tonbo( + create_sql = 'create table tonbo(id bigint primary key, name varchar, like int)', + path = 'db_path/tonbo' +); +``` +- `create_sql` is a SQL statement that will be executed to create the table. +- `path` is the path to the database file. + +## Inserting Data + +After creating a table, you can start to insert data into it using the normal `INSERT INTO` statement: + +```sql +INSERT INTO tonbo(id, name, like) VALUES(1, 'tonbo', 100); +``` + +## Querying Data + +After inserting data, you can query them by using the `SELECT` statement: + +```sql +SELECT * FROM tonbo; + +1|tonbo|100 +``` + +## Updating Data + +You can update data in the table using the `UPDATE` statement: + +```sql +UPDATE tonbo SET like = 123 WHERE id = 1; + +SELECT * FROM tonbo; +1|tonbo|123 +``` + +## Deleting Data + +You can also delete data by using the `DELETE` statement: + +```sql +DELETE FROM tonbo WHERE id = 1; +``` + +## Coding with extension + +TonboLite extension can also be used in any place that supports loading SQLite extensions. 
Here is an example of using TonboLite extension in Python: + +```py +import sqlite3 + +conn = sqlite3.connect(":memory") +conn.enable_load_extension(True) +# Load the tonbolite extension +conn.load_extension("target/release/libsqlite_tonbo.dylib") +con.enable_load_extension(False) + +conn.execute("CREATE VIRTUAL TABLE temp.tonbo USING tonbo(" + "create_sql = 'create table tonbo(id bigint primary key, name varchar, like int)', " + "path = 'db_path/tonbo'" + ")") +conn.execute("INSERT INTO tonbo (id, name, like) VALUES (0, 'lol', 1)") +conn.execute("INSERT INTO tonbo (id, name, like) VALUES (1, 'lol', 100)") +rows = conn.execute("SELECT * FROM tonbo;") +for row in rows: + print(row) +# ...... +``` diff --git a/guide/src/tonbolite/usage.md b/guide/src/tonbolite/usage.md new file mode 100644 index 00000000..536d866d --- /dev/null +++ b/guide/src/tonbolite/usage.md @@ -0,0 +1,157 @@ +# Usage + +## Using as Extension + +If you do not know how to build TonboLite, please refer to the [Building](./build.md) section. + +### Loading TonboLite Extension + +Once building successfully, you will get a file named libsqlite_tonbo.dylib(.dll on windows, .so on most other unixes) in *target/release/*(or *target/debug/*). + +SQLite provide [`.load`](https://www.sqlite.org/cli.html#loading_extensions) command to load a SQLite extension. So, you can load TonboLite extension by running the following command: + +```bash +.load target/release/libsqlite_tonbo +``` + +Or you can load TonboLite extension in Python or other languages. +```py +import sqlite3 + +conn = sqlite3.connect(":memory") +conn.enable_load_extension(True) +# Load the tonbolite extension +conn.load_extension("target/release/libsqlite_tonbo.dylib") +con.enable_load_extension(False) + +# ...... +``` + + +After loading TonboLite successfully, you can start to use it. + +### Create Table + +Unlike Normal `CREATE TABLE` statement, TonboLite use [SQLite Virtual Table](https://www.sqlite.org/vtab.html) syntax to create a table: + +```sql +CREATE VIRTUAL TABLE temp.tonbo USING tonbo( + create_sql = 'create table tonbo(id bigint primary key, name varchar, like int)', + path = 'db_path/tonbo' +); +``` + +### Select/Insert/Update/Delete + +you can execute SQL statements just like normal SQL in the SQLite. Here is an example: + +```sql +sqlite> .load target/release/libsqlite_tonbo + +sqlite> CREATE VIRTUAL TABLE temp.tonbo USING tonbo( + create_sql = 'create table tonbo(id bigint primary key, name varchar, like int)', + path = 'db_path/tonbo' +); +sqlite> insert into tonbo (id, name, like) values (0, 'tonbo', 100); +sqlite> insert into tonbo (id, name, like) values (1, 'sqlite', 200); + +sqlite> select * from tonbo; +0|tonbo|100 +1|sqlite|200 + +sqlite> update tonbo set like = 123 where id = 0; + +sqlite> select * from tonbo; +0|tonbo|123 +1|sqlite|200 + +sqlite> delete from tonbo where id = 0; + +sqlite> select * from tonbo; +1|sqlite|200 +``` + +### Flush + +TonboLite use LSM tree to store data, and it use a WAL buffer size to improve performance, so you may need to flush data to disk manually. But SQLite don't provide flush interface, so we choose to implement it in the [`pragma quick_check`](https://www.sqlite.org/pragma.html#pragma_quick_check). + +```sql +PRAGMA tonbo.quick_check; +``` + +## Using in Rust + +To use TonboLite in your application, you can import TonboLite in the *Cargo.toml* file. 
+ +```toml +tonbolite = { git = "https://github.com/tonbo-io/tonbolite" } +``` + +You can create use TonboLite just like in [Rusqlite](https://github.com/rusqlite/rusqlite), but you should create table using [SQLite Virtual Table](https://www.sqlite.org/vtab.html) syntax: + +```rust +let _ = std::fs::create_dir_all("./db_path/test"); + +let db = rusqlite::Connection::open_in_memory()?; +crate::load_module(&db)?; + +db.execute_batch( + "CREATE VIRTUAL TABLE temp.tonbo USING tonbo( + create_sql = 'create table tonbo(id bigint primary key, name varchar, like int)', + path = 'db_path/test' + );" +).unwrap(); + +db.execute( + "INSERT INTO tonbo (id, name, like) VALUES (1, 'lol', 12)", + [], +).unwrap(); + +let mut stmt = db.prepare("SELECT * FROM tonbo;")?; +let _rows = stmt.query([])?; +``` +for more usage, you can refer to [Rusqlite](https://docs.rs/rusqlite). + +One difference is that TonboLite extends [`pragma quick_check`](https://www.sqlite.org/pragma.html#pragma_quick_check) to flush WAL to disk. You can use it like this: + +```rust +db.pragma(None, "quick_check", "tonbo", |_r| -> rusqlite::Result<()> { + Ok(()) +}).unwrap(); +``` + +## Using in JavaScript + +To use TonboLite in wasm, can should enable *wasm* feature. +```toml +tonbolite = { git = "https://github.com/tonbo-io/tonbolite", default-features = false, features = ["wasm"] } +``` +After building successfully, you will get a *pkg* folder containing compiled js and wasm files. Copy it to your project and then you can start to use it. If you don't know how to build TonboLite on wasm, you can refer to [TonboLite](build.md#build-on-wasm). + +Here is an example of how to use TonboLite in JavaScript: + +```javascript +const tonbo = await import("./pkg/sqlite_tonbo.js"); +await tonbo.default(); + +const db = new TonboLite('db_path/test'); +await db.create(`CREATE VIRTUAL TABLE temp.tonbo USING tonbo( + create_sql ='create table tonbo(id bigint primary key, name varchar, like int)', + path = 'db_path/tonbo' +);`); + +await db.insert('INSERT INTO tonbo (id, name, like) VALUES (1, \'lol\', 12)'); +await conn.delete("DELETE FROM tonbo WHERE id = 4"); +await conn.update("UPDATE tonbo SET name = 'tonbo' WHERE id = 6"); + +const rows = await db.select('SELECT * FROM tonbo limit 10;'); +console.log(rows); + +await db.flush(); +``` + +
+
+TonboLite should be used in a [secure context](https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts) and [cross-origin isolated](https://developer.mozilla.org/en-US/docs/Web/API/Window/crossOriginIsolated) environment, since it uses [`SharedArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) to share memory. Please refer to [this article](https://web.dev/articles/coop-coep) for a detailed explanation.
+
diff --git a/guide/src/usage/advance.md b/guide/src/usage/advance.md new file mode 100644 index 00000000..eb27faeb --- /dev/null +++ b/guide/src/usage/advance.md @@ -0,0 +1,266 @@ +# Explore Tonbo + + + +Tonbo provide `DynRecord` to support dynamic schema. We have been using it to build Python and WASM bindings for Tonbo. You can find the source code [here](https://github.com/tonbo-io/tonbo/tree/main/bindings). + +Except using it in Python and WASM bindings for Tonbo, we have also used it to build a SQLite extension, [TonboLite](https://github.com/tonbo-io/tonbolite). This means that you can do more interesting things with tonbo such as building a PostgreSQL extension and integrating with datafusio. + + +## DynRecord + +`DynRecord` is just like the schema you defined by `#[derive(Record)]`, but the fields are not known at compile time. So, before using it, you need to pass the schema and value by yourself. Here is the constructor of the `DynSchema`, the schema of `DynRecord`: + +```rust +// constructor of DynSchema +pub fn new(schema: Vec, primary_index: usize) -> DynSchema; + +// constructor of ValueDesc +pub fn new(name: String, datatype: DataType, is_nullable: bool) -> ValueDesc; +``` +- `ValueDesc`: represents a field of schema, which contains field name, field type. + - `name`: represents the name of the field. + - `datatype`: represents the data type of the field. + - `is_nullable`: represents whether the field can be nullable. +- `primary_index`: represents the index of the primary key field in the schema. + + +```rust +pub fn new(values: Vec, primary_index: usize) -> DynRecord; + +pub fn new( + datatype: DataType, + name: String, + value: Arc, + is_nullable: bool, +) -> Value; + +``` + +- `Value`: represents a field of schema and its value, which contains a field description and the value. + - `datatype`: represents the data type of the field. + - `name`: represents the name of the field. + - `is_nullable`: represents whether the field is nullable. + - `value`: represents the value of the field. +- `primary_index`: represents the index of the primary key field in the schema. + +Now, tonbo support these types for dynamic schema: + +| Tonbo type | Rust type | +| --- | --- | +| `UInt8`/`UInt16`/`UInt32`/`UInt64` | `u8`/`u16`/`u32`/`u64` | +| `Int8`/`Int16`/`Int32`/`Int64` | `i8`/`i16`/`i32`/`i64` | +| `Boolean` | `bool` | +| `String` | `String` | +| `Bytes` | `Vec` | + + +It allows you to define a schema at runtime and use it to create records. This is useful when you need to define a schema dynamically or when you need to define a schema that is not known at compile time. + +## Operations +After creating `DynSchema`, you can use tonbo just like before. The only difference is that what you insert and get is the type of `DynRecord` and `DynRecordRef`. + +If you compare the usage with compile-time schema version, you will find that the usage is almost the same. The difference can be summarized into the following 5 points. +- Use `DynSchema` to replace `xxxSchema`(e.g. `UserSchema`) +- Use `DynRecord` instance to replace the instance you defined with `#[derive(Record)]` +- All you get from database is `DynRecordRef` rather than `xxxRef`(e.g. `UserRef`) +- Use `Value` as the `Key` of `DynRecord`. For example, you should pass a `Value` instance the `DB::get` method. +- The value of `Value` should be the type of `Arc>` if the column can be nullable. 
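+For example, here is a minimal sketch of the last two points, mirroring the `Value::new` calls used later in this section: the key passed to methods like `DB::get`/`DB::remove` is a plain `Value`, while the value of a nullable column is wrapped as `Arc<Option<...>>`:
+
+```rust
+use std::sync::Arc;
+
+// `Value` and `DataType` are the tonbo types introduced above;
+// import them the same way as in the full examples below.
+
+// primary-key column (non-nullable): the value is a plain `Arc<String>`
+let key = Value::new(
+    DataType::String,
+    "name".to_string(),
+    Arc::new("Alice".to_string()),
+    false,
+);
+
+// nullable column: the value is wrapped as `Arc<Option<i8>>`
+let age = Value::new(
+    DataType::Int8,
+    "age".to_string(),
+    Arc::new(Some(20_i8)),
+    true,
+);
+```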
+ +But if you look at the code, you will find that both `DynSchema` and `xxxSchema` implement the `Schema` trait , both `DynRecord` and `xxxRecord` implement the `Record` trait and both `DynRecordRef` and `xxxRecordRef` implement the `RecordRef` trait. So there is only two difference between them + +### Create Database +```rust +#[tokio::main] +async fn main() { + // make sure the path exists + fs::create_dir_all("./db_path/users").unwrap(); + + // build DynSchema + let descs = vec![ + ValueDesc::new("name".to_string(), DataType::String, false), + ValueDesc::new("email".to_string(), DataType::String, false), + ValueDesc::new("age".to_string(), DataType::Int8, true), + ]; + let schema = DynSchema::new(descs, 0); + + let options = DbOption::new( + Path::from_filesystem_path("./db_path/users").unwrap(), + &schema, + ); + + let db = DB::::new(options, TokioExecutor::current(), DynSchema) + .await + .unwrap(); +} +``` + +If you want to learn more about `DbOption`, you can refer to the [Configuration section](conf.md). + +> **Note:** You should make sure the path exists before creating `DBOption`. + +### Insert + +You can use `db.insert(record)` or `db.insert_batch(records)` to insert new records into the database just like before. The difference is that you should build insert a `DynRecord` instance. + +Here is an example of how to build a `DynRecord` instance: + +```rust +let mut columns = vec![ + Value::new( + DataType::String, + "name".to_string(), + Arc::new("Alice".to_string()), + false, + ), + Value::new( + DataType::String, + "email".to_string(), + Arc::new("abc@tonbo.io".to_string()), + false, + ), + Value::new( + DataType::Int8, + "age".to_string(), + Arc::new(Some(i as i8)), + true, + ), +]; +let record = DynRecord::new(columns, 0); +``` +- `Value::new` will create a new `Value` instance, which represents the value of the column in the schema. This method receives three parameters: + - datatype: the data type of the field in the schema + - name: the name of the field in the schema + - value: the value of the column. This is the type of `Arc`. But please be careful that **the value should be the type of `Arc>` if the column can be nullable**. + - nullable: whether the value is nullable + +```rust +/// insert a single tonbo record +db.insert(record).await.unwrap(); +``` + +### Remove +You and use `db.remove(key)` to remove a record from the database. This method receives a `Key`, which is the primary key of the record. But all columns in the record is a `Value`, so you can not use it like `db.remove("Alice".into()).await.unwrap();`. Instead, you should pass a `Value` to `db.remove`. + +```rust +let key = Value::new( + DataType::String, + "name".to_string(), + Arc::new("Alice".to_string()), + false, +); + +db.remove(key).await.unwrap(); +``` + +### Query + +You can use `get` method to get a record by key and you should pass a closure that takes a `TransactionEntry` instance and returns a `Option` type. You can use `TransactionEntry::get` to get a `DynRecordRef` instance. + +You can use `scan` method to scan all records that in the specified range. `scan` method will return a `Stream` instance and you can iterate all records by using this stream. 
+ +```rust +/// get the record with `key` as the primary key and process it using closure `f` +let age = db.get(key, + |entry| { + // entry.get() will get a `DynRecordRef` + let record_ref = entry.get(); + println!("{:#?}", record_ref); + record_ref.age + }) + .await + .unwrap(); + +let mut scan = db + .scan((Bound::Included(&lower_key), Bound::Excluded(&upper_key))) + .await + .unwrap(); +while let Some(entry) = scan.next().await.transpose().unwrap() { + let data = entry.value(); // type of DynRecordRef + // ...... +} +``` + +### Transaction +Tonbo supports transactions when using a `Transaction`. You can use `db.transaction()` to create a transaction, and use `txn.commit()` to commit the transaction. + +Note that Tonbo provides optimistic concurrency control to ensure data consistency which means that if a transaction conflicts with another transaction when committing, Tonbo will fail with a `CommitError`. + +Here is an example of how to use transactions: +```rust +// create transaction +let txn = db.transaction().await; + +let name = Value::new( + DataType::String, + "name".to_string(), + Arc::new("Alice".to_string()), + false, +); +let upper = Value::new( + DataType::String, + "name".to_string(), + Arc::new("Bob".to_string()), + false, +); + +txn.insert(DynRecord::new(/* */)); +let _record_ref = txn.get(&name, Projection::Parts(vec!["email", "bytes"])).await.unwrap(); + +// range scan of user +let mut scan = txn + .scan((Bound::Included(&name), Bound::Excluded(&upper))) + // tonbo supports pushing down projection + .projection(&["email", "bytes"]) + // push down limitation + .limit(1) + .take() + .await + .unwrap(); + +while let Some(entry) = scan.next().await.transpose().unwrap() { + let data = entry.value(); // type of DynRecordRef + // ...... +} +``` + +For more detail about transactions, please refer to the [Transactions](../transactions.md) section. + +## Using S3 backends + +Using S3 as the backend storage is also similar to the usage of [compile-time version](./tonbo.md#using-s3-backends). + +```rust +use tonbo::option::{ AwsCredential, FsOptions, Path }; +use tonbo::{executor::tokio::TokioExecutor, DbOption, DB}; + +#[tokio::main] +async fn main() { + let fs_option = FsOptions::S3 { + bucket: "wasm-data".to_string(), + credential: Some(AwsCredential { + key_id: "key_id".to_string(), + secret_key: "secret_key".to_string(), + token: None, + }), + endpoint: None, + sign_payload: None, + checksum: None, + region: Some("region".to_string()), + }; + + let descs = vec![ + ValueDesc::new("name".to_string(), DataType::String, false), + ValueDesc::new("email".to_string(), DataType::String, false), + ValueDesc::new("age".to_string(), DataType::Int8, true), + ]; + let schema = DynSchema::new(descs, 0); + let options = DbOption::new(Path::from_filesystem_path("s3_path").unwrap(), &schema) + .level_path(2, "l2", fs_option); + + + let db = DB::::new(options, TokioExecutor::current(), schema) + .await + .unwrap(); +} +``` diff --git a/guide/src/usage/conf.md b/guide/src/usage/conf.md new file mode 100644 index 00000000..3cfa385d --- /dev/null +++ b/guide/src/usage/conf.md @@ -0,0 +1,143 @@ +# Configuration + + + +Tonbo provides a configuration struct `DbOption` for setting up the database. This section will introduce the configuration options available in Tonbo. + +## Path Configuration + +Tonbo will use local disk as the default storage option(For local is the tokio file, for wasm is the OPFS). If you want to change the default storage backends `DbOption::base_path`. 
+ +```rust +pub fn base_fs(mut self, base_fs: FsOptions) -> DbOption; +``` + +`FsOptions` is the configuration options for the file system. Tonbo provides two kinds of file system options: `FsOptions::Local` and `FsOptions::S3`. +- `FsOptions::Local`: This is required the feature `tokio`/`wasm` to be enabled. +- `FsOptions::S3{...}`: This is required the feature `aws` and `tokio-http`/`wasm-http` to be enabled. You can use this `FsOptions` to configure the S3 storage. + +```rust +pub enum FsOptions { + #[cfg(any(feature = "tokio", feature = "wasm"))] + Local, + #[cfg(feature = "aws")] + S3 { + bucket: String, + credential: Option, + endpoint: Option, + region: Option, + sign_payload: Option, + checksum: Option, + }, +} + +#[derive(Debug, Clone)] +pub struct AwsCredential { + /// AWS_ACCESS_KEY_ID + pub key_id: String, + /// AWS_SECRET_ACCESS_KEY + pub secret_key: String, + /// AWS_SESSION_TOKEN + pub token: Option, +} +``` +- `bucket`: The S3 bucket +- `credential`: The credential configuration for S3 + - `key_id`: The S3 access key + - `secret_key`: The S3 secret access key + - `token`: is the security token for the aws S3 +- `endpoint`: The S3 endpoint +- `region`: The S3 region +- `sign_payload`: Whether to sign payload for the aws S3 +- `checksum`: Whether to enable checksum for the aws S3 + + +If you want to set specific storage options for SSTables, you can use `DbOption::level_path`. This method allows you to specify the storage options for each level of SSTables. If you don't specify the storage options for a level, Tonbo will use the default storage options(that is base fs). + +```rust +pub fn level_path( + mut self, + level: usize, + path: Path, + fs_options: FsOptions, +) -> Result; +``` + +## Manifest Configuration + +Manifest is used to store the metadata of the database. Whenever the compaction is triggered, the manifest will be updated accordingly. But when time goes by, the manifest file will become large, which will increase the time of recovery. Tonbo will rewrite the manifest file if metadata too much, you can use `DbOption::version_log_snapshot_threshold` to configure + +```rust +pub fn version_log_snapshot_threshold(self, version_log_snapshot_threshold: u32) -> DbOption; +``` + +If you want to persist metadata files to S3, you can configure `DbOption::base_fs` with `FsOptions::S3{...}`. This will enable Tonbo to upload metadata files and WAL files to the specified S3 bucket. + +> **Note**: This will not guarantee the latest metadata will be uploaded to S3. If you want to ensure the latest metadata is uploaded, you can use `DB::flush` to trigger upload manually. If you want tonbo to trigger upload more frequently, you can adjust `DbOption::version_log_snapshot_threshold` to a smaller value. The default value is 200. + +## WAL Configuration + +Tonbo use WAL(Write-ahead log) to ensure data durability and consistency. It is a mechanism that ensures that data is written to the log before being written to the database. This helps to prevent data loss in case of a system failure. + +Tonbo also provides a buffer to improve performance. If you want to flush wal buffer, you can call `DbOption::flush_wal`. The default buffer size is 4KB. But If you don't want to use wal buffer, you can set the buffer to 0. + +```rust +pub fn wal_buffer_size(self, wal_buffer_size: usize) -> DbOption; +``` + +If you don't want to use WAL, you can disable it by setting the `DbOption::disable_wal`. But please ensure that losing data is acceptable for you. 
+ +```rust +pub fn disable_wal(self) -> DbOption; +``` + +## Compaction Configuration + +When memtable reaches the maximum size, we will turn it into a immutable which is read only memtable. But when the number of immutable table reaches the maximum size, we will compact them to SSTables. You can set the `DbOption::immutable_chunk_num` to control the number of files for compaction. +```rust +/// len threshold of `immutables` when minor compaction is triggered +pub fn immutable_chunk_num(self, immutable_chunk_num: usize) -> DbOption; +``` + +When the number of files in level L exceeds its limit, we also compact them in a background thread. Tonbo use the `major_threshold_with_sst_size` and `level_sst_magnification` to determine when to trigger major compaction. The calculation is as follows: + +\\[ major\\_threshold\\_with\\_sst\\_size * level\\_sst\\_magnification^{level} \\] + +`major_threshold_with_sst_size` is default to 4 and `level_sst_magnification` is default to 10, which means that the default trigger threshold for level1 is 40 files and 400 for level2. + +You can adjust the `major_threshold_with_sst_size` and `level_sst_magnification` to control the compaction behavior. + +```rust +/// threshold for the number of `parquet` when major compaction is triggered +pub fn major_threshold_with_sst_size(self, major_threshold_with_sst_size: usize) -> DbOption + +/// magnification that triggers major compaction between different levels +pub fn level_sst_magnification(self, level_sst_magnification: usize) -> DbOption; +``` + +You can also change the default SSTable size by setting the `DbOption::max_sst_file_size`, but we found that the default size is good enough for most use cases. +```rust +/// Maximum size of each parquet +pub fn max_sst_file_size(self, max_sst_file_size: usize) -> DbOption +``` + +## SSTable Configuration + +Tonbo use [parquet](https://github.com/apache/parquet-rs) to store data which means you can set `WriterProperties` for parquet file. You can use `DbOption::write_parquet_option` to set specific settings for Parquet. + +```rust +/// specific settings for Parquet +pub fn write_parquet_option(self, write_parquet_properties: WriterProperties) -> DbOption +``` + +Here is an example of how to use `DbOption::write_parquet_option`: + +```rust +let db_option = DbOption::default().write_parquet_option( + WriterProperties::builder() + .set_compression(Compression::LZ4) + .set_statistics_enabled(EnabledStatistics::Chunk) + .set_bloom_filter_enabled(true) + .build(), +); +``` diff --git a/guide/src/usage/faq.md b/guide/src/usage/faq.md new file mode 100644 index 00000000..a222e9a0 --- /dev/null +++ b/guide/src/usage/faq.md @@ -0,0 +1,23 @@ +# FAQ + +## Failed to run custom build command for `ring` in macOS +Apple Clang is a fork of Clang that is specialized to Apple's wishes. It doesn't support wasm32-unknown-unknown. You need to download and use llvm.org Clang instead. You can refer to this [issue](https://github.com/briansmith/ring/issues/1824) for more information. + +```bash +brew install llvm +echo 'export PATH="/opt/homebrew/opt/llvm/bin:$PATH"' >> ~/.zshrc +``` + +## Why my data is not recovered and the size of log file and WAL file is 0? + +As Tonbo uses buffer for WAL, so it may not be persisted before exiting. You can use `DB::flush_wal` to ensure WAL is persisted or use `DB::flush` to trigger compaction manually. + +If you don't want to use WAL buffer, you can set `DbOption::wal_buffer_size` to 0. See more details in [Configuration](./conf.md#wal-configuration). 
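+As a rough sketch of both options (using the `DB::flush_wal` and `DbOption::wal_buffer_size` APIs mentioned above, with the `UserSchema` and path from the Getting Started example; exact signatures may differ slightly in your version):
+
+```rust
+// Option 1: keep the WAL buffer and flush it explicitly before shutting down,
+// so buffered log entries reach disk and can be recovered on restart.
+db.flush_wal().await.unwrap();
+
+// Option 2: disable the WAL buffer entirely, so every write hits disk immediately.
+let options = DbOption::new(
+    Path::from_filesystem_path("./db_path/users").unwrap(),
+    &UserSchema,
+)
+.wal_buffer_size(0);
+```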
+ +## How to persist metadata files to S3? / Why metadata files are not persisted in serverless environment like AWS Lambda + +If you want to persist metadata files to S3, you can configure `DbOption::base_fs` with `FsOptions::S3{...}`. This will enable Tonbo to upload metadata files and WAL files to the specified S3 bucket. + +> **Note**: This will not guarantee the latest metadata will be uploaded to S3. If you want to ensure the latest WAL is uploaded, you can use `DB::flush_wal`. If you want to ensure the latest metadata is uploaded, you can use `DB::flush` to trigger upload manually. If you want tonbo to trigger upload more frequently, you can adjust `DbOption::version_log_snapshot_threshold` to a smaller value. The default value is 200. + +See more details in [Configuration](./conf.md#manifest-configuration). diff --git a/guide/src/usage/index.md b/guide/src/usage/index.md new file mode 100644 index 00000000..e69de29b diff --git a/guide/src/usage/python.md b/guide/src/usage/python.md new file mode 100644 index 00000000..7f3300c1 --- /dev/null +++ b/guide/src/usage/python.md @@ -0,0 +1,56 @@ +# Tonbo Python Binding + + +## `@Record` + +Tonbo provides ORM-like macro for ease of use, you can use `@Record` to define schema of column family. +```py +@Record +class User: + id = Column(DataType.Int64, name="id", primary_key=True) + age = Column(DataType.Int16, name="age", nullable=True) + name = Column(DataType.String, name="name", nullable=False) +``` + +
+ + +## Configuration + +## Example + +```python +from tonbo import DbOption, Column, DataType, Record, TonboDB, Bound +from tonbo.fs import from_filesystem_path +import asyncio + +@Record +class User: + id = Column(DataType.Int64, name="id", primary_key=True) + age = Column(DataType.Int16, name="age", nullable=True) + name = Column(DataType.String, name="name", nullable=False) + +async def main(): + db = TonboDB(DbOption(from_filesystem_path("db_path/user")), User()) + await db.insert(User(id=18, age=175, name="Alice")) + record = await db.get(18) + print(record) + + # use transcaction + txn = await db.transaction() + result = await txn.get(18) + scan = await txn.scan(Bound.Included(18), None, limit=10, projection=["id", "name"]) + + async for record in scan: + print(record) + +asyncio.run(main()) +```` diff --git a/guide/src/usage/tonbo.md b/guide/src/usage/tonbo.md new file mode 100644 index 00000000..d3ad5338 --- /dev/null +++ b/guide/src/usage/tonbo.md @@ -0,0 +1,275 @@ +# Tonbo API + + + +## Schema + +Tonbo provides ORM-like macro for ease of use, you can use `Record` macro to define schema of column family. Tonbo will generate all relevant code for you at compile time. For example, if you have a struct below + +```rust +use tonbo::Record; + +#[derive(Record, Debug)] +pub struct User { + #[record(primary_key)] + name: String, + email: Option, + age: u8, +} +``` + +tonbo will generate a struct `UserSchema` where you can get schema from. Other than `UserSchema`, tonbo will also generate a `UserRef` struct. You should notice that the records you get from tonbo are `UserRef` and all fields except primary key are `Option`. + +```rust +#[derive(Debug, PartialEq, Eq, Clone, Copy)] +pub struct UserRef<'r> { + pub name: &'r str, + pub email: Option<&'r str>, + pub age: Option, +} +``` + + +## Operations +### Create Database + +You can use `DB::new(DbOption, Schema)` to create a database. `DbOption` is the configuration options for the database and `Schema` is the `xxxSchema` that tonbo generated. + + +> **Note:** If you use tonbo in WASM, you should use `Path::from_opfs_path` rather than `Path::from_filesystem_path`. +> +```rust +use std::fs; +use fusio::path::Path; +use tonbo::{executor::tokio::TokioExecutor, DbOption, DB}; + +#[tokio::main] +async fn main() { + // make sure the path exists + fs::create_dir_all("./db_path/users").unwrap(); + + let options = DbOption::new( + Path::from_filesystem_path("./db_path/users").unwrap(), + &UserSchema, + ); + let db = DB::::new(options, TokioExecutor::current(), UserSchema) + .await + .unwrap(); +} +``` + +#### DbOption +`DbOption` is a struct that contains configuration options for the database. Here are some configuration options you can set: + +```rust +// Creates a new `DbOption` instance with the given path and schema. +// The path is the default path that the database will use. +async fn new(option: DbOption, executor: E, schema: R::Schema) -> Result>; + +// Sets the path of the database. +fn path(self, path: impl Into) -> Self; + +/// disable the write-ahead log. This may risk of data loss during downtime +pub fn disable_wal(self) -> Self; + +/// Maximum size of WAL buffer, default value is 4KB +/// If set to 0, the WAL buffer will be disabled. +pub fn wal_buffer_size(self, wal_buffer_size: usize) -> Self; +``` + +If you want to learn more about `DbOption`, you can refer to the [Configuration section](conf.md). + +> **Note:** You should make sure the path exists before creating `DBOption`. 
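+For example, here is a minimal sketch of chaining the builder-style options listed above; pick only the ones you actually need:
+
+```rust
+let options = DbOption::new(
+    Path::from_filesystem_path("./db_path/users").unwrap(),
+    &UserSchema,
+)
+// enlarge the WAL buffer (the default is 4KB); setting it to 0 disables buffering
+.wal_buffer_size(8 * 1024);
+
+// or, if losing recent writes on a crash is acceptable, run without a write-ahead log at all
+let options = DbOption::new(
+    Path::from_filesystem_path("./db_path/users").unwrap(),
+    &UserSchema,
+)
+.disable_wal();
+```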
+ + +#### Executor + +Tonbo provides an `Executor` trait that you can implement to execute asynchronous tasks. Tonbo has implemented `TokioExecutor`(for local disk) and `OpfsExecutor`(for WASM) for users. You can also customize yourself Executor, here is an example implementation of the `Executor` trait: + +```rust +pub struct TokioExecutor { + handle: Handle, +} + +impl TokioExecutor { + pub fn current() -> Self { + Self { + handle: Handle::current(), + } + } +} + +impl Executor for TokioExecutor { + fn spawn(&self, future: F) + where + F: Future + MaybeSend + 'static, + { + self.handle.spawn(future); + } +} +``` + +### Query + +You can use `get` method to get a record by key and you should pass a closure that takes a `TransactionEntry` instance and returns a `Option` type. You can use `TransactionEntry::get` to get a `UserRef` instance. This `UserRef` instance is a struct that tonbo generates for you. All fields except primary key are `Option` type, because you may not have set them when you create the record. + +You can use `scan` method to scan all records that in the specified range. `scan` method will return a `Stream` instance and you can iterate all records by using this stream. + +```rust +/// get the record with `key` as the primary key and process it using closure `f` +let age = db.get(&"Alice".into(), + |entry| { + // entry.get() will get a `UserRef` + let user = entry.get(); + println!("{:#?}", user); + user.age + }) + .await + .unwrap(); + +let mut scan = db + .scan((Bound::Included(&name), Bound::Excluded(&upper))) + .await + .unwrap(); +while let Some(entry) = scan.next().await.transpose().unwrap() { + let data = entry.value(); // type of UserRef + // ...... +} +``` +### Insert/Remove + +You can use `db.insert(record)` or `db.insert_batch(records)` to insert new records into the database and use `db.remove(key)` to remove a record from the database. Here is an example of updating the state of database: +```rust +let user = User { + name: "Alice".into(), + email: Some("alice@gmail.com".into()), + age: 22, + bytes: Bytes::from(vec![0, 1, 2]), +}; + +/// insert a single tonbo record +db.insert(user).await.unwrap(); + +/// insert a sequence of data as a single batch +db.insert_batch("Alice".into()).await.unwrap(); + +/// remove the specified record from the database +db.remove("Alice".into()).await.unwrap(); +``` +### Transaction +Tonbo supports transactions when using a `Transaction`. You can use `db.transaction()` to create a transaction, and use `txn.commit()` to commit the transaction. + +Note that Tonbo provides optimistic concurrency control to ensure data consistency which means that if a transaction conflicts with another transaction when committing, Tonbo will fail with a `CommitError`. + +Here is an example of how to use transactions: +```rust +// create transaction +let txn = db.transaction().await; + +let name = "Alice".into(); + +txn.insert(User { /* ... */ }); +let _user = txn.get(&name, Projection::Parts(vec!["email", "bytes"])).await.unwrap(); + +let upper = "Blob".into(); +// range scan of user +let mut scan = txn + .scan((Bound::Included(&name), Bound::Excluded(&upper))) + // tonbo supports pushing down projection + .projection(&["email", "bytes"]) + // push down limitation + .limit(1) + .take() + .await + .unwrap(); + +while let Some(entry) = scan.next().await.transpose().unwrap() { + let data = entry.value(); // type of UserRef + // ...... 
+} +``` +#### Query +Transactions support easily reading the state of keys that are currently batched in a given transaction but not yet committed. + +You can use `get` method to get a record by key, and `get` method will return a `UserRef` instance. This `UserRef` instance is a struct that tonbo generates for you in the compile time. All fields except primary key are `Option` type, because you may not have set them when you create the record. You can also pass a `Projection` to specify which fields you want to get. `Projection::All` will get all fields, `Projection::Parts(Vec<&str>)` will get only primary key, `email` and `bytes` fields(other fields will be `None`). + +You can use `scan` method to scan all records that in the specified range. `scan` method will return a `Scan` instance. You can use `take` method to get a `Stream` instance and iterate all records that satisfied. Tonbo also supports pushing down filters and projections. You can use `Scan::projection(vec!["id", "email"])` to specify which fields you want to get and use `Scan::limit(10)` to limit the number of records you want to get. + +```rust +let txn = db.transaction().await; + +let _user = txn.get(&name, Projection::Parts(vec!["email"])).await.unwrap(); + +let mut scan_stream = txn + .scan((Bound::Included(&name), Bound::Excluded(&upper))) + // tonbo supports pushing down projection + .projection(&["email", "bytes"]) + // push down limitation + .limit(10) + .take() + .await + .unwrap(); +while let Some(entry) = scan_stream.next().await.transpose().unwrap() { + let data = entry.value(); // type of UserRef + // ...... +} +``` + +#### Insert/Remove +You can use `txn.insert(record)` to insert a new record into the database and use `txn.remove(key)` to remove a record from the database. Tonbo will use a B-Tree to store all data that you modified(insert/remove). All your modifications will be committed to the database when only you call `txn.commit()` successfully. If conflict happens, Tonbo will return an error and all your modifications will be rollback. + +Here is an example of how to use transaction to update the state of database: + +```rust + +let mut txn = db.transaction().await; +txn.insert(User { + id: 10, + name: "John".to_string(), + email: Some("john@example.com".to_string()), +}); +txn.remove("Alice".into()); +txn.commit().await.unwrap(); +``` + + + +After create `DB`, you can execute `insert`, `remove`, `get` and other operations now. But remember that you will get a **`UserRef` instance** rather than the `User`, if you get record from tonbo. This is a struct that tonbo generates for you in the compile time. It may look like: + +## Using S3 backends + +Tonbo supports various storage backends, such as OPFS, S3, and maybe more in the future. Tonbo wiil use local storage by default. If you want to use S3 storage for specific level, you can use `DbOption::level_path(FsOptions::S3)` so that all files in that level will be pushed to S3. 
+
+```rust
+use tonbo::option::{ AwsCredential, FsOptions, Path };
+use tonbo::{executor::tokio::TokioExecutor, DbOption, DB};
+
+#[tokio::main]
+async fn main() {
+    let fs_option = FsOptions::S3 {
+        bucket: "wasm-data".to_string(),
+        credential: Some(AwsCredential {
+            key_id: "key_id".to_string(),
+            secret_key: "secret_key".to_string(),
+            token: None,
+        }),
+        endpoint: None,
+        sign_payload: None,
+        checksum: None,
+        region: Some("region".to_string()),
+    };
+
+    let options = DbOption::new(Path::from_filesystem_path("s3_path").unwrap(), &UserSchema)
+        .level_path(2, "l2", fs_option);
+
+    let db = DB::<User, TokioExecutor>::new(options, TokioExecutor::current(), UserSchema)
+        .await
+        .unwrap();
+}
+```
+
+If you want to persist metadata files to S3, you can configure `DbOption::base_fs` with `FsOptions::S3{...}`. This will enable Tonbo to upload metadata files and WAL files to the specified S3 bucket.
+
+> **Note**: This will not guarantee the latest metadata will be uploaded to S3. If you want to ensure the latest WAL is uploaded, you can use `DB::flush_wal`. If you want to ensure the latest metadata is uploaded, you can use `DB::flush` to trigger upload manually. If you want tonbo to trigger upload more frequently, you can adjust `DbOption::version_log_snapshot_threshold` to a smaller value. The default value is 200.
+
+See more details in [Configuration](./conf.md#manifest-configuration).
diff --git a/guide/src/usage/wasm.md
new file mode 100644
index 00000000..e69de29b