-
Notifications
You must be signed in to change notification settings - Fork 79
feat: Add centralized manifest storage using relational database - part1 #481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
scope_id INTEGER REFERENCES scopes(id) ON DELETE RESTRICT, | ||
PRIMARY KEY (version_snapshot_id, row_index, col_index), | ||
|
||
CONSTRAINT valid_row_index CHECK (row_index >= 0 AND row_index <= 7), -- harded coded MAX_LEVEL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the max level num is 6 I think, we hard code leveled compaction as 7 levels, which are level0 to level6, so I think row_index < 7
would be correct
r#" | ||
SELECT id, timestamp, log_length | ||
FROM version_snapshots | ||
ORDER BY created_at DESC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the ordering column is different with https://github.com/tonbo-io/tonbo/pull/481/files#diff-f2ac089b8c9efebe62753973e425342ec1abbb4a4dee365daa6babca6e8a35deR140, it would be clear if we always use the same column to get/set the current timestamp
// Non recovery mode will persist the log length | ||
if !is_recover { | ||
version_edits.push(VersionEdit::NewLogLength { len: edit_len }); | ||
eprint!("I'm here to push: {edit_len}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Error
here would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opps. It was for my testing but apparently I missed this. Will fix
sqlx does not support WASM target platform, which means we can not support running Tonbo cloud instance on FaaS based on Deno (it would be better to support this), are there any alternatives? |
IMO What do you think? cc @jonathanc-n @KafkaCat |
Yes, this also relates to GC in major compaction. |
I think remote metadata service implmentation would be better to be an external independent crate (maybe a workspace member in this repository), not inline in Tonbo core, it keeps Tonbo core lean, and lets the backend evolve independently. |
option: Arc<DbOption>, | ||
) -> Result<Self, MetadataServiceError> { | ||
let pool = PgPoolOptions::new() | ||
.idle_timeout(Some(Duration::from_secs(10 * 60))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to add a todo comment here: make it configurable would be better.
[const { Vec::new() }; MAX_LEVEL], | ||
option.clone(), | ||
clean_sender.clone(), | ||
Arc::new(AtomicU32::new(0)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arc::new(AtomicU32::new(0)), | |
timestamp, |
I can make metadata service into a separate workspace for this PR. But I strongly believe it should belongs to a separate repository to maintain boundary between managed service and core edge use case |
Yeah I think the existing interface is not abstract enough. The interface in my mind is: 1. All implementation should keep an in-memory representation of VersionRef and call update to flush the version edits into persistence. Rewrite is used to compact past set of version edits into a cleaner version (relational model need to revised to only record the version edits delta instead of keeping writing new version every time) |
Good question! Sqlx is a widely recognized crate for data access layer so I just use it. I haven't thought of WASM compatibility. Let's talk offline on usecase of supporting javascript runtime. I think there is some knowledge gap for me to fill.
|
Synced with ethe@ offline. Notes:
|
Summary
This PR will be the first part of a larger effort to add the centralized manifest metadata service for tonbo. More PR is coming along with way.
What changes are included in this PR?
bootstrap.sql
to model version snapshot. DDL script is idempotentcurrent()
andupdate()
forManifestStorage<R>
trait that read the latest version snapshot and write a new version snapshot by applying the version editsConcerns
VersionEdit::recover()
which is specific to reading from version log file. Current data model doesn't support this usecase. I will need to talk to someone to find out whether/how to support it.TODOs
rewrite()
anddestroy()
interfaceAre these changes tested?
Run
cargo test -- metadata::mds::tests --ignored require-docker
NOTE: The test is actually integration test. We need a better story for running integration test