-
Notifications
You must be signed in to change notification settings - Fork 79
feat: Add Tonbo Cloud + gRPC + Flight #413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
@ethe Are you able to take a look at this today? I'll see if I can fix the rest of the code tomorrow. I would just like to see if this is what we would like so far for the postgresql read structure. |
Thank you @jonathanc-n , I'll take a look |
} | ||
|
||
#[tonic::async_trait] | ||
impl FlightService for TonboFlightSvc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Arrow Flight IPC make things simple at this stage, we can move forward on relying Arrow Flight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we introduce Flight SQL in Tonbo Cloud? Arrow Flight can not support clauses pushdown, which is really important to our core intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me take a look into fligtht sql
|
||
/// Every table has its own tonbo cloud instance. | ||
#[allow(dead_code)] | ||
pub struct AWSTonbo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to think about how to support multi Tonbo instances at first:
- Routing: Add db to Flight Ticket and gRPC requests; route via a DbRegistry<DbId, DB<DynRecord, TokioExecutor>>.
- Isolation: Separate S3/local paths per DB; enforce per‑DB semaphores and global caps to prevent noisy neighbors.
- Lifecycle: Lazy‑open on first access; provide idle eviction and clean shutdown hooks.
DB‑Instance LRU
- Policy: Track last_access, active_count, mem_footprint, reopen_cost; choose victims by LRU score adjusted by footprint and reopen cost; skip pinned DBs.
- Eviction: Quiesce → drain (active_count==0 or timeout) → flush WAL/minor‑flush → close handles → remove from registry.
- Soft‑evict tier: Optionally drop heavy resources (FDs/compactor) but keep lightweight metadata to speed reopen.
- Knobs: max_instances, idle_ttl, min_resident, eviction_batch_size, per‑DB concurrency limits.
Cold Start
- Checkpoints: Persist manifest_version, latest_ts, active_wal_ids; on reopen, replay only WALs newer than checkpoint.
- Bounded WAL: Small segments (32–64MB) + time‑based rotation (10–30s) to cap replay time.
- Fast serve: Load manifest/SSTs immediately for reads; parallel WAL replay; gate writes until replay completes (or route to a fresh WAL/memtable).
// Local Tonbo instance | ||
tonbo: DB<DynRecord, TokioExecutor>, | ||
// Remote file system | ||
s3_fs: Arc<dyn DynFs>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to determine which I/O offload to S3, it might not the duty of this crate, we have to make the strategy clear in Tonbo
s3_fs: Arc<dyn DynFs>, | ||
// Endpoint for read requests (scans) | ||
endpoint: String, | ||
buffered_data: Option<RecordBatch>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would be better to build a unified Parquet data block LRU cache under Tonbo instance to replace this?
|
||
impl AWSTonbo { | ||
#[cfg(test)] | ||
pub fn tonbo(&self) -> &DB<DynRecord, TokioExecutor> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TokioExecutor
is annoying, I think maybe we could use default type to omit it.
|
||
// todo: Separate the data | ||
// Returns number of rows and row size | ||
async fn parquet_metadata<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not the ideal implementation of estimator, I think we need to rethink the design, find a way that never really scan data to compute them.
Row size is derived from the first non-empty Arrow batch’s memory footprint, this is biased (projection-dependent, variable-length columns, one-row projection batches). Also It measures Arrow in-memory usage, not Parquet on-disk size; “ParquetMetadata” naming is misleading.
}; | ||
|
||
// Converts an iterator over `DynRecord` into record batch | ||
pub fn records_to_record_batch<I>(schema: &DynSchema, rows: I) -> RecordBatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Tonbo should support it natively.
@@ -89,9 +89,15 @@ pub trait Record: 'static + Sized + Decode + Debug + Send + Sync { | |||
self.as_record_ref().key() | |||
} | |||
|
|||
/// Returns owned value of record (self) | |||
fn as_owned_value(&self) -> Self; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this cloning semantic?
/// Returns a reference to the record. | ||
fn as_record_ref(&self) -> Self::Ref<'_>; | ||
|
||
/// Applies projection mask | ||
fn projection(&mut self, projection_mask: &ProjectionMask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unifying projection/limit/filter behind a small logical expression is a good idea instead.
Which issue does this PR close?
Use for discussion for how TonboCloud will work