Description
Feature description
The goal is to make it possible to paginate queries with a defined page size, using logic like this:
- Get query and make sure there's a key to sort by
- Modify the query to set the page size by adding a LIMIT
- Execute the query and retrieve results
- If the number of rows returned equals the page size, increase the OFFSET by the page size and repeat the previous step; stop once fewer rows than the page size are returned
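The steps above can be sketched roughly as follows. This is only an illustration using the standard-library `sqlite3` module, not dlt's API: the helper name `paginate_query` and its parameters are hypothetical, and it assumes the query can be wrapped in a subquery and that the sort key is a trusted column name. The loop keeps going while a full page comes back and stops at the first short page.

```python
import sqlite3
from typing import Iterator, List, Tuple


def paginate_query(
    conn: sqlite3.Connection,
    query: str,
    order_by: str,
    page_size: int,
) -> Iterator[List[Tuple]]:
    """Yield pages of rows from `query` using LIMIT/OFFSET (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if not order_by:
        # Without a sort key the page boundaries are not deterministic.
        raise ValueError("a key to sort by is required")

    # Wrap the original query so ORDER BY / LIMIT / OFFSET can be appended.
    # `order_by` is interpolated directly, so it must be a trusted identifier.
    paged = f"SELECT * FROM ({query}) AS q ORDER BY {order_by} LIMIT ? OFFSET ?"

    offset = 0
    while True:
        rows = conn.execute(paged, (page_size, offset)).fetchall()
        if rows:
            yield rows
        if len(rows) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += page_size
```

Each yielded page could then be passed on to the pipeline as it arrives, so no single fetch exceeds the page size.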
This could also be extended to parallel execution by first running a count and then dividing the pages among the workers. As an improvement, the pages could overlap slightly to avoid skipping any rows by mistake; the results can then flow through the pipeline.
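A minimal sketch of how the load might be divided, assuming the total row count is already known from a `COUNT(*)` query. The function name `plan_page_ranges` and the `overlap` parameter are hypothetical; with a non-zero overlap the same row can appear in two ranges, so this sketch assumes duplicates would be removed downstream (for example by the sort key).

```python
from typing import List, Tuple


def plan_page_ranges(
    total_rows: int, workers: int, overlap: int = 0
) -> List[Tuple[int, int]]:
    """Split `total_rows` into (offset, limit) pairs, one per worker (hypothetical helper)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    if overlap < 0:
        raise ValueError("overlap must not be negative")
    if total_rows <= 0:
        return []

    # Ceiling division so every row falls into some worker's range.
    chunk = -(-total_rows // workers)

    ranges = []
    for start in range(0, total_rows, chunk):
        # Extend each range backwards by `overlap` rows (never below zero).
        begin = max(0, start - overlap)
        end = min(total_rows, start + chunk)
        ranges.append((begin, end - begin))
    return ranges


# Example: 10 rows across 3 workers, each range overlapping the previous by 1 row.
for offset, limit in plan_page_ranges(10, 3, overlap=1):
    print(f"LIMIT {limit} OFFSET {offset}")
```

Each worker would run the same sorted query with its own LIMIT and OFFSET. The ranges are only stable if the source table does not change between the count and the page fetches.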
Are you a dlt user?
Yes, I'm already a dlt user.
Use case
When using sql_table and sql_database with databases that limit the number of rows retrieved per transaction, dlt does not currently offer a way to page through the query with LIMIT and OFFSET. The limit could be changed on the database side, but technical restrictions in the production environment may make that infeasible.
Proposed solution
Add the possibility to paginate SQL query imports to bypass these restrictions without requiring changes on the source database.
Related issues
No response