-
Notifications
You must be signed in to change notification settings - Fork 84
Fix serverless compatibility by implementing conditional DataFrame caching #2218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix serverless compatibility by implementing conditional DataFrame caching #2218
Conversation
Use Unity Catalog volumes instead of .cache() for serverless. Auto-detects compute type. Fixes: [NOT_SUPPORTED_WITH_SERVERLESS]
|
✅ 131/131 passed, 7 flaky, 5 skipped, 9m29s total Flaky tests:
Running from acceptance #3702 |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2218 +/- ##
==========================================
- Coverage 67.58% 67.54% -0.04%
==========================================
Files 99 99
Lines 8825 8835 +10
Branches 915 916 +1
==========================================
+ Hits 5964 5968 +4
- Misses 2686 2692 +6
Partials 175 175 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…on-serverless-compute
Use specific exception types instead of broad Exception catch to satisfy CI linter rules. Add cluster ID check for improved detection.
Avoids CONFIG_NOT_AVAILABLE exceptions by fetching all configs at once. Passes all linter checks.
distinguishes serverless (CONFIG_NOT_AVAILABLE) from classic clusters.
…on-serverless-compute
…on-serverless-compute
…ss-compute' of github.com:databrickslabs/lakebridge into 1438-feature-remorph-reconcile-fails-to-run-on-serverless-compute
…on-serverless-compute
m-abulazm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
|
||
| @cached_property | ||
| def is_cache_supported(self) -> bool: | ||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking internally if there is a simpler and safer way to check for this. will get back soon.
…on-serverless-compute
…on-serverless-compute
sundarshankar89
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I don't want to hold this hostage but I would like followup documentation udpate.
That the uc volume should have the same access controls applied to as of data (uc schema where data resides) or should have locked down only to have read and rite access for the app or job user.
cc: @m-abulazm and @BesikiML
🐛 Problem
Reconciliation fails on Databricks serverless compute with:
[NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE is not supported on serverless compute
✅ Solution
Implemented capability-based caching detection that tests if caching actually works:
Changes Made
is_serverlessproperty - Tests caching by attempting to cache an empty DataFrameis_serverlessisFalseFixed issue: #1438
Tests