python-v1.0.0: Zero to One
It only took us 5 years, but we made it! You can find the upgrade guide here.
Performance improvements
- refactor: async writer + multi-part by @ion-elgreco in #3255
- perf: use lazy sync reader by @ion-elgreco in #3338
New features
- feat: remove optimize operations when building without Apache Datafusion by @rtyler in #3290
- feat(api): add rustls and native-tls features by @zeevm in #3335
- feat!: update storage configuration system by @roeap in #3383
- feat: derive macro for config implementations by @roeap in #3389
- feat: upgrade to DataFusion 47.0.0 by @alamb in #3378
- feat: introduce VacuumMode::Full for cleaning up orphaned files by @rtyler in #3368
- feat: during LakeFS file operations, skip merge when 0 changes by @smeyerre in #3346
- feat: added a check for gc code to run by @JustinRush80 in #3419
- feat: spawn io with spawn service by @ion-elgreco in #3426
- feat: optimize datafusion predicate pushdown and partition pruning by @rtyler in #3436
- feat: expose kernel Engine on LogStore by @roeap in #3446
- refactor: remove pyarrow dependency by @ion-elgreco in #3459
- feat: write checkpoints with kernel by @roeap in #3466
- feat: add a table description and name to the Delta Table from Python by @fvaleye in #3464
- refactor!: remove and deprecate some python methods by @roeap in #3488
Bug Fixes
- fix: use field physical name when resolving partition columns by @zeevm in #3349
- fix(pandas): retain pyarrow decimal datatype in to_pandas() by adding types_mapper to prevent precision loss by @Abhishek1005 in #3296
- fix: prevent panics when peek_next_commit() encounters invalid data by @rtyler in #3308
- fix: serialize empty deletionVector in add actions as absent by @rtyler in #3309
- fix: stats column binary_column has unsupported type binary by @omkar-foss in #3146
- fix: check for all known valid delta files in is_deltatable by @umartin in #3318
- fix: block_in_place to allow nested tasks by @ion-elgreco in #3324
- fix: parse snapshot by @ion-elgreco in #3355
- fix: added restored metadata as action to the next committed version by @Nordalf in #3303
- fix: parse unconventional logs by @roeap in #3373
- fix: clippy warnings by @alamb in #3390
- fix: the default target size should be 100MB by @HiromuHota in #3404
- fix: if field contains space in constraint expression, checks will fail by @Nordalf in #3374
- fix: build Unity Catalog crate without DataFusion by @linhr in #3420
- fix: drop column update by @ion-elgreco in #3416
- fix: ignore temp log entries by @corwinjoy in #3423
- fix: use more accurate log path parsing by @roeap in #3461
- fix: correct spelling errors found by CI spell checker by @fvaleye in #3465
- fix: schema conversion, add conversion test cases by @ion-elgreco in #3468
- fix: set casting safe param to False by @ion-elgreco in #3481
- fix: ensure projecting only columns that exist in new files afte sche… by @alexwilcoxson-rel in #3487
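Several of the fixes above concern recognizing valid files in `_delta_log` and ignoring temporary entries. The sketch below shows one way such classification could look, based on the file naming described in the Delta protocol (20-digit zero-padded versions). It is not the delta-rs implementation: the function `classify_log_file`, the category names, and the temporary file names in the usage lines are illustrative assumptions.

```python
import re
from typing import Optional, Tuple

# File name patterns following the Delta protocol naming scheme.
# The category labels are hypothetical, chosen for this sketch.
_PATTERNS = [
    ("commit", re.compile(r"^(\d{20})\.json$")),
    ("checkpoint", re.compile(r"^(\d{20})\.checkpoint\.parquet$")),
    ("multipart_checkpoint",
     re.compile(r"^(\d{20})\.checkpoint\.\d{10}\.\d{10}\.parquet$")),
    ("crc", re.compile(r"^(\d{20})\.crc$")),
]


def classify_log_file(path: str) -> Optional[Tuple[str, int]]:
    """Return (kind, version) for a recognized log file, else None."""
    parts = path.replace("\\", "/").split("/")
    # Only files located directly inside _delta_log are considered, so
    # entries in nested temporary directories are ignored.
    if len(parts) < 2 or parts[-2] != "_delta_log":
        return None
    name = parts[-1]
    for kind, pattern in _PATTERNS:
        match = pattern.match(name)
        if match:
            return kind, int(match.group(1))
    return None


commit = classify_log_file("tbl/_delta_log/00000000000000000010.json")
checkpoint = classify_log_file(
    "tbl/_delta_log/00000000000000000010.checkpoint.parquet")
nested_tmp = classify_log_file("tbl/_delta_log/.tmp/00000000000000000010.json")
tmp_commit = classify_log_file("tbl/_delta_log/_commit_abc.json.tmp")
```

Anchoring each pattern with `^` and `$` is the design choice that matters here: a looser match such as "ends with `.json`" would also accept temporary or unrelated files.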
Other Changes
- refactor: drop pyarrow support, restructure python modules by @ion-elgreco in #3285
- chore: bump python version for release by @rtyler in #3291
- chore: use flags for apple arm64 by @ion-elgreco in #3213
- chore: upgrade the kernel version and bump our majorish versions too by @rtyler in #3289
- chore: upgrade to DataFusion 46.0.0 by @alamb in #3261
- refactor: add 'cloud' feature to 'core' to enable 'cloud' on 'object_store' only when needed by @zeevm in #3332
- docs: update dataFusion integration example by @riziles in #3343
- refactor(python): improve typing, linting by @ion-elgreco in #3344
- chore: remove pyarrow upper by @ion-elgreco in #3325
- chore: improve io error msg by @ion-elgreco in #3328
- docs: update merge-tables.md with "Optimizing Merge Performance" section by @ldacey in #3351
- docs: add example how to authenticate using Azure CLI for Azure ADSL integration by @DanielBertocci in #3357
- chore: remove cdf feature by @ion-elgreco in #3365
- fix: correct Python docs for incremental compaction on OPTIMIZE by @roykim98 in #3301
- chore: fix some minor build warnings by @rtyler in #3366
- refactor: move transaction module to kernel by @roeap in #3380
- chore: clippy by @roeap in #3379
- chore: move proofs into dedicated folder by @roeap in #3381
- refactor!: move storage module into logstore by @roeap in #3382
- chore: put a couple symbols behind the right feature gate by @rtyler in #3393
- chore: update delta_kernel to 0.10.0 by @zachschuermann in #3403
- refactor: make "cloud" feature in object_store optional by @zeevm in #3398
- chore: bump versions of rust crates for another release party by @rtyler in #3406
- chore: commit the contents of the 0.26.0 release by @rtyler in #3408
- chore: reduce scope of feature flags and compilation requirements for subcrates by @rtyler in #3409
- chore(deps): update sqlparser requirement from 0.53.0 to 0.56.0 by @dependabot in #3413
- chore(deps): update foyer requirement from 0.16.1 to 0.17.0 by @dependabot in #3412
- chore: bringing dat integration testing in ahead of kernel replay by @rtyler in #3411
- chore: missed a version bump for core by @rtyler in #3415
- chore: include license file in deltalake-derive crate by @ankane in #3417
- chore(deps): bump foyer to v0.17.2 to prevent from wrong result by @MrCroxx in #3428
- chore: bump crate versions which are due for release by @rtyler in #3430
- chore: rely on the testing during coverage generation to speed up tests by @rtyler in #3431
- chore: make codecov more vigorously enforced to help ensure quality by @rtyler in #3434
- chore: prepare py-1.0 release by @ion-elgreco in #3435
- chore: experiment with using sccache in GitHub Actions by @rtyler in #3437
- chore: remove unused code and deps by @roeap in #3441
- chore: minor table module refactors by @rtyler in #3442
- docs: add 1.0.0 migration guide by @ion-elgreco in #3443
- refactor: more specific factory parameter names by @roeap in #3445
- refactor: use LogStore in Snapshot / LogSegment APIs by @roeap in #3452
- test: avoid circular dependency with core/test crates by @roeap in #3450
- chore: ensuring default builds work without datafusion by @rtyler in #3453
- ci: add spellchecker to pr tests by @roeap in #3457
- chore: mark more tests which require datafusion by @rtyler in #3458
- refactor: use full paths in log processing by @roeap in #3456
- chore: set correct markers by @ion-elgreco in #3469
- chore: update kernel by @roeap in #3462
- chore: remove unused time_utils by @roeap in #3470
- chore: more typos by @roeap in #3471
- refactor: remove protocol error by @roeap in #3473
- chore: remove unused stats_parsed field by @roeap in #3475
- chore: update migration docs by @ion-elgreco in #3479
- chore: update kernel to 0.11 by @roeap in #3480
- docs: fix bullet list formatting in dagster docs by @avriiil in #3483
- refactor!: get transaction versions for specific applications by @roeap in #3484
- test: improve storage config testing by @roeap in #3485
- chore: exclude Invariants from the default writer v2 feature set by @rtyler in #3486
New Contributors
- @Abhishek1005 made their first contribution in #3296
- @zeevm made their first contribution in #3332
- @riziles made their first contribution in #3343
- @DanielBertocci made their first contribution in #3357
- @roykim98 made their first contribution in #3301
- @zachschuermann made their first contribution in #3403
- @HiromuHota made their first contribution in #3404
- @linhr made their first contribution in #3420
- @MrCroxx made their first contribution in #3428
- @smeyerre made their first contribution in #3346
- @corwinjoy made their first contribution in #3423
Full Changelog: python-v0.25.5...python-v1.0.0