Commit e89fd85

use dependency groups to separate local spark and databricks-connect (#23)

* use dependency groups to separate local spark and databricks-connect
* remove dbt from local spark
* update lock file
* properly handle dev dependencies
* ensure no-dev for local spark

1 parent 27a94be commit e89fd85

File tree

8 files changed: +1027 −866 lines changed

.github/workflows/ci.yml
Lines changed: 6 additions & 10 deletions

@@ -32,13 +32,13 @@ jobs:
         java-version: 17
         distribution: "zulu"
       - name: Install the project
-        run: uv sync --locked --all-extras
+        run: uv sync --locked --group dev-spark --no-dev
       - name: Run code checks
-        run: uv run ruff check
+        run: uv run --no-dev ruff check
       - name: Check code formatting
-        run: uv run ruff format --check
+        run: uv run --no-dev ruff format --check
       - name: Run tests
-        run: uv run pytest -v
+        run: uv run --no-dev pytest -v
 
   ci-databricks:
     needs: ci-local
@@ -64,12 +64,8 @@ jobs:
       with:
         version: 0.280.0
       - name: Install the project
-        run: uv sync --locked --all-extras
-      - name: Install Databricks Connect
-        run: |
-          uv pip uninstall pyspark
-          uv pip install databricks-connect==17.2.*
+        run: uv sync --locked
       - name: Check Databricks CLI
         run: databricks current-user me
       - name: Run tests
-        run: uv run --no-sync pytest -v
+        run: uv run pytest -v
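The split between the two jobs above rests on uv's dependency-group flags: the `dev` group is synced by default, and `--no-dev` skips it. The commands below are an illustrative summary of the workflow steps, not additional steps in the file:

```bash
# ci-local: install the local Spark stack, skipping the default `dev`
# group so databricks-connect never shadows the local pyspark install
uv sync --locked --group dev-spark --no-dev

# ci-databricks: a plain sync brings in the default `dev` group,
# which includes databricks-connect instead of pyspark
uv sync --locked

# subsequent ci-local commands also pass --no-dev so `uv run`
# does not re-sync the dev group before executing
uv run --no-dev pytest -v
```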

.github/workflows/deploy-test.yml
Lines changed: 1 addition & 1 deletion

@@ -34,7 +34,7 @@ jobs:
       with:
         version: 0.280.0
       - name: Install the project
-        run: uv sync --locked --all-extras
+        run: uv sync --locked
       - name: Check Databricks CLI
         run: databricks current-user me
       - name: Log Bundle Changes

.github/workflows/release.yml
Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ jobs:
       with:
         version: 0.280.0
       - name: Install the project
-        run: uv sync --locked --all-extras
+        run: uv sync --locked
       - name: Check Databricks CLI
         run: databricks current-user me
       - name: Log Bundle Changes

.github/workflows/validate-bundle.yml
Lines changed: 1 addition & 1 deletion

@@ -33,7 +33,7 @@ jobs:
       with:
         version: 0.280.0
       - name: Install the project
-        run: uv sync --locked --all-extras
+        run: uv sync --locked
       - name: Check Databricks CLI
         run: databricks current-user me
       - name: Validate Databricks Bundle

AGENTS.md
Lines changed: 5 additions & 13 deletions

@@ -10,16 +10,8 @@ This repo uses Databricks CLI to deploy a Databricks Asset Bundle.
 - `tests`: Unit tests for the Python project
 
 ## Setup commands
-- Install deps: `uv sync --locked --all-extras`
-- Run code checks: `uv run ruff check`
-- Check code formatting: `uv run ruff format --check`
-- Run tests: `uv run pytest -v`
-
-## Unit-Tests
-- On Linux systems, run: `uv run pytest -v`
-- On Windows systems, run:
-  ```powershell
-  uv pip uninstall pyspark
-  uv pip install databricks-connect==17.2.*
-  uv run --no-sync pytest -v
-  ```
+- Install deps: `uv sync --locked --group dev-spark --no-dev`
+- Run code checks: `uv run --no-dev ruff check`
+- Check code formatting: `uv run --no-dev ruff format --check`
+- Run tests: `uv run --no-dev pytest -v`
README.md
Lines changed: 6 additions & 22 deletions

@@ -49,12 +49,12 @@ A script exists to set up the Workspace (Free Edition) as described in the [Setu
 
 ### Setup environment
 
-Sync entire `uv` environment with all optional dependency groups:
+Sync `uv` environment with `dev` (includes databricks-connect) dependencies:
 ```bash
-uv sync --all-extras
+uv sync --locked
 ```
 
-> **Note:** we install Databricks Connect in a follow-up step
+> **Note:** For local Spark use `uv sync --group dev-spark --no-dev` instead.
 
 #### (Optional) Activate virtual environment
 
@@ -70,35 +70,19 @@ Windows:
 
 ### Databricks Connect
 
-Install `databricks-connect` in active environment. This requires authentication being set up via Databricks CLI.
-
-```bash
-uv pip uninstall pyspark
-uv pip install databricks-connect==17.2.*
-```
-
-**Option 2: Run with temporary dependency**
-```bash
-uv run --with databricks-connect==17.2.* pytest
-```
-
-> **Note:** For Databricks Runtime Serverless v4
-
+The `dev` dependency group includes `databricks-connect` for remote Spark execution. This requires authentication being set up via Databricks CLI.
 
 See https://docs.databricks.com/aws/en/dev-tools/vscode-ext/ for using Databricks Connect extension in VS Code.
 
 ### Unit-Tests
 
 ```bash
-# in case databricks-connect is installed, --no-sync prevents reinstalling pyspark
-uv run --no-sync pytest -v
+uv run pytest -v
 ```
 
-Based on whether Databricks Connect is enabled or not the Unit-Tests use a Databricks Cluster or start a local Spark session with Delta support.
+Based on whether Databricks Connect or local Spark is installed, the Unit-Tests use a Databricks Cluster or start a local Spark session with Delta support.
 * On Databricks the unit-tests currently assume the catalog `lake_dev` exists.
 
-> **Note:** For local Spark Java is required. On Windows Spark/Delta requires HADOOP libraries and generally does not run well, opt for `wsl` instead.
-
 ### Checks
 
 ```bash
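The README's backend dispatch (Databricks cluster vs. local Spark, depending on what is installed) can be probed with a small sketch. `choose_backend`, `_importable`, and the injectable `importable` parameter are illustrative names, not part of this repo:

```python
from importlib.util import find_spec


def _importable(name: str) -> bool:
    """True if `name` resolves to an installed module, without importing it."""
    try:
        return find_spec(name) is not None
    except ModuleNotFoundError:  # parent package missing entirely
        return False


def choose_backend(importable=_importable) -> str:
    """Pick the Spark backend from what is installed.

    databricks-connect ships its own pyspark distribution, so probing for
    the `databricks.connect` module distinguishes the two environments.
    """
    if importable("databricks.connect"):
        return "databricks"  # run tests against a Databricks cluster
    if importable("pyspark"):
        return "local"       # start a local Spark session with Delta support
    raise RuntimeError("neither databricks-connect nor pyspark is installed")
```

A pytest session fixture could call this to build either a remote `DatabricksSession` or a local `SparkSession`, which is the behavior the tests above rely on.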

pyproject.toml
Lines changed: 17 additions & 5 deletions

@@ -19,19 +19,31 @@ dependencies = [
     "databricks-sdk>=0.41, <0.68.0", # pinning from dbt-databricks
 ]
 
-[project.optional-dependencies]
+[dependency-groups]
+# Development with databricks-connect
+dev = [
+    { include-group = "databricks" },
+    { include-group = "dbt" },
+    { include-group = "tooling" },
+    "databricks-connect>=17.0,<17.4",
+]
+# Development with local Spark
+dev-spark = [
+    { include-group = "databricks" },
+    { include-group = "tooling" },
+    "delta-spark==4.0.*",
+    "pyspark==4.0.*",
+]
 # Databricks runtime dependencies (preinstalled on cluster)
 databricks = [
-    "delta-spark==4.0.*",
     "pydantic==2.10.6",
-    "pyspark==4.0.*",
 ]
 # dbt dependencies
 dbt = [
     "dbt-databricks==1.11.*",
 ]
-# Development & Testing
-dev = [
+# Development tooling (shared across dev environments)
+tooling = [
     "databricks-bundles==0.280.*", # For Python-based Workflows
     "mypy", # Type hints
     "pip", # Databricks extension needs it
