use dependency groups to separate local spark and databricks-connect (#23)

* use dependency groups to separate local spark and databricks-connect
* remove dbt from local spark
* update lock file
* properly handle dev dependencies
* ensure no-dev for local spark
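For context, a `pyproject.toml` split like the one these commits describe might look roughly as follows. This is a hypothetical sketch, not the repository's actual file: the group names `dev` and `dev-spark` are taken from the README commands in the diff, while the exact package lists and versions are illustrative assumptions.

```toml
# Hypothetical sketch of PEP 735 dependency groups (not the actual pyproject.toml).
# `uv sync --locked` installs the default `dev` group with databricks-connect;
# `uv sync --group dev-spark --no-dev` installs local Spark instead.
[dependency-groups]
dev = [
    "databricks-connect==17.2.*",  # remote Spark execution; version from the README
    "pytest",                      # illustrative
]
dev-spark = [
    "pyspark",     # local Spark session
    "delta-spark", # Delta support for local runs; illustrative
    "pytest",      # illustrative
]
```

Keeping `pyspark` and `databricks-connect` in mutually exclusive groups avoids the old workaround of uninstalling `pyspark` before installing `databricks-connect`, since the two packages conflict when installed together.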
README.md: 6 additions & 22 deletions
````diff
@@ -49,12 +49,12 @@ A script exists to set up the Workspace (Free Edition) as described in the [Setu
 ### Setup environment
 
-Sync entire `uv` environment with all optional dependency groups:
+Sync `uv` environment with `dev` (includes databricks-connect) dependencies:
 
 ```bash
-uv sync --all-extras
+uv sync --locked
 ```
 
-> **Note:** we install Databricks Connect in a follow-up step
+> **Note:** For local Spark use `uv sync --group dev-spark --no-dev` instead.
 
 #### (Optional) Activate virtual environment
@@ -70,35 +70,19 @@ Windows:
 ### Databricks Connect
 
-Install `databricks-connect` in active environment. This requires authentication being set up via Databricks CLI.
-
-```bash
-uv pip uninstall pyspark
-uv pip install databricks-connect==17.2.*
-```
-
-**Option 2: Run with temporary dependency**
-```bash
-uv run --with databricks-connect==17.2.* pytest
-```
-
-> **Note:** For Databricks Runtime Serverless v4
-
+The `dev` dependency group includes `databricks-connect` for remote Spark execution. This requires authentication being set up via Databricks CLI.
 
 See https://docs.databricks.com/aws/en/dev-tools/vscode-ext/ for using Databricks Connect extension in VS Code.
 
 ### Unit-Tests
 
 ```bash
-# in case databricks-connect is installed, --no-sync prevents reinstalling pyspark
-uv run --no-sync pytest -v
+uv run pytest -v
 ```
 
-Based on whether Databricks Connect is enabled or not the Unit-Tests use a Databricks Cluster or start a local Spark session with Delta support.
+Based on whether Databricks Connect or local Spark is installed, the Unit-Tests use a Databricks Cluster or start a local Spark session with Delta support.
 * On Databricks the unit-tests currently assume the catalog `lake_dev` exists.
 
-> **Note:** For local Spark Java is required. On Windows Spark/Delta requires HADOOP libraries and generally does not run well, opt for `wsl` instead.
````
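The "Databricks cluster or local Spark" switch in the unit tests could be detected roughly like this. A minimal sketch, not code from this repository: the function name `spark_backend` and the detection-by-installed-package approach are illustrative assumptions.

```python
# Hypothetical sketch: decide which Spark backend the current environment
# provides, based on which package the uv dependency group installed.
import importlib.util


def spark_backend() -> str:
    """Return 'databricks-connect', 'local-spark', or 'none'."""
    # Check the parent package first so find_spec on the dotted name
    # cannot raise ModuleNotFoundError for a missing parent.
    if (importlib.util.find_spec("databricks") is not None
            and importlib.util.find_spec("databricks.connect") is not None):
        return "databricks-connect"
    if importlib.util.find_spec("pyspark") is not None:
        return "local-spark"
    return "none"
```

A test fixture could then call `DatabricksSession.builder.getOrCreate()` for the remote case and a local Delta-enabled `SparkSession` otherwise, which matches the README's description of the two modes.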