Initial implementation of "direct" deployment backend #2926

Open · wants to merge 40 commits into base: main

Commits (40)
d378b22
WIP - deploying without terraform
denik Apr 4, 2025
de01085
shorten env var names to avoid "The directory name is invalid" on wi…
denik May 26, 2025
c38b9fa
disable acc/cloud tests that don't work yet
denik May 26, 2025
e26060e
clean up
denik May 26, 2025
631fd4b
WIP: destroy mutator + test (pipelines only); Graph is now generic
denik May 26, 2025
838f58a
destroy for jobs + test
denik May 26, 2025
4e15bca
update github action to use ENVFILTER
denik May 27, 2025
41bf928
lint fix
denik May 27, 2025
a2d698d
do not create empty resources.json
denik May 27, 2025
da61418
clean up env var from tests
denik May 27, 2025
e15b814
enable test
denik May 27, 2025
97dfb52
disable dashboard tests
denik May 27, 2025
2858259
fix script
denik May 28, 2025
2e67b61
enable run-local test
denik May 28, 2025
a5f2472
disable jobs/check-metadata, mlops-stacks
denik May 30, 2025
acfa3c8
disable 2 more integration tests
denik May 30, 2025
9b3f2b0
disable volumes
denik May 30, 2025
98c9e94
comment why test is disabled
denik May 30, 2025
93e6783
Add resource_types.go + test
denik May 30, 2025
7df6250
remove switch from GetResourceConfig
denik May 30, 2025
9ac99ed
rename terranova_{state,resources} to tn{state,resources}
denik May 30, 2025
7046c3b
clean up unused functions from testserver
denik May 30, 2025
e8178cb
add map with New functions that records resource types
denik May 30, 2025
95a84dc
update comments
denik May 30, 2025
38f8108
clean up
denik May 30, 2025
83d774d
clean up
denik May 30, 2025
af7d514
lint fix
denik May 30, 2025
5b1025c
resourceConstructors -> supportedResources
denik May 30, 2025
7225442
extract IsDirectDeployment function
denik Jun 2, 2025
ee74ff2
dag: maintain insertion order, do not sort and do not require Less()
denik Jun 2, 2025
b36d55e
simplify cycle message
denik Jun 2, 2025
f6f303a
rename libs/dag to libs/dagrun
denik Jun 2, 2025
13b449d
clean up; do not ignore errors
denik Jun 2, 2025
e3335f0
post-rebase fix of test.toml + update outputs
denik Jun 3, 2025
9cd15e0
post rebase - restore acceptance/bundle/run/basic/test.toml
denik Jun 3, 2025
d59d9cc
clean up warning
denik Jun 5, 2025
618b8d6
clean up; add comments; add error wrapping; use atomic
denik Jun 6, 2025
02b5abb
add comments
denik Jun 6, 2025
0d222dc
rm dec.DisallowUnknownFields
denik Jun 6, 2025
46241f4
state: use GetResourceEntry to replace both GetResourceID and GetSave…
denik Jun 6, 2025
13 changes: 13 additions & 0 deletions .github/workflows/push.yml
@@ -42,15 +42,24 @@ jobs:
- macos-latest
- ubuntu-latest
- windows-latest
deployment:
- "terraform"
- "direct"

steps:
- name: Checkout repository and submodules
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

- name: Create deployment-specific cache identifier
run: echo "${{ matrix.deployment }}" > deployment-type.txt

- name: Setup Go
uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0
with:
go-version-file: go.mod
cache-dependency-path: |
go.sum
deployment-type.txt

- name: Setup Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
@@ -72,11 +81,15 @@ jobs:
# and would like to run the tests as fast as possible. We run it on schedule as well, because that is what
# populates the cache and cache may include test results.
if: ${{ github.event_name == 'pull_request' || github.event_name == 'schedule' }}
env:
ENVFILTER: DATABRICKS_CLI_DEPLOYMENT=${{ matrix.deployment }}
run: make test

- name: Run tests with coverage
# Still run 'make cover' on push to main and merge checks to make sure it does not get broken.
if: ${{ github.event_name != 'pull_request' && github.event_name != 'schedule' }}
env:
ENVFILTER: DATABRICKS_CLI_DEPLOYMENT=${{ matrix.deployment }}
run: make cover

- name: Analyze slow tests
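
Note: the matrix above runs every CI job twice, once per deployment backend, and ENVFILTER narrows which acceptance-test environment combinations actually execute. A minimal sketch of the assumed semantics — the parsing and expansion below are illustrative, not the CLI's actual test runner:

import itertools
import os

def expand_env_matrix(env_matrix, envfilter=None):
    # env_matrix comes from a test.toml, e.g.
    #   {"DATABRICKS_CLI_DEPLOYMENT": ["terraform", "direct"]}
    # envfilter is the CI-provided "KEY=VALUE" pair, e.g.
    #   "DATABRICKS_CLI_DEPLOYMENT=direct"
    keys = list(env_matrix)
    combos = [dict(zip(keys, vals)) for vals in itertools.product(*env_matrix.values())]
    if envfilter:
        key, _, value = envfilter.partition("=")
        # keep only combinations matching the CI-selected backend; a test.toml
        # that pins the matrix to ["terraform"] yields no combinations in the
        # "direct" job and is effectively skipped there
        combos = [c for c in combos if c.get(key, value) == value]
    return combos

print(expand_env_matrix({"DATABRICKS_CLI_DEPLOYMENT": ["terraform"]},
                        os.environ.get("ENVFILTER")))
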
17 changes: 16 additions & 1 deletion acceptance/bin/read_id.py
@@ -29,4 +29,19 @@ def print_resource_terraform(section, name):
return


print_resource_terraform(*sys.argv[1:])
def print_resource_terranova(section, name):
filename = ".databricks/bundle/default/resources.json"
raw = open(filename).read()
data = json.loads(raw)
resources = data["resources"].get(section, {})
result = resources.get(name)
if result is None:
print(f"Resource {section=} {name=} not found. Available: {raw}")
return
print(result.get("__id__"))


if os.environ.get("DATABRICKS_CLI_DEPLOYMENT") == "direct":
print_resource_terranova(*sys.argv[1:])
else:
print_resource_terraform(*sys.argv[1:])
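
Note: print_resource_terranova assumes only this much structure in .databricks/bundle/default/resources.json. The values below are hypothetical, but the "resources" → section → name → "__id__" nesting is exactly what the script reads:

# hypothetical resources.json as written by the direct deployment backend
example = {
    "resources": {
        "jobs": {
            "foo": {
                "__id__": "1035980118498632",  # server-assigned resource ID
                "state": {"name": "foo"},      # last-applied config, used by read_state.py
            }
        }
    }
}
# equivalent of print_resource_terranova("jobs", "foo"):
print(example["resources"]["jobs"]["foo"]["__id__"])
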
26 changes: 21 additions & 5 deletions acceptance/bin/read_state.py
@@ -13,17 +13,15 @@
def print_resource_terraform(section, name, *attrs):
resource_type = "databricks_" + section[:-1]
filename = ".databricks/bundle/default/terraform/terraform.tfstate"
data = json.load(open(filename))
available = []
raw = open(filename).read()
data = json.loads(raw)
found = 0
for r in data["resources"]:
r_type = r["type"]
r_name = r["name"]
if r_type != resource_type:
available.append((r_type, r_name))
continue
if r_name != name:
available.append((r_type, r_name))
continue
for inst in r["instances"]:
attribute_values = inst.get("attributes")
@@ -35,4 +33,22 @@ def print_resource_terraform(section, name, *attrs):
print(f"State not found for {section}.{name}")


print_resource_terraform(*sys.argv[1:])
def print_resource_terranova(section, name, *attrs):
filename = ".databricks/bundle/default/resources.json"
raw = open(filename).read()
data = json.loads(raw)
resources = data["resources"].get(section, {})
result = resources.get(name)
if result is None:
print(f"State not found for {section}.{name}")
return
state = result["state"]
state.setdefault("id", result.get("__id__"))
values = [f"{x}={state.get(x)!r}" for x in attrs]
print(section, name, " ".join(values))


if os.environ.get("DATABRICKS_CLI_DEPLOYMENT") == "direct":
print_resource_terranova(*sys.argv[1:])
else:
print_resource_terraform(*sys.argv[1:])
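
Note: both helpers print one line in the same format, so test scripts and golden outputs stay backend-agnostic. A sketch of the formatting path in print_resource_terranova, reusing the hypothetical entry above:

state = {"name": "foo"}
state.setdefault("id", "1035980118498632")  # fall back to __id__, as the script does
attrs = ["id", "name"]
print("jobs", "foo", " ".join(f"{x}={state.get(x)!r}" for x in attrs))
# prints: jobs foo id='1035980118498632' name='foo'
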
3 changes: 3 additions & 0 deletions acceptance/bundle/artifacts/whl_dynamic/test.toml
@@ -1,3 +1,6 @@
# Terraform sorts tasks
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]

[[Repls]]
Old = '\\\\'
New = '/'
@@ -1,4 +1,5 @@
BundleConfig.default_name = ""
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # need to sort tasks by key

[[Repls]]
Old = '\\'
3 changes: 3 additions & 0 deletions acceptance/bundle/debug/test.toml
@@ -1,3 +1,6 @@
# Debug output is naturally different. TODO: split debug tests in two: terraform and terranova
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]

[[Repls]]
# The keys are unsorted and also vary per OS
Old = 'Environment variables for Terraform: ([A-Z_ ,]+) '
1 change: 1 addition & 0 deletions acceptance/bundle/deploy/dashboard/test.toml
@@ -0,0 +1 @@
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # dashboard not supported yet
3 changes: 3 additions & 0 deletions acceptance/bundle/deploy/fail-on-active-runs/test.toml
@@ -1,5 +1,8 @@
RecordRequests = true

# --fail-on-active-runs not implemented yet
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]

[[Server]]
Pattern = "GET /api/2.2/jobs/runs/list"
Response.Body = '''
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/jobs/check-metadata/test.toml
@@ -1,6 +1,8 @@
Local = false
Cloud = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # require "bundle summary"

Ignore = [
"databricks.yml",
"a/b/resources.yml",
@@ -1,6 +1,8 @@
Local = true
Cloud = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # needs investigation Error: deploying jobs.foo: Method=Jobs.Create *retries.Err *apierr.APIError StatusCode=400 ErrorCode="INVALID_PARAMETER_VALUE" Message="Missing required field: settings.tasks.task_key."

Ignore = [
"databricks.yml",
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/mlops-stacks/test.toml
@@ -3,6 +3,8 @@ Local=false

Badness = "the newly initialized bundle from the 'mlops-stacks' template contains two validation warnings in the configuration"

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # requires "bundle summary"

Ignore = [
"config.json"
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/pipeline/auto-approve/test.toml
@@ -1,6 +1,8 @@
Local = true
Cloud = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # requires "bundle summary"

Ignore = [
"databricks.yml"
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/pipeline/recreate/test.toml
@@ -2,6 +2,8 @@ Local = true
Cloud = true
RequiresUnityCatalog = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]

Ignore = [
"databricks.yml"
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/schema/auto-approve/test.toml
@@ -2,6 +2,8 @@ Local = true
Cloud = true
RequiresUnityCatalog = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # requires "bundle summary"

Ignore = [
"databricks.yml",
"test-file-*.txt",
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/secret-scope/test.toml
@@ -1,6 +1,8 @@
Cloud = true
Local = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]

Ignore = [
"databricks.yml",
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deploy/volume/recreate/test.toml
@@ -2,6 +2,8 @@ Local = false
Cloud = true
RequiresUnityCatalog = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # volumes are not supported

Ignore = [
"databricks.yml",
]
2 changes: 2 additions & 0 deletions acceptance/bundle/deployment/bind/test.toml
@@ -1,2 +1,4 @@
BundleConfig.default_name.bundle.name = "test-bundle-$UNIQUE_NAME"
BundleConfigTarget = "databricks.yml.tmpl"

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]
2 changes: 2 additions & 0 deletions acceptance/bundle/destroy/jobs-and-pipeline/test.toml
@@ -1,6 +1,8 @@
Local = false
Cloud = true

EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"] # requires "bundle summary"

Ignore = [
"databricks.yml",
"resources.yml",
2 changes: 2 additions & 0 deletions acceptance/bundle/python/restricted-execution/test.toml
@@ -0,0 +1,2 @@
# "bundle summary" is not implemented
EnvMatrix.DATABRICKS_CLI_DEPLOYMENT = ["terraform"]
15 changes: 7 additions & 8 deletions acceptance/bundle/resources/apps/output.txt
@@ -7,12 +7,12 @@ Deployment complete!

>>> print_requests
{
"method": "POST",
"path": "/api/2.0/apps",
"body": {
"description": "my_app_description",
"name": "myapp"
}
},
"method": "POST",
"path": "/api/2.0/apps"
}
apps myapp name='myapp' description='my_app_description'

@@ -27,12 +27,11 @@ Deployment complete!

>>> print_requests
{
"method": "PATCH",
"path": "/api/2.0/apps/myapp",
"body": {
"description": "MY_APP_DESCRIPTION",
"name": "myapp",
"url": "myapp-123.cloud.databricksapps.com"
}
"name": "myapp"
},
"method": "PATCH",
"path": "/api/2.0/apps/myapp"
}
apps myapp name='myapp' description='MY_APP_DESCRIPTION'
3 changes: 2 additions & 1 deletion acceptance/bundle/resources/apps/script
@@ -1,5 +1,6 @@
print_requests() {
jq 'select(.method != "GET" and (.path | contains("/apps")))' < out.requests.txt
# url is an output-only field that terraform adds but the backend ignores
jq --sort-keys 'select(.method != "GET" and (.path | contains("/apps"))) | (.body.url = null | del(.body.url))' < out.requests.txt
rm out.requests.txt
}

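
Note: for readers less familiar with jq, a rough Python equivalent of the filter above — it keeps non-GET requests to /apps paths, drops the terraform-only url field from the body, and sorts keys so both backends log identically (an illustration, not part of the PR):

import json
import sys

for line in sys.stdin:  # out.requests.txt holds one JSON object per line
    req = json.loads(line)
    if req.get("method") == "GET" or "/apps" not in req.get("path", ""):
        continue
    body = req.get("body")
    if isinstance(body, dict):
        body.pop("url", None)  # output-only field that terraform adds
    print(json.dumps(req, indent=2, sort_keys=True))
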
60 changes: 60 additions & 0 deletions acceptance/bundle/resources/jobs/output.txt
@@ -88,3 +88,63 @@ Deployment complete!
"path": "/api/2.2/jobs/reset"
}
jobs foo id='[NUMID]' name='foo'

=== Fetch job ID and verify remote state
>>> [CLI] jobs get [NUMID]
{
"job_id":[NUMID],
"settings": {
"deployment": {
"kind":"BUNDLE",
"metadata_file_path":"/Workspace/Users/[USERNAME]/.bundle/test-bundle/default/state/metadata.json"
},
"edit_mode":"UI_LOCKED",
"format":"MULTI_TASK",
"job_clusters": [
{
"job_cluster_key":"key",
"new_cluster": {
"num_workers":0,
"spark_version":"13.3.x-scala2.12"
}
}
],
"max_concurrent_runs":1,
"name":"foo",
"queue": {
"enabled":true
},
"trigger": {
"pause_status":"UNPAUSED",
"periodic": {
"interval":1,
"unit":"HOURS"
}
}
}
}

=== Destroy the job and verify that it's removed from the state and from remote
>>> [CLI] bundle destroy --auto-approve
The following resources will be deleted:
delete job foo

All files and directories at the following location will be deleted: /Workspace/Users/[USERNAME]/.bundle/test-bundle/default

Deleting files...
Destroy complete!

>>> print_requests
{
"body": {
"job_id": [NUMID]
},
"method": "POST",
"path": "/api/2.2/jobs/delete"
}
State not found for jobs.foo

>>> musterr [CLI] jobs get [NUMID]
Error: Not Found

Exit code (musterr): 1
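
Note: per the request log above, a direct-backend destroy reduces to one API call per resource plus dropping the entry from resources.json. A standalone sketch of the recorded call, using only the endpoint and payload visible in the log (the CLI itself goes through its SDK client rather than raw HTTP):

import json
import urllib.request

def delete_job(host: str, token: str, job_id: int) -> int:
    # the request recorded above: POST /api/2.2/jobs/delete {"job_id": ...}
    req = urllib.request.Request(
        f"{host}/api/2.2/jobs/delete",
        data=json.dumps({"job_id": job_id}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
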
14 changes: 14 additions & 0 deletions acceptance/bundle/resources/jobs/script
@@ -13,3 +13,17 @@ title "Update trigger.periodic.unit and re-deploy"
trace update_file.py databricks.yml DAYS HOURS
trace $CLI bundle deploy
trace print_requests

title "Fetch job ID and verify remote state"

ppid=`read_id.py jobs foo`

trace $CLI jobs get $ppid
rm out.requests.txt

title "Destroy the job and verify that it's removed from the state and from remote"
trace $CLI bundle destroy --auto-approve
trace print_requests

trace musterr $CLI jobs get $ppid
rm out.requests.txt
26 changes: 1 addition & 25 deletions acceptance/bundle/resources/pipelines/output.txt
@@ -36,31 +36,7 @@ Deploying resources...
Updating deployment state...
Deployment complete!

>>> print_requests
{
"body": {
"channel": "CURRENT",
"deployment": {
"kind": "BUNDLE",
"metadata_file_path": "/Workspace/Users/[USERNAME]/.bundle/acc-[UNIQUE_NAME]/default/state/metadata.json"
},
"edition": "ADVANCED",
"id": "[UUID]",
"libraries": [
{
"file": {
"path": "/Workspace/Users/[USERNAME]/.bundle/acc-[UNIQUE_NAME]/default/files/bar.py"
}
}
],
"name": "test-pipeline-[UNIQUE_NAME]",
"storage": "dbfs:/pipelines/[UUID]"
},
"method": "PUT",
"path": "/api/2.0/pipelines/[UUID]"
}
pipelines my id='[UUID]' name='test-pipeline-[UNIQUE_NAME]'

=== Fetch pipeline ID and verify remote state
>>> [CLI] pipelines get [UUID]
{
"creator_user_name":"[USERNAME]",