[ISSUE] Issue with databricks_spark_version resource not returning the latest_lts #5218

@MBeuttler

Description

Configuration

Example one:

terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

data "databricks_spark_version" "latest_lts" {
  latest            = true
  long_term_support = true
  ml                = false
  genomics          = false
  gpu               = false
}

data "databricks_cluster_policy" "this" {
  count = var.policy_name != null ? 1 : 0
  name  = var.policy_name
}

resource "databricks_cluster" "this" {
  cluster_name            = var.name
  policy_id               = var.policy_name != null ? data.databricks_cluster_policy.this[0].id : null
  data_security_mode      = var.data_security_mode
  driver_node_type_id     = var.driver_node_type_id
  node_type_id            = var.node_type_id
  spark_version           = data.databricks_spark_version.latest_lts.id
  autotermination_minutes = var.autotermination_minutes
  is_pinned               = var.is_pinned
  runtime_engine          = var.runtime_engine
  custom_tags             = var.custom_tags
  num_workers             = var.num_workers != null ? var.num_workers : null
  dynamic "autoscale" {
    for_each = var.autoscale_min_workers != null && var.autoscale_max_workers != null ? [1] : []
    content {
      min_workers = var.autoscale_min_workers
      max_workers = var.autoscale_max_workers
    }
  }
}

Example two (manually setting the Scala version to 2.13):

terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

data "databricks_spark_version" "latest_lts" {
  latest            = true
  long_term_support = true
  ml                = false
  genomics          = false
  gpu               = false
  scala             = "2.13"
}

data "databricks_cluster_policy" "this" {
  count = var.policy_name != null ? 1 : 0
  name  = var.policy_name
}

resource "databricks_cluster" "this" {
  cluster_name            = var.name
  policy_id               = var.policy_name != null ? data.databricks_cluster_policy.this[0].id : null
  data_security_mode      = var.data_security_mode
  driver_node_type_id     = var.driver_node_type_id
  node_type_id            = var.node_type_id
  spark_version           = data.databricks_spark_version.latest_lts.id
  autotermination_minutes = var.autotermination_minutes
  is_pinned               = var.is_pinned
  runtime_engine          = var.runtime_engine
  custom_tags             = var.custom_tags
  num_workers             = var.num_workers != null ? var.num_workers : null
  dynamic "autoscale" {
    for_each = var.autoscale_min_workers != null && var.autoscale_max_workers != null ? [1] : []
    content {
      min_workers = var.autoscale_min_workers
      max_workers = var.autoscale_max_workers
    }
  }
}

Expected Behavior

The "databricks_spark_version" data source with latest = true and long_term_support = true should by default return the latest LTS runtime regardless of its Scala version, or offer an option to filter for a specific Scala version when required.
The latest LTS resolved by the data source should also be in sync with the auto:latest-lts value used by Databricks cluster policies, so that deployments do not fail when both are defined.

Actual Behavior

By default, "databricks_spark_version" filters for Scala 2.12, but the 17.x LTS runtimes only support Scala 2.13, so the data source returns 16.4 as the latest LTS.

When the Scala 2.13 filter is set manually, the apply fails the cluster policy check, because the policy's auto:latest-lts allow-list does not include the newest LTS either.
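
Until the data source and the policy agree on the newest LTS, one possible interim workaround (not part of the original report; the variable name here is hypothetical) is to allow pinning the runtime explicitly to a version the policy's allow-list already accepts, falling back to the data source otherwise:

```hcl
# Hypothetical override variable, sketched as a workaround only.
variable "spark_version_override" {
  type        = string
  default     = null
  description = "Optional explicit runtime version, e.g. one the cluster policy already allows."
}

resource "databricks_cluster" "this" {
  # ... other arguments as in the examples above ...

  # coalesce() returns its first non-null argument, so setting the
  # override variable bypasses the data source entirely.
  spark_version = coalesce(
    var.spark_version_override,
    data.databricks_spark_version.latest_lts.id,
  )
}
```

This keeps the data source as the default path while giving operators an escape hatch until the policy and the data source resolve the same LTS.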

Steps to Reproduce

  1. terraform plan -> no error, since the policy is only validated during apply
  2. terraform apply -> fails with the error in the Debug Output, since auto:latest-lts does not include 17.3 LTS

Terraform and provider versions

1.97

Is it a regression?

N/A

Debug Output

module.cluster_compute["Shared Central Cluster"].databricks_cluster.this: Modifying... [id=0630-140135-jjrzdty3]
╷
│ Error: cannot update cluster: Validation failed for spark_version, needs to be one of (17.2.x-scala2.13, 16.4.x-scala2.12, 17.2.x-cpu-ml-scala2.13, 16.4.x-cpu-ml-scala2.12, 16.4.x-scala2.12, 15.4.x-scala2.12, 16.4.x-cpu-ml-scala2.12, 15.4.x-cpu-ml-scala2.12) (is an element in "List(17.3.x-scala2.13)")
│ 
│   with module.cluster_compute["Shared Central Cluster"].databricks_cluster.this,
│   on ../modules/cluster_compute/main.tf line 38, in resource "databricks_cluster" "this":
│   38: resource "databricks_cluster" "this" {
│ 

Important Factoids

N/A

Would you like to implement a fix?

No
