Commit 9fbbb7a

MyroslavLevchyk authored and committed
feat: databricks workspace module
1 parent 33ed22f commit 9fbbb7a

9 files changed: +482 -2 lines changed

README.md

Lines changed: 74 additions & 2 deletions
@@ -1,10 +1,82 @@
-# Azure <> Terraform module
-Terraform module for creation Azure <>
+# AWS Databricks Workspace Terraform module
+Terraform module for creating an AWS Databricks Workspace
 
 ## Usage
 
 <!-- BEGIN_TF_DOCS -->
+## Requirements
 
+| Name | Version |
+|------|---------|
+| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.8 |
+| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 5.0 |
+| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >= 1.55 |
+| <a name="requirement_time"></a> [time](#requirement\_time) | ~> 0.11 |
+
+## Providers
+
+| Name | Version |
+|------|---------|
+| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 5.0 |
+| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | >= 1.55 |
+| <a name="provider_time"></a> [time](#provider\_time) | ~> 0.11 |
+
+## Modules
+
+| Name | Source | Version |
+|------|--------|---------|
+| <a name="module_iam_cross_account_workspace_policy"></a> [iam\_cross\_account\_workspace\_policy](#module\_iam\_cross\_account\_workspace\_policy) | terraform-aws-modules/iam/aws//modules/iam-policy | 5.41.0 |
+| <a name="module_iam_cross_account_workspace_role"></a> [iam\_cross\_account\_workspace\_role](#module\_iam\_cross\_account\_workspace\_role) | terraform-aws-modules/iam/aws//modules/iam-assumable-role | 5.41.0 |
+| <a name="module_privatelink_vpce"></a> [privatelink\_vpce](#module\_privatelink\_vpce) | ./modules/privatelink/ | n/a |
+| <a name="module_storage_configuration_dbfs_bucket"></a> [storage\_configuration\_dbfs\_bucket](#module\_storage\_configuration\_dbfs\_bucket) | terraform-aws-modules/s3-bucket/aws | 4.1.2 |
+
+## Resources
+
+| Name | Type |
+|------|------|
+| [aws_s3_bucket_policy.databricks_aws_bucket_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_policy) | resource |
+| [databricks_mws_credentials.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_credentials) | resource |
+| [databricks_mws_networks.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource |
+| [databricks_mws_private_access_settings.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_private_access_settings) | resource |
+| [databricks_mws_storage_configurations.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_storage_configurations) | resource |
+| [databricks_mws_workspaces.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource |
+| [time_sleep.wait_30_seconds](https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep) | resource |
+| [databricks_aws_assume_role_policy.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/data-sources/aws_assume_role_policy) | data source |
+| [databricks_aws_bucket_policy.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/data-sources/aws_bucket_policy) | data source |
+| [databricks_aws_crossaccount_policy.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/data-sources/aws_crossaccount_policy) | data source |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| <a name="input_account_id"></a> [account\_id](#input\_account\_id) | Databricks Account ID | `string` | n/a | yes |
+| <a name="input_iam_cross_account_workspace_role_config"></a> [iam\_cross\_account\_workspace\_role\_config](#input\_iam\_cross\_account\_workspace\_role\_config) | Configuration object for setting the IAM cross-account role for the Databricks workspace | <pre>object({<br/> role_name = optional(string, null)<br/> policy_name = optional(string, null)<br/> permission_boundary_arn = optional(string, null)<br/> role_description = optional(string, "Databricks IAM Role to launch clusters in your AWS account, you must create a cross-account IAM role that gives access to Databricks.")<br/> })</pre> | `{}` | no |
+| <a name="input_iam_cross_account_workspace_role_enabled"></a> [iam\_cross\_account\_workspace\_role\_enabled](#input\_iam\_cross\_account\_workspace\_role\_enabled) | A boolean flag to determine if the cross-account IAM role for Databricks workspace access should be created | `bool` | `true` | no |
+| <a name="input_label"></a> [label](#input\_label) | A customizable string used as a prefix for naming Databricks resources | `string` | n/a | yes |
+| <a name="input_private_access_settings_config"></a> [private\_access\_settings\_config](#input\_private\_access\_settings\_config) | Configuration for private access settings | <pre>object({<br/> name = optional(string, null)<br/> allowed_vpc_endpoint_ids = optional(list(string), [])<br/> public_access_enabled = optional(bool, true)<br/> })</pre> | `{}` | no |
+| <a name="input_private_access_settings_enabled"></a> [private\_access\_settings\_enabled](#input\_private\_access\_settings\_enabled) | Indicates whether private access settings should be enabled for the Databricks workspace. Set to true to activate these settings | `bool` | `true` | no |
+| <a name="input_privatelink_dedicated_vpce_config"></a> [privatelink\_dedicated\_vpce\_config](#input\_privatelink\_dedicated\_vpce\_config) | Configuration object for AWS PrivateLink dedicated VPC Endpoints (VPCe) | <pre>object({<br/> rest_vpc_endpoint_name = optional(string, null)<br/> relay_vpc_endpoint_name = optional(string, null)<br/> rest_aws_vpc_endpoint_id = optional(string, null)<br/> relay_aws_vpc_endpoint_id = optional(string, null)<br/> })</pre> | `{}` | no |
+| <a name="input_privatelink_dedicated_vpce_enabled"></a> [privatelink\_dedicated\_vpce\_enabled](#input\_privatelink\_dedicated\_vpce\_enabled) | Boolean flag to enable or disable the creation of dedicated AWS VPC Endpoints (VPCe) for Databricks PrivateLink | `bool` | `false` | no |
+| <a name="input_privatelink_enabled"></a> [privatelink\_enabled](#input\_privatelink\_enabled) | Boolean flag to enable registration of PrivateLink VPC Endpoints (REST API and SCC Relay) in the target Databricks Network Config | `bool` | `false` | no |
+| <a name="input_privatelink_relay_vpce_id"></a> [privatelink\_relay\_vpce\_id](#input\_privatelink\_relay\_vpce\_id) | AWS VPC Endpoint ID used for Databricks SCC Relay when PrivateLink is enabled | `string` | `null` | no |
+| <a name="input_privatelink_rest_vpce_id"></a> [privatelink\_rest\_vpce\_id](#input\_privatelink\_rest\_vpce\_id) | AWS VPC Endpoint ID used for Databricks REST API if PrivateLink is enabled | `string` | `null` | no |
+| <a name="input_region"></a> [region](#input\_region) | AWS region | `string` | n/a | yes |
+| <a name="input_security_group_ids"></a> [security\_group\_ids](#input\_security\_group\_ids) | Set of AWS security group IDs for Databricks Account network configuration | `set(string)` | n/a | yes |
+| <a name="input_storage_dbfs_config"></a> [storage\_dbfs\_config](#input\_storage\_dbfs\_config) | Configuration for the Databricks File System (DBFS) storage | <pre>object({<br/> bucket_name = optional(string)<br/> })</pre> | `{}` | no |
+| <a name="input_storage_dbfs_enabled"></a> [storage\_dbfs\_enabled](#input\_storage\_dbfs\_enabled) | Flag to enable or disable the use of DBFS (Databricks File System) storage in the Databricks workspace | `bool` | `true` | no |
+| <a name="input_subnet_ids"></a> [subnet\_ids](#input\_subnet\_ids) | Set of AWS subnet IDs for Databricks Account network configuration | `set(string)` | n/a | yes |
+| <a name="input_tags"></a> [tags](#input\_tags) | Assigned tags to AWS services | `map(string)` | `{}` | no |
+| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | AWS VPC ID | `string` | n/a | yes |
+| <a name="input_workspace_creator_token_enabled"></a> [workspace\_creator\_token\_enabled](#input\_workspace\_creator\_token\_enabled) | Indicates whether to enable the creation of a token for workspace creators in Databricks | `bool` | `false` | no |
+
+## Outputs
+
+| Name | Description |
+|------|-------------|
+| <a name="output_iam_role"></a> [iam\_role](#output\_iam\_role) | n/a |
+| <a name="output_storage"></a> [storage](#output\_storage) | n/a |
+| <a name="output_workspace"></a> [workspace](#output\_workspace) | n/a |
+| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | n/a |
 <!-- END_TF_DOCS -->
 
 ## License
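
The `## Usage` section of the README is left empty in this commit. As a rough sketch only, a caller could wire up the required inputs from the Inputs table like this. The module `source` path, the module name `databricks_workspace`, and every ID below are placeholders assumed for illustration, and disabling private access settings is an assumption made to keep the example free of PrivateLink:

```hcl
# Minimal usage sketch; all values are hypothetical placeholders.
module "databricks_workspace" {
  source = "../" # assumption: relative path to this module

  # Required inputs (see the Inputs table)
  account_id         = "00000000-0000-0000-0000-000000000000" # Databricks account ID
  label              = "example"
  region             = "eu-central-1"
  vpc_id             = "vpc-0123456789abcdef0"
  subnet_ids         = ["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"]
  security_group_ids = ["sg-0123456789abcdef0"]

  # Optional inputs; this sketch assumes a workspace without PrivateLink
  private_access_settings_enabled = false

  tags = {
    Environment = "dev"
  }
}
```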

main.tf

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
+################################################################################
+# Databricks Workspace
+################################################################################
+resource "databricks_mws_workspaces" "this" {
+  account_id                 = var.account_id
+  aws_region                 = var.region
+  workspace_name             = var.label
+  credentials_id             = databricks_mws_credentials.this.credentials_id
+  storage_configuration_id   = databricks_mws_storage_configurations.this.storage_configuration_id
+  network_id                 = databricks_mws_networks.this.network_id
+  private_access_settings_id = try(databricks_mws_private_access_settings.this[0].private_access_settings_id, null)
+
+  dynamic "token" {
+    for_each = var.workspace_creator_token_enabled ? [1] : []
+    content {
+      comment = "Workspace creator token managed by Terraform"
+    }
+  }
+
+  lifecycle {
+    replace_triggered_by = [databricks_mws_credentials.this]
+  }
+
+}
+
+resource "databricks_mws_private_access_settings" "this" {
+  count = var.private_access_settings_enabled ? 1 : 0
+
+  private_access_settings_name = coalesce(var.private_access_settings_config.name, var.label)
+  region                       = var.region
+  public_access_enabled        = var.private_access_settings_config.public_access_enabled
+  allowed_vpc_endpoint_ids     = coalesce(var.private_access_settings_config.allowed_vpc_endpoint_ids, [var.privatelink_rest_vpce_id])
+  private_access_level         = "ENDPOINT"
+}
+
+################################################################################
+# Network
+################################################################################
+resource "databricks_mws_networks" "this" {
+  account_id         = var.account_id
+  network_name       = var.label
+  security_group_ids = var.security_group_ids
+  subnet_ids         = var.subnet_ids
+  vpc_id             = var.vpc_id
+
+  dynamic "vpc_endpoints" {
+    for_each = var.privatelink_enabled ? [1] : []
+    content {
+      dataplane_relay = [coalesce(try(module.privatelink_vpce[0].relay_vpce_id, null), var.privatelink_relay_vpce_id)]
+      rest_api        = [coalesce(try(module.privatelink_vpce[0].rest_vpce_id, null), var.privatelink_rest_vpce_id)]
+    }
+  }
+}
+
+################################################################################
+# Privatelink dedicated VPC Endpoints (REST/Relay)
+################################################################################
+module "privatelink_vpce" {
+  count  = var.privatelink_dedicated_vpce_enabled ? 1 : 0
+  source = "./modules/privatelink/"
+
+  account_id                = var.account_id
+  region                    = var.region
+  relay_vpc_endpoint_name   = var.privatelink_dedicated_vpce_config.relay_vpc_endpoint_name
+  relay_aws_vpc_endpoint_id = var.privatelink_dedicated_vpce_config.relay_aws_vpc_endpoint_id
+  rest_vpc_endpoint_name    = var.privatelink_dedicated_vpce_config.rest_vpc_endpoint_name
+  rest_aws_vpc_endpoint_id  = var.privatelink_dedicated_vpce_config.rest_aws_vpc_endpoint_id
+}
+
+################################################################################
+# IAM
+################################################################################
+data "databricks_aws_assume_role_policy" "this" {
+  external_id = var.account_id
+}
+
+data "databricks_aws_crossaccount_policy" "this" {}
+
+module "iam_cross_account_workspace_policy" {
+  source  = "terraform-aws-modules/iam/aws//modules/iam-policy"
+  version = "5.41.0"
+
+  name   = coalesce(var.iam_cross_account_workspace_role_config.policy_name, "${var.label}-dbx-crossaccount-policy")
+  policy = data.databricks_aws_crossaccount_policy.this.json
+}
+
+module "iam_cross_account_workspace_role" {
+  count   = var.iam_cross_account_workspace_role_enabled ? 1 : 0
+  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
+  version = "5.41.0"
+
+  role_name                       = coalesce(var.iam_cross_account_workspace_role_config.role_name, "${var.label}-dbx-cross-account")
+  create_role                     = var.iam_cross_account_workspace_role_enabled
+  create_custom_role_trust_policy = true
+  custom_role_trust_policy        = data.databricks_aws_assume_role_policy.this.json
+  role_permissions_boundary_arn   = var.iam_cross_account_workspace_role_config.permission_boundary_arn
+  role_description                = var.iam_cross_account_workspace_role_config.role_description
+  custom_role_policy_arns         = [module.iam_cross_account_workspace_policy.arn]
+  tags                            = var.tags
+}
+
+# Wait 30 seconds after role creation so that Databricks can successfully reference the new IAM role
+resource "time_sleep" "wait_30_seconds" {
+  depends_on = [module.iam_cross_account_workspace_role]
+
+  create_duration = "30s"
+}
+
+resource "databricks_mws_credentials" "this" {
+  account_id       = var.account_id
+  credentials_name = "${var.label}-credentials"
+  role_arn         = module.iam_cross_account_workspace_role[0].iam_role_arn
+
+  depends_on = [time_sleep.wait_30_seconds]
+}
+
+################################################################################
+# Storage Configuration
+################################################################################
+data "databricks_aws_bucket_policy" "this" {
+  bucket = module.storage_configuration_dbfs_bucket[0].s3_bucket_id
+}
+
+module "storage_configuration_dbfs_bucket" {
+  count   = var.storage_dbfs_enabled ? 1 : 0
+  source  = "terraform-aws-modules/s3-bucket/aws"
+  version = "4.1.2"
+
+  bucket_prefix = coalesce(var.storage_dbfs_config.bucket_name, "${var.label}-dbfs-")
+  acl           = "private"
+
+  force_destroy = true
+
+  control_object_ownership = true
+  object_ownership         = "BucketOwnerPreferred"
+
+  server_side_encryption_configuration = {
+    rule = {
+      apply_server_side_encryption_by_default = {
+        sse_algorithm = "AES256"
+      }
+    }
+  }
+
+  versioning = {
+    status = "Disabled"
+  }
+
+}
+
+resource "aws_s3_bucket_policy" "databricks_aws_bucket_policy" {
+  bucket = module.storage_configuration_dbfs_bucket[0].s3_bucket_id
+  policy = data.databricks_aws_bucket_policy.this.json
+}
+
+resource "databricks_mws_storage_configurations" "this" {
+  account_id                 = var.account_id
+  storage_configuration_name = var.label
+  bucket_name                = module.storage_configuration_dbfs_bucket[0].s3_bucket_id
+}
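
The `databricks_mws_*` resources in `main.tf` are account-level objects, so the caller presumably has to supply a Databricks provider configured against the account console rather than a workspace URL. The commit does not show any provider configuration. The block below is a sketch under that assumption, using service-principal OAuth credentials; the three variable names are hypothetical:

```hcl
# Hypothetical caller-side variables (not part of this commit).
variable "databricks_account_id" {
  type        = string
  description = "Databricks Account ID"
}

variable "databricks_client_id" {
  type        = string
  description = "Client ID of an account-level service principal (assumed auth method)"
}

variable "databricks_client_secret" {
  type        = string
  description = "OAuth secret of the service principal"
  sensitive   = true
}

# Account-level provider: the host is the Databricks account console on AWS.
provider "databricks" {
  host          = "https://accounts.cloud.databricks.com"
  account_id    = var.databricks_account_id
  client_id     = var.databricks_client_id
  client_secret = var.databricks_client_secret
}
```

Other authentication methods supported by the provider would work equally well here; the point of the sketch is only that the provider targets the account, not a workspace.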

modules/privatelink/main.tf

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+resource "databricks_mws_vpc_endpoint" "rest" {
+  account_id          = var.account_id
+  aws_vpc_endpoint_id = var.rest_aws_vpc_endpoint_id
+  vpc_endpoint_name   = var.rest_vpc_endpoint_name
+  region              = var.region
+}
+
+resource "databricks_mws_vpc_endpoint" "relay" {
+  account_id          = var.account_id
+  aws_vpc_endpoint_id = var.relay_aws_vpc_endpoint_id
+  vpc_endpoint_name   = var.relay_vpc_endpoint_name
+  region              = var.region
+}

modules/privatelink/outputs.tf

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+output "rest_vpce_id" {
+  value       = databricks_mws_vpc_endpoint.rest.vpc_endpoint_id
+  description = "The Databricks-side ID of the VPC endpoint registered for the Databricks REST API"
+}
+
+output "relay_vpce_id" {
+  value       = databricks_mws_vpc_endpoint.relay.vpc_endpoint_id
+  description = "The Databricks-side ID of the VPC endpoint registered for the Databricks Relay service"
+}

modules/privatelink/variables.tf

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+variable "region" {
+  type        = string
+  description = "AWS region"
+}
+
+variable "rest_vpc_endpoint_name" {
+  type        = string
+  description = "The name to assign to the AWS VPC endpoint for the Databricks REST API"
+}
+variable "rest_aws_vpc_endpoint_id" {
+  type        = string
+  description = "The AWS VPC endpoint ID for the Databricks REST API"
+}
+
+variable "relay_vpc_endpoint_name" {
+  type        = string
+  description = "The name to assign to the AWS VPC endpoint for the Databricks Relay service"
+}
+
+variable "relay_aws_vpc_endpoint_id" {
+  type        = string
+  description = "The AWS VPC endpoint ID for the Databricks Relay service"
+}
+
+variable "account_id" {
+  type        = string
+  description = "Databricks Account ID"
+}
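
All four endpoint variables of the submodule are required, while the root module's `privatelink_dedicated_vpce_config` object defaults each field to `null`. A caller who enables the dedicated endpoints therefore presumably needs to fill in every field. The sketch below shows how the root-module flags appear intended to combine; the module name, `source` path, and all IDs are hypothetical placeholders:

```hcl
# Sketch: root module call with dedicated PrivateLink endpoint registration.
module "databricks_workspace" {
  source = "../" # assumption: relative path to the root module

  account_id         = "00000000-0000-0000-0000-000000000000"
  label              = "example"
  region             = "eu-central-1"
  vpc_id             = "vpc-0123456789abcdef0"
  subnet_ids         = ["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"]
  security_group_ids = ["sg-0123456789abcdef0"]

  # Attach VPC endpoints to the Databricks network configuration
  privatelink_enabled = true

  # Register the existing AWS VPC endpoints with Databricks via ./modules/privatelink/
  privatelink_dedicated_vpce_enabled = true
  privatelink_dedicated_vpce_config = {
    rest_vpc_endpoint_name    = "example-rest"
    rest_aws_vpc_endpoint_id  = "vpce-0aaaaaaaaaaaaaaaa"
    relay_vpc_endpoint_name   = "example-relay"
    relay_aws_vpc_endpoint_id = "vpce-0bbbbbbbbbbbbbbbb"
  }
}
```

When the endpoints are already registered in the Databricks account, the alternative suggested by the Inputs table is to leave `privatelink_dedicated_vpce_enabled` at `false` and pass `privatelink_rest_vpce_id` and `privatelink_relay_vpce_id` instead.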

modules/privatelink/versions.tf

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+terraform {
+  required_version = ">= 1.0"
+
+  required_providers {
+    databricks = {
+      source  = "databricks/databricks"
+      version = ">= 1.55"
+    }
+  }
+}
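
The root module's own `versions.tf` is not among the files shown here, but the Requirements table in the README lists its constraints. Assuming the usual registry sources (consistent with the registry links in the Resources table), the corresponding block would look roughly like this:

```hcl
# Sketch of a root-level versions.tf reconstructed from the README Requirements table.
terraform {
  required_version = ">= 1.8"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.55"
    }
    time = {
      source  = "hashicorp/time"
      version = "~> 0.11"
    }
  }
}
```

Note that the submodule accepts Terraform `>= 1.0` while the root module requires `>= 1.8`, so the effective minimum for callers is the root constraint.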

outputs.tf

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+output "workspace" {
+  value = databricks_mws_workspaces.this
+}
+
+output "storage" {
+  value = try(module.storage_configuration_dbfs_bucket[0], null)
+}
+
+output "iam_role" {
+  value = try(module.iam_cross_account_workspace_role[0], null)
+}
+
+output "workspace_url" {
+  value = databricks_mws_workspaces.this.workspace_url
+}
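
One way a caller might consume these outputs is to point a second, workspace-level provider at the new workspace and to re-export the role ARN. This is a sketch only: the module name `databricks_workspace`, the `source` path, the provider alias, and the placeholder IDs are assumptions, and authentication for the aliased provider is left out:

```hcl
# Hypothetical module call; see the Inputs table for the required arguments.
module "databricks_workspace" {
  source = "../" # assumption: relative path to this module

  account_id         = "00000000-0000-0000-0000-000000000000"
  label              = "example"
  region             = "eu-central-1"
  vpc_id             = "vpc-0123456789abcdef0"
  subnet_ids         = ["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"]
  security_group_ids = ["sg-0123456789abcdef0"]
}

# Workspace-level provider pointed at the URL exported by the module.
provider "databricks" {
  alias = "workspace"
  host  = module.databricks_workspace.workspace_url
}

# The iam_role output is null when the role is disabled, hence the try().
output "cross_account_role_arn" {
  value = try(module.databricks_workspace.iam_role.iam_role_arn, null)
}
```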
