From 805d955290e377f69a0c89965678700a3ea3af27 Mon Sep 17 00:00:00 2001
From: Wong Songhan
Date: Mon, 21 Apr 2025 15:01:09 +0000
Subject: [PATCH] Add Databricks setup instructions.

---
 docs/examples/how-to-run.md | 38 ++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/docs/examples/how-to-run.md b/docs/examples/how-to-run.md
index f78e6beb3..b2c5c1351 100644
--- a/docs/examples/how-to-run.md
+++ b/docs/examples/how-to-run.md
@@ -26,3 +26,41 @@ cd lithops/aws # or whichever executor/cloud combination you are using
 | | Google | [modal/gcp/README.md](https://github.com/cubed-dev/cubed/blob/main/examples/modal/gcp/README.md) |
 | Coiled | AWS | [coiled/aws/README.md](https://github.com/cubed-dev/cubed/blob/main/examples/coiled/aws/README.md) |
 | Beam | Google | [dataflow/README.md](https://github.com/cubed-dev/cubed/blob/main/examples/dataflow/README.md) |
+
+## Databricks
+
+If you want to run Cubed on Databricks, we recommend using the Spark executor, which is still at an experimental stage (see [#499](https://github.com/cubed-dev/cubed/issues/499)).
+
+You will need to set up your compute cluster with [Dedicated Access Mode](https://docs.databricks.com/aws/en/compute/single-user-fgac), since the Spark executor uses Spark RDDs, which are not supported by [Serverless](https://docs.databricks.com/aws/en/compute/serverless/limitations#limitations-overview) or [Standard access mode](https://docs.databricks.com/aws/en/compute/access-mode-limitations#standard-access-mode-limitations-on-unity-catalog).
+
+### Configuration
+
+Note that if you use a local directory for `work_dir`, you can only use a single-node Spark cluster, since the Spark worker nodes will not have access to the driver node's local directory.
+
+Using a Unity Catalog volume for `work_dir` is not recommended, since it is significantly slower.
+
+```py
+import cubed
+
+spec = cubed.Spec(
+    executor_name="spark",
+    work_dir="/tmp/",  # a local directory on the driver node, so the cluster must be single node
+    allowed_mem="2GB",
+)
+```
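+
+The spec can then be passed to Cubed's array creation functions, and `compute()` will run the calculation on the cluster. Below is a minimal sketch adapted from the usage example in the Cubed README; the tiny arrays and chunk sizes are purely illustrative:
+
+```py
+import cubed
+import cubed.array_api as xp
+
+spec = cubed.Spec(executor_name="spark", work_dir="/tmp/", allowed_mem="2GB")
+
+# Two small chunked arrays; Cubed bounds the memory used per task by allowed_mem
+a = xp.asarray([[1, 2, 3], [4, 5, 6], [7, 8, 9]], chunks=(2, 2), spec=spec)
+b = xp.asarray([[1, 1, 1], [1, 1, 1], [1, 1, 1]], chunks=(2, 2), spec=spec)
+c = xp.add(a, b)
+
+result = c.compute()  # runs the tasks on the Spark executor
+```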