This repository was archived by the owner on Sep 4, 2024. It is now read-only.
Currently the DatabricksWorkflowTaskGroup only supports creating notebook tasks via the DatabricksNotebookOperator. While this feature unlocks all Databricks Python-based development (and, to some extent, SQL through spark.sql commands), it does not let users take advantage of Databricks SQL, which limits the flows users can create.
To solve this, we should add support for sql_task tasks.
sql_task tasks let Databricks refer to query objects that have been created in the Databricks SQL editor. These queries can be parameterized by the user at runtime.
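For reference, in the Databricks Jobs API a sql_task entry points at a saved query by its ID and can carry runtime parameters. A minimal sketch of building such a payload (field names follow the Jobs API 2.1 sql_task schema; the query ID, warehouse ID, and parameter values are placeholders):

```python
def build_sql_task(query_id, warehouse_id, parameters=None):
    """Build a Jobs API ``sql_task`` payload that references a saved
    Databricks SQL query and optionally passes runtime parameters."""
    task = {
        "sql_task": {
            "query": {"query_id": query_id},
            "warehouse_id": warehouse_id,
        }
    }
    if parameters:
        task["sql_task"]["parameters"] = parameters
    return task
```

For example, `build_sql_task("abc123", "wh-1", {"run_date": "2024-01-01"})` yields the dict a job definition would embed for that query.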
Solving this issue would involve two steps:
The first step is to create a DatabricksSqlQueryOperator that expects a query ID instead of a SQL query. If run outside of a DatabricksWorkflowTaskGroup, this operator would be able to launch and monitor a SQL task on its own. The second step would be to create a convert_to_databricks_workflow_task method to convert the SQL operator task into a workflow task.
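As a rough sketch of step two, the conversion method on the proposed operator could translate the operator's attributes into the Jobs API task entry that the task group merges into its job definition. All names here are illustrative, not a final API, and the Airflow BaseOperator plumbing is omitted:

```python
class DatabricksSqlQueryOperator:
    """Illustrative stand-in for the proposed operator: it holds a
    saved query ID rather than raw SQL text."""

    def __init__(self, task_id, query_id, warehouse_id, parameters=None):
        self.task_id = task_id
        self.query_id = query_id
        self.warehouse_id = warehouse_id
        self.parameters = parameters or {}

    def convert_to_databricks_workflow_task(self):
        """Return the Jobs API task entry the DatabricksWorkflowTaskGroup
        would merge into the job it launches."""
        return {
            "task_key": self.task_id,
            "sql_task": {
                "query": {"query_id": self.query_id},
                "warehouse_id": self.warehouse_id,
                "parameters": self.parameters,
            },
        }
```

The task_key mirrors the Airflow task_id so results in the Databricks job map back to the DAG task that produced them.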
For this task to be completed, a SQL query should be added to the example DAG and run through CI/CD.