
Monitor deployment fails after being deployed once through the bundle #2437


Open
star-yar opened this issue Mar 5, 2025 · 8 comments
Labels
Bug Something isn't working DABs DABs related issues

Comments

@star-yar

star-yar commented Mar 5, 2025

Describe the issue

I define a quality monitor for a table in the bundle. It is deployed through CI with the CLI (databricks bundle deploy -t our_target) under a service account. After the first deployment succeeds, every subsequent deployment attempt fails with: Error: cannot create quality monitor: Data Monitor 'catalog.schema.table' already exists

Configuration

Steps to reproduce the behavior


  • Create a table catalog.schema.table in UC
  • Create a quality monitor for it in the bundle
  • Deploy once with databricks bundle deploy (succeeds)
  • Deploy again with databricks bundle deploy (should succeed, but fails)

Expected Behavior

On every deployment after the first, the existing monitor is updated (PUT call).

Actual Behavior

On every deployment after the first, the CLI tries to create the monitor again (POST call), which fails because the monitor already exists.
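
The difference between the two behaviors can be sketched with a small in-memory model. This is an illustration only, not the CLI's or the Terraform provider's actual code: `FakeMonitorApi`, `deploy_always_create`, and `deploy_create_or_update` are hypothetical names, and only the error text is taken from the report above.

```python
class AlreadyExists(Exception):
    """Raised when a create (POST) targets a table that already has a monitor."""


class FakeMonitorApi:
    """Hypothetical in-memory stand-in for a monitor service keyed by table name."""

    def __init__(self):
        self.monitors = {}

    def create(self, table_name, config):
        # Models the POST call: fails if a monitor already exists for the table.
        if table_name in self.monitors:
            raise AlreadyExists(f"Data Monitor '{table_name}' already exists")
        self.monitors[table_name] = dict(config)

    def update(self, table_name, config):
        # Models the PUT call: only valid for an existing monitor.
        if table_name not in self.monitors:
            raise KeyError(table_name)
        self.monitors[table_name] = dict(config)


def deploy_always_create(api, table_name, config):
    """Reported behavior: every deployment issues a create."""
    api.create(table_name, config)


def deploy_create_or_update(api, table_name, config):
    """Expected behavior: create on the first deployment, update afterwards."""
    if table_name in api.monitors:
        api.update(table_name, config)
    else:
        api.create(table_name, config)
```

With `deploy_always_create`, a second call for the same table raises `AlreadyExists`; with `deploy_create_or_update`, the second call replaces the stored configuration instead.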

OS and CLI version

OS: ubuntu 24.04.2
CLI version: 0.224.1

@star-yar star-yar added the DABs DABs related issues label Mar 5, 2025
@andrewnester
Contributor

First, could you upgrade to the latest CLI version and try again?

Secondly, when you deploy from CI, do you deploy to the same workspace.root_path, or to a different one for each deployment?

@star-yar
Author

star-yar commented Mar 6, 2025

First, could you upgrade to the latest CLI version and try again?

Updated to 0.243.0 – still fails

do you deploy to the same workspace.root_path, or to a different one for each deployment?

same one

@gkinnell

Seeing the same as OP on Ubuntu: 24.04 / Databricks CLI: 0.243.0

@andredmoliveira

Facing the same issue, but only in local dev mode. CI/CD works as intended when deploying with a service principal.
For local development, after the initial bundle deployment, I tried setting the quality monitor asset dir under both User and Shared in the workspace with mode: development, but the deployment fails because the monitor already exists:

Updating deployment state...
Error: terraform apply: exit status 1

Error: failed to create monitor

  with databricks_quality_monitor.gcaa-commercial-attribution_quality_monitor,
  on bundle.tf.json line 450, in resource.databricks_quality_monitor.gcaa-commercial-attribution_quality_monitor:
 450:       }

Already exists Monitor with ID:

@andrewnester
Contributor

@gkinnell @andredmoliveira thanks for chiming in. Could you please share your bundle YAML configuration so we can try to reproduce the issue on our side? Thank you!

@andredmoliveira

Hi @andrewnester! Our bundle config file is very much in line with the mlops-stacks.

The default path for the quality monitor asset dir is the data product folder in the workspace, and we've tried adding a user name folder when deploying to the dev workspace:

bundle:
  uuid: ...
  name: gcaa-commercial-attribution

workspace:
  root_path: /commercial_attribution/.bundle/${bundle.name}/${bundle.target}

variables:
  experiment_name:
    description: Experiment name for the model training.
    default: /commercial_attribution/.bundle/${bundle.name}/${bundle.target}-gcaa-commercial-attribution-experiment
  model_name:
    description: Model name for the model training.
    default: ${bundle.target}-gcaa-commercial-attribution-model
  catalog_name:
    description: The catalog name to save the trained model
  policy_id:
    description: Policy ID for cluster
  schema_name:
    description: The schema name linked to the Data Consumer Product
    default: commercial_attribution_gold
  quality_monitor:
    description: Quality Monitor Asset Directory
    default: /commercial_attribution/databricks_lakehouse_monitoring

include:
  - ./resources/batch-inference-workflow-resource.yml
  - ./resources/ml-artifacts-resource.yml
  - ./resources/model-workflow-resource.yml
  - ./resources/feature-engineering-workflow-resource.yml
  - ./resources/monitoring-resource.yml

targets:
  dev:
    mode: development
    default: true
    variables:
      catalog_name: dev_commercial
      policy_id: ...
      experiment_name: /commercial_attribution/${workspace.current_user.userName}/.bundle/${bundle.name}/${bundle.target}-gcaa-commercial-attribution-experiment
      # quality_monitor: /commercial_attribution/${workspace.current_user.userName}/databricks_lakehouse_monitoring
    workspace:
      host: ...
      root_path: /commercial_attribution/${workspace.current_user.userName}/.bundle/${bundle.name}/${bundle.target}

  test:
    variables:
      catalog_name: qa_commercial
      policy_id: ...
    workspace:
      host: ...

  staging:
    variables:
      catalog_name: qa_commercial
      policy_id: ...
    workspace:
      host: ...

  prod:
    variables:
      catalog_name: commercial
      policy_id: ...
    workspace:
      host: ...

@andrewnester
Contributor

Facing the same issue, but only in local dev mode. CI/CD works as intended when deploying with a service principal.

Do you use the same CLI version on CI/CD and locally?

What if you prefix quality_monitor with /Workspace like quality_monitor: /Workspace/commercial_attribution/...?

@andredmoliveira

Do you use the same CLI version on CI/CD and locally?

Same version 0.243.0

What if you prefix quality_monitor with /Workspace like quality_monitor: /Workspace/commercial_attribution/...?

Same error.

Judging by how experiments and models behave during parallel local development, it seems that because the quality monitor name cannot be prefixed and cannot get a different ID in development mode, an existing monitor cannot be overwritten.
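
A rough sketch of that hypothesis, under stated assumptions: resources such as models get a per-user prefix in development mode (the exact `[dev <user>]` format below is assumed for illustration), while a monitor is identified by its table name alone. The helper names are hypothetical and this is not the CLI's code.

```python
def dev_resource_name(name: str, short_user: str) -> str:
    # Assumed prefix format, for illustration only.
    return f"[dev {short_user}] {name}"


def monitor_identity(table_name: str, short_user: str) -> str:
    # A monitor is attached to a table, so the user plays no part in its identity.
    return table_name


def collides(identity_fn, name: str, users: list) -> bool:
    """Return True if any two users map to the same identity."""
    identities = [identity_fn(name, user) for user in users]
    return len(set(identities)) < len(identities)
```

Under these assumptions, `collides(dev_resource_name, ...)` is false for two different users, while `collides(monitor_identity, ...)` is true whenever they target the same table, which would match the "already exists" error seen only for the monitor.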

@andrewnester andrewnester added Bug Something isn't working and removed Response Requested labels Mar 31, 2025