[DOCS-11631] BigQuery Cost Allocation #30722

Open: wants to merge 24 commits into base: master
28 changes: 19 additions & 9 deletions config/_default/menus/main.en.yaml
@@ -3617,26 +3617,36 @@ menu:
       parent: cloud_cost
       identifier: cloud_cost_datadog_costs
       weight: 2
-    - name: Container Cost Allocation
-      url: cloud_cost_management/container_cost_allocation
+    - name: Cost Allocation
+      url: cloud_cost_management/cost_allocation
       parent: cloud_cost
-      identifier: cloud_cost_container_cost_allocation
+      identifier: cloud_cost_cost_allocation
       weight: 3
+    - name: Container Costs
+      url: cloud_cost_management/cost_allocation/container_cost_allocation/
+      parent: cloud_cost_cost_allocation
+      identifier: cloud_cost_container_cost_allocation
+      weight: 301
     - name: AWS
-      url: cloud_cost_management/container_cost_allocation/?tab=aws
+      url: cloud_cost_management/cost_allocation/container_cost_allocation/?tab=aws
       parent: cloud_cost_container_cost_allocation
       identifier: cloud_cost_container_cost_allocation_aws
-      weight: 301
+      weight: 101
     - name: Azure
-      url: cloud_cost_management/container_cost_allocation/?tab=azure
+      url: cloud_cost_management/cost_allocation/container_cost_allocation/?tab=azure
       parent: cloud_cost_container_cost_allocation
       identifier: cloud_cost_container_cost_allocation_azure
-      weight: 302
+      weight: 102
     - name: Google Cloud
-      url: cloud_cost_management/container_cost_allocation/?tab=google
+      url: cloud_cost_management/cost_allocation/container_cost_allocation/?tab=google
       parent: cloud_cost_container_cost_allocation
       identifier: cloud_cost_container_cost_allocation_google
-      weight: 303
+      weight: 103
+    - name: BigQuery Costs
+      url: cloud_cost_management/cost_allocation/bigquery/
+      parent: cloud_cost_cost_allocation
+      identifier: cloud_cost_cost_allocationbigquery
+      weight: 302
     - name: Custom Allocation Rules
       url: cloud_cost_management/custom_allocation_rules
       parent: cloud_cost
56 changes: 56 additions & 0 deletions content/en/cloud_cost_management/cost_allocation/_index.md
@@ -0,0 +1,56 @@
---
title: Cost Allocation
description: Learn how to allocate cloud costs across your organization with Datadog Cloud Cost Management
further_reading:
- link: "/cloud_cost_management/"
  tag: "Documentation"
  text: "Learn about Cloud Cost Management"
- link: "/cloud_cost_management/cost_allocation/container_cost_allocation"
  tag: "Documentation"
  text: "Container Cost Allocation"
- link: "/cloud_cost_management/cost_allocation/bigquery"
  tag: "Documentation"
  text: "BigQuery Cost Allocation"
---

## Overview

Review comment (Member):

I have no idea where this should go, but can you add a section on BigQuery labels? This is a BigQuery feature where you can add a label to your queries, and the labels then show up in CCM. It is super powerful for our users and not many know about it.

It is documented here: https://cloud.google.com/bigquery/docs/adding-labels

Reply:

I'll add a section for labels! This is definitely a useful feature 👍

Datadog Cloud Cost Management (CCM) provides comprehensive cost allocation capabilities that help you understand and optimize your cloud spending by breaking down costs across different resources and organizational dimensions. Cost allocation enables you to:

- **Track resource-level spending**: Allocate costs down to individual containers, pods, tasks, and data warehouse queries
- **Optimize resource utilization**: Identify idle resources and underutilized capacity
- **Chargeback and showback**: Attribute costs to specific teams, projects, or business units
- **Make informed decisions**: Understand the true cost of your applications and services

## Cost allocation methods

CCM offers multiple cost allocation methods to help you understand your cloud spending at different levels of granularity:

### Container cost allocation

Automatically allocate the costs of your cloud clusters to individual services and workloads running in those clusters. Use cost metrics enriched with tags from pods, nodes, containers, and tasks to visualize container workload cost in the context of your entire cloud bill.

Learn more about [Container Cost Allocation][1].

### BigQuery cost allocation

Allocate BigQuery costs to individual queries, users, and projects to understand your data warehouse spending at a granular level. Track query performance costs, storage costs, and slot utilization across your organization.

Learn more about [BigQuery Cost Allocation][2].

## Getting started

To get started with cost allocation:

1. **Set up Cloud Cost Management** by configuring your cloud provider integration on the [Cloud Cost Setup page][3].
2. **Enable container monitoring** by installing the Datadog Agent in your containerized environments.
3. **Configure tag extraction** for detailed cost breakdown.
4. **Set up BigQuery integration** for data warehouse insights.

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: /cloud_cost_management/cost_allocation/container_cost_allocation
[2]: /cloud_cost_management/cost_allocation/bigquery
[3]: https://app.datadoghq.com/cost/setup
183 changes: 183 additions & 0 deletions content/en/cloud_cost_management/cost_allocation/bigquery.md
@@ -0,0 +1,183 @@
---
title: BigQuery Cost Allocation
description: Learn how to allocate Cloud Cost Management spending across your organization with BigQuery Cost Allocation.
further_reading:
- link: "/cloud_cost_management/"
  tag: "Documentation"
  text: "Learn about Cloud Cost Management"
---

## Overview

Datadog Cloud Cost Management (CCM) automatically allocates the costs of your Google BigQuery resources to individual queries and workloads. Use cost metrics enriched with tags from queries, projects, and reservations to visualize BigQuery workload costs in the context of your entire cloud bill.

CCM displays query-level analysis, storage, and data transfer costs on the [**BigQuery dashboard**][1].

## BigQuery pricing models

BigQuery offers multiple pricing components, with CCM focusing on query-related processing costs.

### Query processing

**On-demand queries**: You pay per query based on the amount of data processed.
- Costs are directly attributed to individual queries based on bytes processed
- Includes query-level tags for detailed cost attribution

**Reservation-based queries**: You purchase dedicated processing capacity (slots) in advance at a fixed cost. Multiple queries can share this reserved capacity, making cost attribution more complex but potentially more cost-effective for consistent workloads.
- Costs of reserved slots are allocated proportionally to queries using those slots
- Allocation based on slot consumption (`total_slot_ms`) per query
- Includes idle cost calculation for unused reservation capacity

**Other BigQuery Costs:**
- **Storage**: Charges for data stored in BigQuery tables (active and long-term storage)
- **Streaming**: Costs for real-time data ingestion via streaming inserts
- **Data Transfer**: Charges for moving data between regions or exporting data
- **BI Engine**: Costs for in-memory analytics acceleration
- **Other services**: ML training, routine executions, and additional BigQuery features

CCM allocates and enriches costs for both query-processing pricing models, providing detailed cost attribution and tagging for your BigQuery analysis workloads. For details on BigQuery services and pricing models, see [**BigQuery pricing**][3].

[**Learn more about optimizing BigQuery performance and costs.**][8]

## Prerequisites

The following table lists the supported features and their minimum requirements:

| Feature | Requirements |
|---|---|
| Retrieve tags from labels of a query | GCP CCM costs must be set up. Supported without monitoring or reservations. |
| Query-Level Cost Attribution | BigQuery monitoring enabled |
| Reservation Cost Allocation | BigQuery reservations configured |

1. Configure the Google Cloud Cost Management integration on the [Cloud Cost Setup page][2].
2. Enable BigQuery monitoring in your Google Cloud project. See the [**Datadog Google BigQuery integration**][4] documentation.
3. For reservation cost allocation, configure BigQuery reservations in your project. [**Learn about BigQuery reservations.**][7]

## Allocating costs

### Compute

Costs are allocated into the following spend types:

| Spend type | Description |
|---|---|
| `allocated_spend_type:usage` | Cost of query execution based on bytes processed (on-demand) or slot consumption (reservation) |
| `allocated_spend_type:cluster_idle` | Cost of reserved slots allocated within a project but not utilized by queries |

### Query-level tag extraction

When the [Datadog Google BigQuery integration][4] is enabled, CCM extracts the following tags to add to your query costs:

| Tag | Description |
|---|---|
| `reservation_id` | The reservation pool that provided compute resources |
| `user_email` | The user or service account that executed the query |
| `dts_config_id` | Identifier for scheduled queries and data transfers |

Review comment (Member), on the table header:

Any other tags we can add? What about the tag that has the raw query ID? Maybe also mention the tags for the region and project ID, because those are super important for BigQuery.

They might already know those two, but customers in general can never find the tags they need, so I don't think it is a bad thing to be a little extra verbose here with the top tags they would care about.

Review comment (Member), on `dts_config_id`:

Is there any way we can help customers understand how to take this value and use it to find their schedule in the BigQuery UI? Otherwise this is just a random number that does not mean much to them.

Reply:

Yep, there's a section for this on the dashboard; I'll add the same section here. Thanks for pointing that out.

To identify which BigQuery schedule a `dts_config_id` value refers to:

1. Go to **BigQuery** in the [**GCP Console**][6].
2. Navigate to **Transfers > Schedules**.
3. Use the **search bar** or **Ctrl+F** to locate the `dts_config_id` value.
4. Click the matching entry to view details about the query schedule, including source, frequency, and target dataset.

Additionally, CCM adds the following tags for cost analysis:

| Tag | Description |
|---|---|
| `allocated_spend_type` | Categorizes costs as either `usage` (active query execution) or `cluster_idle` (unused reservation capacity) |
| `allocated_resource` | Indicates resource measurement type - `slots` for reservation-based queries or `bytes_processed` for on-demand queries |
| `orchestrator` | Set to `BigQuery` for all BigQuery query-related records |
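
As a rough illustration of how these tags can be used, the sketch below totals cost by a tag key. The rows, the cost values, and the `total_by_tag` helper are hypothetical; they are not CCM output or a Datadog API. Only the tag names and values come from the table above.

```python
from collections import defaultdict

# Hypothetical cost rows, shaped for illustration only.
rows = [
    {"cost": 40.0, "tags": {"allocated_spend_type": "usage", "allocated_resource": "slots"}},
    {"cost": 12.5, "tags": {"allocated_spend_type": "usage", "allocated_resource": "bytes_processed"}},
    {"cost": 10.0, "tags": {"allocated_spend_type": "cluster_idle", "allocated_resource": "slots"}},
]


def total_by_tag(rows, tag_key):
    """Sum the cost of each row under the value of the given tag key."""
    totals = defaultdict(float)
    for row in rows:
        # Rows that lack the tag are grouped under "untagged".
        totals[row["tags"].get(tag_key, "untagged")] += row["cost"]
    return dict(totals)


by_spend_type = total_by_tag(rows, "allocated_spend_type")
by_resource = total_by_tag(rows, "allocated_resource")
```

Grouping by `allocated_spend_type` separates active query cost from idle reservation cost, while grouping by `allocated_resource` separates reservation-based (`slots`) from on-demand (`bytes_processed`) spend.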

The following tags come directly from the billing data that CCM processes and are especially useful for BigQuery cost analysis:

| Tag | Description |
|---|---|
| `project_id` | GCP project ID where the BigQuery resource or job is located |
| `google_location` | The specific Google Cloud region or zone where BigQuery resources are deployed (e.g., us-central1, europe-west1, asia-southeast1) |
| `resource_name` | Full Google Cloud resource identifier |
Review comment (Contributor), on lines +97 to +99:

Are `project_id` and `resource_name` directly from bills? But I agree these tags are useful for customers to understand the cost allocation.

Reply:

Yeah, they are directly from the bills; CCM doesn't add them. I will split these into a subsection to clarify 👍

### Using BigQuery labels for cost attribution

BigQuery labels provide a powerful way to add custom metadata to your queries, jobs, datasets, and tables that automatically appear as tags in CCM. This enables highly granular cost attribution across teams, projects, applications, or any custom dimension you define.

**What are BigQuery labels?**
Labels are key-value pairs that you can attach to BigQuery resources. When you add labels to queries or jobs, they automatically become available as tags in CCM, allowing you to filter and group costs by these custom dimensions.

**Adding labels to queries:**
You can add labels to BigQuery queries using the `--label` flag with the `bq` command-line tool:

```bash
bq query --label department:engineering --label environment:production 'SELECT * FROM dataset.table'
```

**Adding labels in SQL sessions:**
For queries within a session, you can set labels that apply to all subsequent queries:

```sql
SET @@query_label = "team:data_science,cost_center:analytics";
```
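
If labels are generated programmatically, a small helper can keep the two forms above consistent. The sketch below is an illustration only: the function names are hypothetical, and the validation is a simplified, ASCII-only approximation of BigQuery's label rules (lowercase letters, digits, underscores, and dashes, up to 63 characters, with keys starting with a letter). The Google Cloud documentation remains the authority on what is accepted.

```python
import re

# Simplified, ASCII-only approximation of BigQuery label rules (assumption).
_KEY_PATTERN = re.compile(r"[a-z][a-z0-9_-]{0,62}")
_VALUE_PATTERN = re.compile(r"[a-z0-9_-]{0,63}")


def validate_labels(labels):
    """Raise ValueError if any key or value breaks the simplified rules."""
    for key, value in labels.items():
        if not _KEY_PATTERN.fullmatch(key):
            raise ValueError(f"invalid label key: {key!r}")
        if not _VALUE_PATTERN.fullmatch(value):
            raise ValueError(f"invalid label value: {value!r}")


def query_label_statement(labels):
    """Build the SET @@query_label statement for a SQL session."""
    validate_labels(labels)
    pairs = ",".join(f"{key}:{value}" for key, value in labels.items())
    return f'SET @@query_label = "{pairs}";'


def bq_label_flags(labels):
    """Build the repeated --label arguments for the bq command-line tool."""
    validate_labels(labels)
    flags = []
    for key, value in labels.items():
        flags += ["--label", f"{key}:{value}"]
    return flags


statement = query_label_statement({"team": "data_science", "cost_center": "analytics"})
flags = bq_label_flags({"department": "engineering", "environment": "production"})
```

Validating before building the string catches mistakes such as uppercase keys early, before a query runs with labels that would not show up as the expected tags.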

**Benefits for cost management:**
- **Team attribution**: Tag queries with team names to track departmental BigQuery spending
- **Environment tracking**: Separate development, staging, and production costs
- **Application mapping**: Associate costs with specific applications or services
- **Project categorization**: Group costs by business initiatives or customer projects

Labels added to BigQuery resources automatically appear as tags in CCM, enabling powerful cost analysis and chargeback capabilities. [**Learn more about adding BigQuery labels**][10].

### Query-level allocation

Cost allocation divides BigQuery costs from GCP into individual queries and workloads associated with them. These divided costs are enriched with tags from queries, projects, and reservations so you can break down costs by any associated dimensions.

For reservation-based BigQuery costs, CCM allocates costs proportionally based on slot usage. Each query's cost is determined by its share of the total slot usage within the project's reservations. For example, if a query uses 25% of the total consumed slots in a project's reservation during a given period, it is allocated 25% of that project's total reservation cost for that period. The per-query cost is calculated using the following formula:

```
cost_per_query = (query_slot_usage / total_slot_usage) * total_project_reservation_cost
```

Where:
- `query_slot_usage`: The number of slot-seconds consumed by an individual query
- `total_slot_usage`: The total slot-seconds used across all queries in the project's reservations
- `total_project_reservation_cost`: The total cost of the reservations in a given project for the time period

Any difference between the total billed reservation cost and the sum of allocated query costs is categorized as a project's idle cost, representing unused reservation capacity. These costs are tagged with `allocated_spend_type:cluster_idle`, while actual query execution costs (both reservation and on-demand) are tagged with `allocated_spend_type:usage`.
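
The formula and the idle-cost difference described above can be sketched as follows. The function names and example figures are hypothetical. The text does not spell out how `total_project_reservation_cost` relates to the total billed reservation cost, so this sketch takes them as separate inputs; treat that as an assumption, not a description of CCM's implementation.

```python
def allocate_reservation_cost(slot_usage_by_query, total_project_reservation_cost):
    """Split a reservation cost across queries in proportion to slot usage."""
    total_slot_usage = sum(slot_usage_by_query.values())
    if total_slot_usage == 0:
        # No slot usage in the period: nothing is attributed to queries.
        return {query_id: 0.0 for query_id in slot_usage_by_query}
    return {
        query_id: (usage / total_slot_usage) * total_project_reservation_cost
        for query_id, usage in slot_usage_by_query.items()
    }


def idle_cost(total_billed_reservation_cost, allocated_costs):
    """Idle cost is the billed amount not covered by allocated query costs."""
    return total_billed_reservation_cost - sum(allocated_costs.values())


# Hypothetical figures: query_a consumed 25% of the slot usage in the period.
usage = {"query_a": 250, "query_b": 750}
allocated = allocate_reservation_cost(usage, total_project_reservation_cost=80.0)
idle = idle_cost(total_billed_reservation_cost=100.0, allocated_costs=allocated)
```

With these inputs, `query_a` receives 25% of the 80.0 being allocated (20.0), `query_b` receives 60.0, and the remaining 20.0 of the 100.0 billed is idle cost. Because only the ratio of slot usage matters, the result is the same whether usage is measured in slot-seconds or slot-milliseconds.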

### Understanding idle costs

Idle costs represent the portion of reservation capacity that was paid for but not utilized by queries. These costs arise when the reserved slot capacity exceeds actual usage during a billing period.

**Idle slot sharing considerations**: If your organization has enabled idle slot sharing between reservations, the idle cost calculation may appear different than expected. When queries from one project use idle slots from another project's reservation, those slot costs are attributed as "free" rather than to the consuming project. This means:

- A project's reservation may show higher idle costs if other projects are using its unused capacity
- The original project pays full reservation costs regardless of cross-project usage
- There is no automatic cost transfer: projects that consume shared idle slots don't pay the reservation owner for them

[**Learn how to enable idle slot sharing for your reservations.**][5]

### Storage

Storage costs are categorized as:

| Usage type | Description |
|---|---|
| `google_usage_type`: Active Logical Storage | Includes any table or table partition that has been modified in the last 90 days |
| `google_usage_type`: Long Term Logical Storage | Includes any table or table partition that has not been modified for 90 consecutive days. The price of storage for that table automatically drops by approximately 50%. There is no difference in performance, durability, or availability between active and long-term storage |

[**Learn more about BigQuery storage and best practices.**][9]
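
The 90-day rule in the table above can be sketched as a small classifier. The function name and the treatment of the exact 90-day boundary are assumptions for illustration; the `google_usage_type` value in your billing data remains the source of truth for how a table is charged.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the table above: 90 consecutive days without changes.
LONG_TERM_THRESHOLD = timedelta(days=90)


def storage_usage_type(last_modified, now):
    """Return the expected usage type for a table or table partition."""
    if now - last_modified >= LONG_TERM_THRESHOLD:
        return "Long Term Logical Storage"
    return "Active Logical Storage"


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
recent = storage_usage_type(datetime(2025, 5, 1, tzinfo=timezone.utc), now)
stale = storage_usage_type(datetime(2025, 1, 1, tzinfo=timezone.utc), now)
```

Here `recent` is a table modified 31 days ago, so it is classified as active, while `stale` was last modified 151 days ago and falls under long-term storage.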

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: /dashboard/ecm-es8-agw/bigquery-allocation
[2]: /cost/setup
[3]: https://cloud.google.com/bigquery/pricing?hl=en
[4]: https://docs.datadoghq.com/integrations/google-cloud-bigquery/
[5]: https://cloud.google.com/bigquery/docs/reservations-tasks
[6]: https://console.cloud.google.com
[7]: https://cloud.google.com/bigquery/docs/reservations-intro
[8]: https://cloud.google.com/bigquery/docs/best-practices-performance-overview
[9]: https://cloud.google.com/bigquery/docs/best-practices-storage
[10]: https://cloud.google.com/bigquery/docs/adding-labels
@@ -2,6 +2,8 @@
 title: Container Cost Allocation
 private: true
 description: Learn how to allocate Cloud Cost Management spending across your organization with Container Cost Allocation.
+aliases:
+- /cloud_cost_management/container_cost_allocation
 further_reading:
 - link: "/cloud_cost_management/"
   tag: "Documentation"
2 changes: 1 addition & 1 deletion layouts/partials/nav/left-nav.html
@@ -67,7 +67,7 @@
</a>
{{ if .HasChildren }}

<ul class="list-unstyled sub-menu {{ if not (in (slice "automatic_instrumentation" "custom_instrumentation" "observability_pipelines_reference_processing_language" "observability_pipelines_vector_configuration" "cloud_cost_saas_cost_integrations" "csm_setup_enterprise" "csm_setup_pro" "csm_setup_cloud_workload_security" "cspm_frameworks_benchmarks" "cspm_findings_explorer" "cws_workload_security_rules" "otel_integrations" "otel_collector_configuration" "rum_dashboards"
<ul class="list-unstyled sub-menu {{ if not (in (slice "automatic_instrumentation" "custom_instrumentation" "observability_pipelines_reference_processing_language" "observability_pipelines_vector_configuration" "cloud_cost_container_cost_allocation" "cloud_cost_saas_cost_integrations" "csm_setup_enterprise" "csm_setup_pro" "csm_setup_cloud_workload_security" "cspm_frameworks_benchmarks" "cspm_findings_explorer" "cws_workload_security_rules" "otel_integrations" "otel_collector_configuration" "rum_dashboards"
"rum_browser_setup" "rum_session_replay_browser" "rum_session_replay_mobile" "rum_mobile_android" "rum_mobile_ios" "rum_mobile_flutter" "rum_mobile_kotlin" "rum_mobile_react_native" "rum_mobile_roku" "rum_mobile_unity" "pa_session_replay_mobile" "pa_session_replay_browser" "cloudcraft_api_aws_accounts" "cloudcraft_api_azure_accounts" "cloudcraft_api_blueprints" "cloudcraft_api_budgets" "cloudcraft_api_users" "appsec_enabling_single_step" "appsec_enabling_tracing_libraries" "synthetics_platform_dashboards" "synthetics_private_location" "synthetics_results_explorer" "ndm_netflow" "dashboards_ddsql_editor_reference" "application_security_software_composition_analysis_setup"
"application_security_code_security_setup" "appsec_threats_management_setup" "observability_pipelines_log_volume_control" "observability_pipelines_dual_ship_logs" "observability_pipelines_archive_logs" "observability_pipelines_split_logs" "observability_pipelines_sensitive_data_redaction" "observability_pipelines_log_enrichment" "csm_setup_agentless_scanning" "observability_pipelines_generate_metrics" "log_explorer_calculated_fields" "test_impact_analysis_setup" "dbm_setup_postgres_rds" "sca_setup_runtime" "ndm_setup" "otel-setup-collector-exporter" "otel-setup-intake-endpoint" "otel_guides_migration" "otel-api-dd-sdk" "otel-setup-agent" "ide_plugins_idea" "otel_guides_migration" "agent_configuration_proxy"
"software_catalog_set_up" "software_catalog_entity_model" "asm_serverless") .Identifier) }}d-none{{ end }}">