
Commit 92c16f3

Feat: Add plan option to always compare against prod (#4615)
1 parent 587e0ea commit 92c16f3

6 files changed

+232
-6
lines changed


docs/guides/configuration.md

Lines changed: 111 additions & 0 deletions
@@ -381,6 +381,117 @@ Example showing default values:
381381
)
382382
```
383383

384+
385+
### Always comparing against production
386+
387+
By default, SQLMesh compares the current state of project files to the target `<env>` environment when `sqlmesh plan <env>` is run. However, a common expectation is that local changes should always be compared to the production environment.
388+
389+
The `always_recreate_environment` boolean plan option can alter this behavior. When enabled, SQLMesh will always attempt to compare against the production environment by recreating the target environment; if `prod` does not exist, SQLMesh will fall back to comparing against the target environment.
390+
391+
**NOTE:** Upon successful plan application, changes are still promoted to the target `<env>` environment.
392+
393+
=== "YAML"
394+
395+
```yaml linenums="1"
396+
plan:
397+
always_recreate_environment: True
398+
```
399+
400+
=== "Python"
401+
402+
```python linenums="1"
403+
from sqlmesh.core.config import (
404+
Config,
405+
ModelDefaultsConfig,
406+
PlanConfig,
407+
)
408+
409+
config = Config(
410+
model_defaults=ModelDefaultsConfig(dialect=<dialect>),
411+
plan=PlanConfig(
412+
always_recreate_environment=True,
413+
),
414+
)
415+
```
416+
417+
#### Change Categorization Example
418+
419+
Consider this scenario with `always_recreate_environment` enabled:
420+
421+
1. Initial state in `prod`:
422+
```sql
423+
MODEL (name sqlmesh_example.test_model, kind FULL);
424+
SELECT 1 AS col
425+
```
426+
427+
2. First (breaking) change in `dev`:
428+
```sql
429+
MODEL (name sqlmesh_example.test_model, kind FULL);
430+
SELECT 2 AS col
431+
```
432+
433+
??? "Output plan example #1"
434+
435+
```bash
436+
New environment `dev` will be created from `prod`
437+
438+
Differences from the `prod` environment:
439+
440+
Models:
441+
└── Directly Modified:
442+
└── sqlmesh_example__dev.test_model
443+
444+
---
445+
+++
446+
447+
448+
kind FULL
449+
)
450+
SELECT
451+
- 1 AS col
452+
+ 2 AS col
453+
```
454+
455+
3. Second (metadata) change in `dev`:
456+
```sql
457+
MODEL (name sqlmesh_example.test_model, kind FULL, owner 'John Doe');
458+
SELECT 2 AS col
459+
```
460+
461+
??? "Output plan example #2"
462+
463+
```bash
464+
New environment `dev` will be created from `prod`
465+
466+
Differences from the `prod` environment:
467+
468+
Models:
469+
└── Directly Modified:
470+
└── sqlmesh_example__dev.test_model
471+
472+
---
473+
474+
+++
475+
476+
@@ -1,8 +1,9 @@
477+
478+
MODEL (
479+
name sqlmesh_example.test_model,
480+
+ owner "John Doe",
481+
kind FULL
482+
)
483+
SELECT
484+
- 1 AS col
485+
+ 2 AS col
486+
487+
Directly Modified: sqlmesh_example__dev.test_model (Breaking)
488+
Models needing backfill:
489+
└── sqlmesh_example__dev.test_model: [full refresh]
490+
```
491+
492+
Even though the second change is, on its own, a metadata-only change (and so would not require a backfill), it is still classified as a breaking change because the comparison is against production rather than the previous development state. This is intentional and may cause additional backfills as changes accumulate.
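
The effect of the comparison baseline can be illustrated with a toy sketch. Everything here is hypothetical: `ModelVersion` and `categorize` are simplified stand-ins invented for illustration, not SQLMesh APIs, and real categorization is more nuanced than "any query change is breaking".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical, simplified stand-in for a model definition."""

    query: str
    owner: Optional[str] = None


def categorize(baseline: ModelVersion, current: ModelVersion) -> str:
    """Toy rule: a query change is breaking, any other difference is metadata."""
    if baseline.query != current.query:
        return "breaking"
    if baseline != current:
        return "metadata"
    return "unchanged"


prod = ModelVersion("SELECT 1 AS col")
dev_first = ModelVersion("SELECT 2 AS col")
dev_second = ModelVersion("SELECT 2 AS col", owner="John Doe")

# Compared against the previous dev state, only the owner differs.
print(categorize(dev_first, dev_second))  # metadata

# Compared against prod, the accumulated query change is still visible.
print(categorize(prod, dev_second))  # breaking
```

With prod as the baseline, every plan sees the full accumulated difference, which is why the second plan in the scenario above still reports a breaking change.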
493+
494+
384495
### Gateways
385496
386497
The `gateways` configuration defines how SQLMesh should connect to the data warehouse, state backend, and scheduler. These options are in the [gateway](../reference/configuration.md#gateway) section of the configuration reference page.

docs/reference/configuration.md

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ Configuration for the `sqlmesh plan` command.
8080
| `enable_preview` | Indicates whether to enable [data preview](../concepts/plans.md#data-preview) for forward-only models when targeting a development environment (Default: True, except for dbt projects where the target engine does not support cloning) | Boolean | N |
8181
| `no_diff` | Don't show diffs for changed models (Default: False) | boolean | N |
8282
| `no_prompts` | Disables interactive prompts in CLI (Default: True) | boolean | N |
83-
83+
| `always_recreate_environment` | Always recreates the target environment from the environment specified in `create_from` (by default `prod`) (Default: False) | boolean | N |
8484
## Run
8585

8686
Configuration for the `sqlmesh run` command. Please note that this is only applicable when configured with the [builtin](#builtin) scheduler.

sqlmesh/core/config/plan.py

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,7 @@ class PlanConfig(BaseConfig):
2020
auto_apply: Whether to automatically apply the new plan after creation.
2121
use_finalized_state: Whether to compare against the latest finalized environment state, or to use
2222
whatever state the target environment is currently in.
23+
always_recreate_environment: Whether to always recreate the target environment from the `create_from` environment.
2324
"""
2425

2526
forward_only: bool = False
@@ -30,3 +31,4 @@ class PlanConfig(BaseConfig):
3031
no_prompts: bool = True
3132
auto_apply: bool = False
3233
use_finalized_state: bool = False
34+
always_recreate_environment: bool = False

sqlmesh/core/context.py

Lines changed: 3 additions & 0 deletions
@@ -1487,6 +1487,7 @@ def plan_builder(
14871487
or (backfill_models is not None and not backfill_models),
14881488
ensure_finalized_snapshots=self.config.plan.use_finalized_state,
14891489
diff_rendered=diff_rendered,
1490+
always_recreate_environment=self.config.plan.always_recreate_environment,
14901491
)
14911492
modified_model_names = {
14921493
*context_diff.modified_snapshots,
@@ -2628,6 +2629,7 @@ def _context_diff(
26282629
force_no_diff: bool = False,
26292630
ensure_finalized_snapshots: bool = False,
26302631
diff_rendered: bool = False,
2632+
always_recreate_environment: bool = False,
26312633
) -> ContextDiff:
26322634
environment = Environment.sanitize_name(environment)
26332635
if force_no_diff:
@@ -2645,6 +2647,7 @@ def _context_diff(
26452647
environment_statements=self._environment_statements,
26462648
gateway_managed_virtual_layer=self.config.gateway_managed_virtual_layer,
26472649
infer_python_dependencies=self.config.infer_python_dependencies,
2650+
always_recreate_environment=always_recreate_environment,
26482651
)
26492652

26502653
def _destroy(self) -> None:

sqlmesh/core/context_diff.py

Lines changed: 13 additions & 4 deletions
@@ -103,6 +103,7 @@ def create(
103103
environment_statements: t.Optional[t.List[EnvironmentStatements]] = [],
104104
gateway_managed_virtual_layer: bool = False,
105105
infer_python_dependencies: bool = True,
106+
always_recreate_environment: bool = False,
106107
) -> ContextDiff:
107108
"""Create a ContextDiff object.
108109
@@ -128,10 +129,12 @@ def create(
128129
The ContextDiff object.
129130
"""
130131
environment = environment.lower()
131-
env = state_reader.get_environment(environment)
132-
132+
existing_env = state_reader.get_environment(environment)
133133
create_from_env_exists = False
134-
if env is None or env.expired:
134+
135+
recreate_environment = always_recreate_environment and environment != create_from
136+
137+
if existing_env is None or existing_env.expired or recreate_environment:
135138
env = state_reader.get_environment(create_from.lower())
136139

137140
if not env and create_from != c.PROD:
@@ -143,6 +146,7 @@ def create(
143146
create_from_env_exists = env is not None
144147
previously_promoted_snapshot_ids = set()
145148
else:
149+
env = existing_env
146150
is_new_environment = False
147151
previously_promoted_snapshot_ids = {s.snapshot_id for s in env.promoted_snapshots}
148152

@@ -220,6 +224,11 @@ def create(
220224

221225
previous_environment_statements = state_reader.get_environment_statements(environment)
222226

227+
if existing_env and always_recreate_environment:
228+
previous_plan_id: t.Optional[str] = existing_env.plan_id
229+
else:
230+
previous_plan_id = env.plan_id if env and not is_new_environment else None
231+
223232
return ContextDiff(
224233
environment=environment,
225234
is_new_environment=is_new_environment,
@@ -232,7 +241,7 @@ def create(
232241
modified_snapshots=modified_snapshots,
233242
snapshots=merged_snapshots,
234243
new_snapshots=new_snapshots,
235-
previous_plan_id=env.plan_id if env and not is_new_environment else None,
244+
previous_plan_id=previous_plan_id,
236245
previously_promoted_snapshot_ids=previously_promoted_snapshot_ids,
237246
previous_finalized_snapshots=env.previous_finalized_snapshots if env else None,
238247
previous_requirements=env.requirements if env else {},

tests/core/test_integration.py

Lines changed: 102 additions & 1 deletion
@@ -6,7 +6,7 @@
66
from datetime import timedelta
77
from unittest import mock
88
from unittest.mock import patch
9-
9+
import logging
1010
import os
1111
import numpy as np # noqa: TID253
1212
import pandas as pd # noqa: TID253
@@ -37,6 +37,7 @@
3737
from sqlmesh.core.console import Console, get_console
3838
from sqlmesh.core.context import Context
3939
from sqlmesh.core.config.categorizer import CategorizerConfig
40+
from sqlmesh.core.config.plan import PlanConfig
4041
from sqlmesh.core.engine_adapter import EngineAdapter
4142
from sqlmesh.core.environment import EnvironmentNamingInfo
4243
from sqlmesh.core.macros import macro
@@ -6252,3 +6253,103 @@ def test_render_path_instead_of_model(tmp_path: Path):
62526253

62536254
# Case 3: Render the model successfully
62546255
assert ctx.render("test_model").sql() == 'SELECT 1 AS "col"'
6256+
6257+
6258+
@use_terminal_console
6259+
def test_plan_always_recreate_environment(tmp_path: Path):
6260+
def plan_with_output(ctx: Context, environment: str):
6261+
with patch.object(logger, "info") as mock_logger:
6262+
with capture_output() as output:
6263+
ctx.load()
6264+
ctx.plan(environment, no_prompts=True, auto_apply=True)
6265+
6266+
# Facade logs info "Promoting environment {environment}"
6267+
assert mock_logger.call_args[0][1] == environment
6268+
6269+
return output
6270+
6271+
models_dir = tmp_path / "models"
6272+
6273+
logger = logging.getLogger("sqlmesh.core.state_sync.db.facade")
6274+
6275+
create_temp_file(
6276+
tmp_path, models_dir / "a.sql", "MODEL (name test.a, kind FULL); SELECT 1 AS col"
6277+
)
6278+
6279+
config = Config(plan=PlanConfig(always_recreate_environment=True))
6280+
ctx = Context(paths=[tmp_path], config=config)
6281+
6282+
# Case 1: Neither prod nor dev exists, so dev is initialized
6283+
output = plan_with_output(ctx, "dev")
6284+
6285+
assert """`dev` environment will be initialized""" in output.stdout
6286+
6287+
# Case 2: Prod does not exist, so dev is updated
6288+
create_temp_file(
6289+
tmp_path, models_dir / "a.sql", "MODEL (name test.a, kind FULL); SELECT 5 AS col"
6290+
)
6291+
6292+
output = plan_with_output(ctx, "dev")
6293+
assert "`dev` environment will be initialized" in output.stdout
6294+
6295+
# Case 3: Prod is initialized, so plan comparisons moving forward should be against prod
6296+
output = plan_with_output(ctx, "prod")
6297+
assert "`prod` environment will be initialized" in output.stdout
6298+
6299+
# Case 4: Dev is updated with a breaking change. Prod exists now so plan comparisons moving forward should be against prod
6300+
create_temp_file(
6301+
tmp_path, models_dir / "a.sql", "MODEL (name test.a, kind FULL); SELECT 10 AS col"
6302+
)
6303+
ctx.load()
6304+
6305+
plan = ctx.plan_builder("dev").build()
6306+
6307+
assert (
6308+
next(iter(plan.context_diff.snapshots.values())).change_category
6309+
== SnapshotChangeCategory.BREAKING
6310+
)
6311+
6312+
output = plan_with_output(ctx, "dev")
6313+
assert "New environment `dev` will be created from `prod`" in output.stdout
6314+
assert "Differences from the `prod` environment" in output.stdout
6315+
6316+
# Case 5: Dev is updated with a metadata change, but comparison against prod shows both the previous and the current changes
6317+
# so it's still classified as a breaking change
6318+
create_temp_file(
6319+
tmp_path,
6320+
models_dir / "a.sql",
6321+
"MODEL (name test.a, kind FULL, owner 'test'); SELECT 10 AS col",
6322+
)
6323+
ctx.load()
6324+
6325+
plan = ctx.plan_builder("dev").build()
6326+
6327+
assert (
6328+
next(iter(plan.context_diff.snapshots.values())).change_category
6329+
== SnapshotChangeCategory.BREAKING
6330+
)
6331+
6332+
output = plan_with_output(ctx, "dev")
6333+
assert "New environment `dev` will be created from `prod`" in output.stdout
6334+
assert "Differences from the `prod` environment" in output.stdout
6335+
6336+
assert (
6337+
"""MODEL (
6338+
name test.a,
6339+
+ owner test,
6340+
kind FULL
6341+
)
6342+
SELECT
6343+
- 5 AS col
6344+
+ 10 AS col"""
6345+
in output.stdout
6346+
)
6347+
6348+
# Case 6: Ensure that target environment and create_from environment are not the same
6349+
output = plan_with_output(ctx, "prod")
6350+
assert "New environment `prod` will be created from `prod`" not in output.stdout
6351+
6352+
# Case 7: Check that we can still run Context::diff() against any environment
6353+
for environment in ["dev", "prod"]:
6354+
context_diff = ctx._context_diff(environment)
6355+
assert context_diff.environment == environment
