refactor: Integrate the materialized CTE into the plan and pipeline #18226

Open
wants to merge 54 commits into
base: main

SkyFan2002 (Member) commented Jun 23, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR improves the execution of materialized CTEs. Previously, a temporary table was created in the bind phase. This had several drawbacks:

  1. The optimizer couldn't jointly optimize the CTE and the main query.
  2. The output of "explain" and "profile" was inaccurate.
  3. Automatic memory spilling of materialized CTEs at the query level wasn't possible.

The proposed changes integrate materialized CTEs more tightly into the planning and optimization stages. Note that this PR does not implement distributed execution of materialized CTEs, which is planned for subsequent PRs.

Example

explain with t1 as materialized (select number as a from numbers(10)), t2 as materialized (select a as b from t1) select t1.a from t1 join t2 on t1.a = t2.b;
----
MaterializedCTE: t1
├── TableScan
│   ├── table: default.system.numbers
│   ├── output columns: [number (#3)]
│   ├── read rows: 10
│   ├── read size: < 1 KiB
│   ├── partitions total: 1
│   ├── partitions scanned: 1
│   ├── push downs: [filters: [], limit: NONE]
│   └── estimated rows: 10.00
└── MaterializedCTE: t2
    ├── CTEConsumer
    │   ├── cte_name: t1
    │   └── cte_schema: [number (#2)]
    └── HashJoin
        ├── output columns: [numbers.number (#0)]
        ├── join type: INNER
        ├── build keys: [t2.b (#1)]
        ├── probe keys: [t1.a (#0)]
        ├── keys is null equal: [false]
        ├── filters: []
        ├── build join filters:
        │   └── filter id:0, build key:t2.b (#1), probe key:t1.a (#0), filter type:bloom,inlist,min_max
        ├── estimated rows: 0.00
        ├── CTEConsumer(Build)
        │   ├── cte_name: t2
        │   └── cte_schema: [number (#1)]
        └── CTEConsumer(Probe)
            ├── cte_name: t1
            └── cte_schema: [number (#0)]
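
To make the plan shape above easier to read, here is a toy Python model of it. This is only a sketch and not Databend's Rust implementation: every class and function name (`Scan`, `Rename`, `CTEConsumer`, `HashJoin`, `MaterializedCTE`, `execute`) is hypothetical, and the real pipeline is assumed to be considerably more involved. The idea it illustrates is that a MaterializedCTE node runs its first child once, stores the rows under the CTE name, and then runs its second child, where each CTEConsumer reads the stored rows instead of recomputing them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

Row = Dict[str, int]


@dataclass
class Scan:
    """Leaf node; `reads` counts how many times the source is scanned."""
    rows: List[Row]
    reads: int = 0


@dataclass
class Rename:
    """Projection that maps output column name -> input column name."""
    child: "Plan"
    mapping: Dict[str, str]


@dataclass
class CTEConsumer:
    """Reads rows that a MaterializedCTE node stored under `cte_name`."""
    cte_name: str


@dataclass
class HashJoin:
    """Inner equi-join: hash the build side, then probe it."""
    build: "Plan"
    probe: "Plan"
    build_key: str
    probe_key: str


@dataclass
class MaterializedCTE:
    """Runs `definition` once, stores it, then runs `body`."""
    cte_name: str
    definition: "Plan"
    body: "Plan"


Plan = Union[Scan, Rename, CTEConsumer, HashJoin, MaterializedCTE]


def execute(plan: Plan, ctx: Optional[Dict[str, List[Row]]] = None) -> List[Row]:
    # ctx maps a CTE name to its materialized rows (hypothetical shared state).
    ctx = {} if ctx is None else ctx
    if isinstance(plan, Scan):
        plan.reads += 1
        return list(plan.rows)
    if isinstance(plan, Rename):
        rows = execute(plan.child, ctx)
        return [{out: row[src] for out, src in plan.mapping.items()} for row in rows]
    if isinstance(plan, CTEConsumer):
        return ctx[plan.cte_name]
    if isinstance(plan, HashJoin):
        table: Dict[int, List[Row]] = {}
        for row in execute(plan.build, ctx):
            table.setdefault(row[plan.build_key], []).append(row)
        joined = []
        for row in execute(plan.probe, ctx):
            for match in table.get(row[plan.probe_key], []):
                joined.append({**row, **match})
        return joined
    if isinstance(plan, MaterializedCTE):
        ctx[plan.cte_name] = execute(plan.definition, ctx)
        return execute(plan.body, ctx)
    raise TypeError(f"unknown plan node: {plan!r}")


# Mirrors the example query: t1 scans numbers(10), t2 reads t1, the join reads both.
numbers = Scan([{"number": n} for n in range(10)])
plan = MaterializedCTE(
    "t1",
    Rename(numbers, {"a": "number"}),
    MaterializedCTE(
        "t2",
        Rename(CTEConsumer("t1"), {"b": "a"}),
        HashJoin(CTEConsumer("t2"), CTEConsumer("t1"), build_key="b", probe_key="a"),
    ),
)
result = [row["a"] for row in execute(plan)]
```

In this model, `numbers.reads` stays at 1 even though `t1` is consumed twice (once by `t2` and once by the probe side of the join), which is the sharing behavior that materialization is meant to provide.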

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Jun 23, 2025
@databendlabs databendlabs deleted comments from github-actions bot (8 on Jun 30, 3 on Jul 10, and 1 on Jul 17, 2025)
@SkyFan2002 SkyFan2002 marked this pull request as ready for review July 18, 2025 08:03
@SkyFan2002 SkyFan2002 requested review from zhang2014 and sundy-li July 18, 2025 08:03