@Copilot Copilot AI commented Aug 1, 2025

When a column serves as both a primary key and a foreign key, dm_flatten_to_tbl(.recursive = TRUE) fails with the error "Join columns in `x` must be present in the data." The sequential join logic does not handle column disambiguation correctly when flattening recursively.

Problem

Consider this data model where y.c is both a primary key and foreign key:

library(dm)

x <- tibble(a = 1, b = 2)
y <- tibble(c = 2, d = 3)  
z <- tibble(e = 2, f = 4)

mydm <- dm(x, y, z) %>%
  dm_add_pk(x, a) %>%
  dm_add_pk(y, c) %>%
  dm_add_pk(z, e) %>%
  dm_add_fk(x, b, y) %>%
  dm_add_fk(y, c, z)  # y.c is both PK and FK

mydm %>% dm_flatten_to_tbl(x, .recursive = TRUE)
#> Error in `join()`:
#> ! Join columns in `x` must be present in the data.
#> ✖ Problem with `c`.

The issue occurs during the join sequence:

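  1. x LEFT JOIN y ON x.b = y.c → with dplyr's default keep = FALSE, the result has columns (a, b, d): the key y.c is merged into b
  2. result LEFT JOIN z ON result.c = z.e → the join fails because c is no longer present under that name (for example, if it was folded into b or renamed to c.y during disambiguation)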
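The join sequence can be reproduced in plain dplyr, outside dm. This is only a minimal sketch of the failure mode, not the code path dm actually runs; the names step1 and step2 are illustrative. Note that left_join() with the default keep = FALSE retains only the left-hand key column.

```r
library(dplyr)

x <- tibble(a = 1, b = 2)
y <- tibble(c = 2, d = 3)
z <- tibble(e = 2, f = 4)

# Step 1: with the default keep = FALSE, only x's key column (b) survives;
# y's key column (c) is not part of the result.
step1 <- left_join(x, y, by = c("b" = "c"))
names(step1)
#> [1] "a" "b" "d"

# Step 2: joining on c fails, because c is not a column of step1.
# The error is caught here so the script keeps running.
step2 <- tryCatch(
  left_join(step1, z, by = c("c" = "e")),
  error = function(e) conditionMessage(e)
)

# Joining on the surviving key column works, since b holds the values of y.c.
step2_ok <- left_join(step1, z, by = c("b" = "e"))
```

Here step2 ends up holding an error message rather than a table, while step2_ok contains the columns a, b, d and f.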

Solution

Modified the recursive flattening logic in dm_flatten_to_tbl_impl() to:

  1. Use iterative joins instead of reduce2() for recursive flattening
  2. Check column availability before each join operation
  3. Handle disambiguation by detecting when join columns have been renamed (e.g., c → c.y) and mapping to the correct disambiguated names
  4. Preserve original behavior for non-recursive flattening

The fix uses pattern matching to find disambiguated column names and falls back gracefully when no mapping can be determined.
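The idea of iterating over the joins and checking column availability before each one can be sketched in plain dplyr. This is not the code in dm_flatten_to_tbl_impl(): join_iteratively() is a hypothetical helper, and instead of pattern matching on suffixed names it keeps an explicit alias map from each dropped right-hand key to the left-hand column that replaced it. It assumes every by vector is fully named and does not handle name clashes among non-key columns.

```r
library(dplyr)

# Hypothetical helper, not part of dm: joins `tables` onto `start` one by one.
# `bys` holds one named character vector per join: c(lhs_col = "rhs_col").
join_iteratively <- function(start, tables, bys) {
  result <- start
  # Maps the name of a dropped rhs key column to the lhs column that now
  # carries its values.
  alias <- character()

  for (i in seq_along(tables)) {
    by <- bys[[i]]
    lhs <- names(by)

    # Redirect lhs keys that an earlier join folded into another column.
    known <- lhs %in% names(alias)
    lhs[known] <- alias[lhs[known]]

    # Check column availability before attempting the join.
    missing_cols <- setdiff(lhs, names(result))
    if (length(missing_cols) > 0) {
      stop("Cannot resolve join column(s): ", toString(missing_cols))
    }

    names(by) <- lhs
    result <- left_join(result, tables[[i]], by = by)

    # With keep = FALSE, the rhs key columns now live under the lhs names.
    alias[unname(by)] <- names(by)
  }
  result
}

x <- tibble(a = 1, b = 2)
y <- tibble(c = 2, d = 3)
z <- tibble(e = 2, f = 4)

# The second join asks for c, which the helper redirects to b.
flat <- join_iteratively(x, list(y, z), list(c(b = "c"), c(c = "e")))
```

With the example tables, flat has one row and the columns a, b, d and f. An explicit alias map avoids guessing at suffix patterns, at the cost of having to record every join as it happens.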

Testing

Added a comprehensive test case that reproduces the exact scenario from the issue, ensuring that recursive flattening works correctly when columns serve dual PK/FK roles.

Fixes #2234.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • cloud.r-project.org
    • Triggering command: /usr/lib/R/bin/exec/R -e install.packages(c('tibble',~+~'dplyr',~+~'testthat',~+~'igraph',~+~'cli',~+~'rlang',~+~'lifecycle',~+~'vctrs',~+~'pillar',~+~'glue'),~+~repos='REDACTED') (dns block)
  • cran.rstudio.com
    • Triggering command: /usr/lib/R/bin/exec/R -e install.packages(c('tibble',~+~'dplyr',~+~'testthat',~+~'igraph',~+~'cli',~+~'rlang',~+~'lifecycle',~+~'vctrs',~+~'pillar',~+~'glue'),~+~repos='REDACTED') (dns block)
  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https (dns block)


Copilot AI and others added 2 commits August 1, 2025 05:32
…ign key

Co-authored-by: krlmlr <1741643+krlmlr@users.noreply.github.com>
Co-authored-by: krlmlr <1741643+krlmlr@users.noreply.github.com>
@Copilot Copilot AI changed the title from "[WIP] Recursive flattening failure with combined primary and foreign key" to "Fix recursive flattening failure with combined primary and foreign key" Aug 1, 2025
@Copilot Copilot AI requested a review from krlmlr August 1, 2025 05:36
Copilot finished work on behalf of krlmlr August 1, 2025 05:36