Skip to content

Conversation

ttusing
Copy link

@ttusing ttusing commented Apr 29, 2025

resolves #1017

Problem

See #1017 - passing both include and column_override causes column_override to be case sensitive and silently fail/default to using first relation's datatypes.

AI summary from Claude:

Looking at the code, I can explain why the case sensitivity issue occurs with the include parameter. Here's what's happening in the original code:

1. When columns are processed from each relation, there's a check for inclusion that uses lowercase comparison:
{%- elif include and col.column | lower not in all_includes -%}

2. The column names that pass this check are stored in column_superset using their original case:
{%- do column_superset.update({col.column: col}) -%}

3. Later, when applying the column_override, the lookup is done using the original case of the column name from column_superset, but the column_override dictionary is used as-is without any case conversion.

This creates a mismatch because:

* The include filter converts column names to lowercase for comparison
* But the resulting column_superset maintains the original case of the columns
* Then the column_override lookup fails because it's trying to match the original case against whatever case was provided in the column_override dictionary
* This is why when using include, the column_override needs to match the original case of the columns in the source tables - because the case preservation in column_superset creates a case-sensitive context for the column_override lookup.

Solution

Continuing AI generated output:

Our fix addresses this by converting both the column names and types in column_override to lowercase at the beginning:

{%- set lowercase_column_override = {} -%}
{%- if column_override is not none -%}
    {%- for col, type in column_override.items() -%}
        {%- do lowercase_column_override.update({col | lower: type | lower}) -%}
    {%- endfor -%}
{%- endif -%}

And then using lowercase comparison when looking up the override:

{%- set col_type = lowercase_column_override.get(col.column | lower, col.data_type) %}

This makes the behavior consistent regardless of whether include is used or not. The bug report should emphasize that this is a consistency fix that makes the column_override behavior case-insensitive in all scenarios, matching the case-insensitive behavior that already exists for the include and exclude parameters.

I tested this behavior locally and it works.

Checklist

  • This code is associated with an issue which has been triaged and accepted for development.
  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • [] This PR includes tests, or tests are not required/relevant for this PR
  • [] I have updated the README.md (if applicable)

@ttusing ttusing requested a review from a team as a code owner April 29, 2025 23:12
@ttusing
Copy link
Author

ttusing commented Apr 30, 2025

@ryramshaka
Copy link

Can someone approve I need this over 'id' and 'Id' from source tables 😭

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

union_relations: passing include parameter makes column_override fields case sensitive in data types quietly

2 participants