The child table column remains in the schema as a partial column with seen-null-first=True #3125

@anuunchin

dlt version

Latest

Describe the problem

When a column initially contains only None values, it is saved in the schema as a partial column with x-normalizer.seen-null-first set to True. Later, when data arrives in this column as a nested structure, a child table is created, but the partial column that was created earlier is not removed from the parent table's schema.
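To make the stale state concrete, here is a minimal sketch that models the schema as plain dictionaries and detects the leftover column. The dictionary layout (a partial column being one without `data_type`, the hint stored under `"x-normalizer"`, and the `parent__column` child table name) is a simplified assumption based on the description above, and `find_stale_partial_columns` is a hypothetical helper, not part of dlt.

```python
from typing import Any, Dict, List, Tuple

# Simplified stand-in for a schema's tables; not dlt's real schema objects.
TableSchemas = Dict[str, Dict[str, Any]]


def find_stale_partial_columns(tables: TableSchemas) -> List[Tuple[str, str]]:
    """Return (table, column) pairs for partial columns that were first seen
    as null and for which a child table now exists."""
    stale = []
    for table_name, table in tables.items():
        for col_name, col in table.get("columns", {}).items():
            # Assumption: a partial column has no data_type yet.
            is_partial = "data_type" not in col
            seen_null_first = col.get("x-normalizer", {}).get("seen-null-first") is True
            # Assumption: child tables are named <parent>__<column>.
            child_name = f"{table_name}__{col_name}"
            if is_partial and seen_null_first and child_name in tables:
                stale.append((table_name, col_name))
    return stale


# State described in this issue after the second run (illustrative values).
schema_after_second_run: TableSchemas = {
    "my_table": {
        "columns": {
            "id": {"name": "id", "data_type": "bigint"},
            "children": {"name": "children", "x-normalizer": {"seen-null-first": True}},
        }
    },
    "my_table__children": {
        "parent": "my_table",
        "columns": {
            "id": {"name": "id", "data_type": "bigint"},
            "name": {"name": "name", "data_type": "text"},
        },
    },
}

print(find_stale_partial_columns(schema_after_second_run))
```

With the state above, the helper reports `children` on `my_table` as the column that should no longer be present.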

Expected behavior

The partial column should be removed from the parent table's schema once a child table has been created for it.
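One way to picture the expected outcome is a pruning step over a dictionary model of the schema. This is only a sketch under stated assumptions: the dictionary layout, the `parent__column` naming, and the helper `prune_promoted_columns` are hypothetical and do not reflect how dlt implements or should implement the fix.

```python
import copy
from typing import Any, Dict

# Simplified stand-in for a schema's tables; not dlt's real schema objects.
TableSchemas = Dict[str, Dict[str, Any]]


def prune_promoted_columns(tables: TableSchemas) -> TableSchemas:
    """Return a copy of the tables in which seen-null-first partial columns
    are dropped when a matching child table exists."""
    pruned = copy.deepcopy(tables)
    for table_name, table in pruned.items():
        columns = table.get("columns", {})
        # Iterate over a snapshot of the keys because columns are deleted in the loop.
        for col_name in list(columns):
            col = columns[col_name]
            is_partial = "data_type" not in col  # assumption: no data_type yet
            seen_null_first = col.get("x-normalizer", {}).get("seen-null-first") is True
            if is_partial and seen_null_first and f"{table_name}__{col_name}" in pruned:
                del columns[col_name]
    return pruned


# Illustrative input: parent table with a stale partial column plus its child table.
before: TableSchemas = {
    "my_table": {
        "columns": {
            "id": {"name": "id", "data_type": "bigint"},
            "children": {"name": "children", "x-normalizer": {"seen-null-first": True}},
        }
    },
    "my_table__children": {"parent": "my_table", "columns": {}},
}

after = prune_promoted_columns(before)
print(sorted(after["my_table"]["columns"]))
```

Returning a copy keeps the input untouched, which makes the before and after states easy to compare in a test.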

Steps to reproduce

This test should pass:

import dlt
from dlt.common.utils import uniq_id
from dlt.destinations import dummy

# EXPORT_SCHEMA_PATH, EXAMPLE_DATA and _get_export_schema are assumed to be
# helpers provided by the surrounding test module.


def test_empty_column_later_becoming_child_table_removed() -> None:
    name = "schema_test" + uniq_id()
    p = dlt.pipeline(
        pipeline_name=name,
        destination=dummy(completed_prob=1),
        export_schema_path=EXPORT_SCHEMA_PATH,
    )

    @dlt.resource(table_name="my_table")
    def nested_data(children):
        # Copy the example row so the shared EXAMPLE_DATA is not mutated.
        yield {**EXAMPLE_DATA[0], "children": children}

    # First run: "children" is only None, so it becomes a partial column.
    p.run(nested_data(None))

    # Second run: "children" is a nested list, so a child table is created.
    p.run(nested_data([{"id": 2, "name": "Max"}, {"id": 3, "name": "Julia"}]))

    # The partial column should be gone from both the exported and the live schema.
    export_schema = _get_export_schema(name)
    assert "children" not in export_schema.tables["my_table"]["columns"]
    assert "my_table__children" in export_schema.tables
    assert "children" not in p.default_schema.tables["my_table"]["columns"]
    assert "my_table__children" in p.default_schema.tables

Operating system

macOS

Runtime environment

Local

Python version

3.10

dlt data source

Affects all sources

dlt destination

DuckDB

Metadata

Labels

bug (Something isn't working)

Status

In Progress
