Skip to content

Rollup Lambda wrong sql query when using timezone #9638

@ernriveron

Description

@ernriveron

Describe the bug
Incorrect SQL query in Rollup Lambda when using time zone. The Lambda query range is calculated incorrectly.

To Reproduce

Enviroment

CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Madrid

Test Cube

cube(`Test`, {
    dataSource: `Test`,
    sql: `
        select '2025-01-15T10:23:45.000Z'::timestamp(3) as "createdAt", '1' as id
        union all
        select '2025-02-05T14:12:30.000Z'::timestamp(3) as "createdAt", '2' as id
        union all         
        select '2025-02-25T08:45:10.000Z'::timestamp(3) as "createdAt", '3' as id
        union all
        select '2025-03-10T11:34:56.000Z'::timestamp(3) as "createdAt", '4' as id
        union all
        select '2025-03-25T19:05:15.000Z'::timestamp(3) as "createdAt", '5' as id
        union all
        select '2025-04-01T06:50:00.000Z'::timestamp(3) as "createdAt", '6' as id
        union all
        select '2025-04-15T13:20:40.000Z'::timestamp(3) as "createdAt", '7' as id
        union all
        select '2025-05-05T17:45:30.000Z'::timestamp(3) as "createdAt", '8' as id
        union all
        select '2025-05-20T09:10:20.000Z'::timestamp(3) as "createdAt", '9' as id
        union all
        select '2025-05-30T23:59:59.000Z'::timestamp(3) as "createdAt", '10' as id        
    `,
    measures: {
        count: {
            type: 'count'
        }
    },
    dimensions: {
        createdAt: {
            sql: `${CUBE}."createdAt"`,
            type: `time`
        },
        id: {
            sql: `${CUBE}.id`,
            type: `string`
        },
    },
    preAggregations: {
        lambda: { // Speed layer: For data in the last 24h.
            type: `rollup_lambda`,
            union_with_source_data: true,
            rollups: [CUBE.main],
        },
        main: { // Batch layer: For data from 2021-01-01 until 24h ago.
            measures: [
                CUBE.count
            ],
            dimensions: [
                CUBE.id,
                CUBE.createdAt,
            ],
            timeDimension: CUBE.createdAt,
            granularity: `hour`,
            partitionGranularity: `month`,
            refreshKey: {
                every: `55 minutes`,
            },
            /*build_range_end: {
                sql: `SELECT NOW() - INTERVAL '24 hour'`,
            },*/
        }
    }
});

Generated sql query to build last partition

Build range end is not localized

   SELECT
       "test"."createdAt" "test__created_at", 
       "test".id "test__id", 
       date_trunc('hour', ("test"."createdAt"::timestamptz AT TIME ZONE 'Europe/Madrid')) "test__created_at_hour", 
       count(*) "test__count"
     FROM
       (
         select '2025-01-15T10:23:45.000Z'::timestamp(3) as "createdAt", '1' as id
         union all
         select '2025-02-05T14:12:30.000Z'::timestamp(3) as "createdAt", '2' as id
         union all
         select '2025-02-25T08:45:10.000Z'::timestamp(3) as "createdAt", '3' as id
         union all
         select '2025-03-10T11:34:56.000Z'::timestamp(3) as "createdAt", '4' as id
         union all
         select '2025-03-25T19:05:15.000Z'::timestamp(3) as "createdAt", '5' as id
         union all
         select '2025-04-01T06:50:00.000Z'::timestamp(3) as "createdAt", '6' as id
         union all
         select '2025-04-15T13:20:40.000Z'::timestamp(3) as "createdAt", '7' as id
         union all
         select '2025-05-05T17:45:30.000Z'::timestamp(3) as "createdAt", '8' as id
         union all
         select '2025-05-20T09:10:20.000Z'::timestamp(3) as "createdAt", '9' as id
         union all
         select '2025-05-30T23:59:59.000Z'::timestamp(3) as "createdAt", '10' as id
     ) AS "test"  WHERE ("test"."createdAt" >= '2025-04-30T22:00:00.000Z'::timestamptz AND "test"."createdAt" <= '2025-05-30T23:59:59.000Z'::timestamptz) GROUP BY 1, 2, 3

Generated Lambda sql query

{
  "timezone": "Europe/Madrid",
  "measures": [
    "Test.count"
  ]
}
SELECT
    "test"."createdAt" "test__created_at",
    "test".id "test__id",
    date_trunc(
        'hour', (
            "test"."createdAt":: timestamptz AT TIME ZONE 'Europe/Madrid'
        )
    ) "test__created_at_hour",
    count(*) "test__count"
FROM (
        select
            '2025-01-15T10:23:45.000Z':: timestamp(3) as "createdAt",
            '1' as id
        union all
        select
            '2025-02-05T14:12:30.000Z':: timestamp(3) as "createdAt",
            '2' as id
        union all
        select
            '2025-02-25T08:45:10.000Z':: timestamp(3) as "createdAt",
            '3' as id
        union all
        select
            '2025-03-10T11:34:56.000Z':: timestamp(3) as "createdAt",
            '4' as id
        union all
        select
            '2025-03-25T19:05:15.000Z':: timestamp(3) as "createdAt",
            '5' as id
        union all
        select
            '2025-04-01T06:50:00.000Z':: timestamp(3) as "createdAt",
            '6' as id
        union all
        select
            '2025-04-15T13:20:40.000Z':: timestamp(3) as "createdAt",
            '7' as id
        union all
        select
            '2025-05-05T17:45:30.000Z':: timestamp(3) as "createdAt",
            '8' as id
        union all
        select
            '2025-05-20T09:10:20.000Z':: timestamp(3) as "createdAt",
            '9' as id
        union all
        select
            '2025-05-30T23:59:59.000Z':: timestamp(3) as "createdAt",
            '10' as id
    ) AS "test"
WHERE (
        "test"."createdAt" > '2025-05-31T01:59:59.000Z':: timestamptz
    )
GROUP BY 1, 2, 3
LIMIT 200000

In a test cube we are not able to reproduce the availability of new inserted data but as we can see the end of build range for pre-aggregation was '2025-05-30T23:59:59.000Z' and the start of generated lambda query was '2025-05-31T01:59:59.000Z', so the end of build range is the maximum non-localized value for the time dimension and the start of lambda query is the maximum localized value for the time dimension resulting in missing records between '2025-05-30T23:59:59.000Z' and '2025-05-31T01:59:59.000Z'.

Version:
since 1.2.18

Additional context
Postgres as data source

Metadata

Metadata

Assignees

Labels

pre-aggregationsIssues related to pre-aggregations

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions