
geotrellissentinelhub.PyramidFactory does not filter spatial keys that are out of bounds #417

Open
JeroenVerstraelen opened this issue Apr 11, 2025 · 3 comments · May be fixed by #418

@JeroenVerstraelen
Contributor

JeroenVerstraelen commented Apr 11, 2025

Description

In org.openeo.geotrellissentinelhub.PyramidFactory, SpatialKeys that fall outside the layout definition's bounds are not filtered out:

var requiredKeysRdd: RDD[SpaceTimeKey] = requiredSpatialKeysForFeatures
  .map { case (SpatialKey(col, row), Feature(_, date)) => SpaceTimeKey(col, row, date) }
  .filter(k => k.col >= 0 && k.row >= 0)

Should become:

var requiredKeysRdd: RDD[SpaceTimeKey] = requiredSpatialKeysForFeatures
  .map { case (SpatialKey(col, row), Feature(_, date)) => SpaceTimeKey(col, row, date) }
  .filter(k => k.col >= 0 && k.row >= 0 && k.col < layout.tileLayout.layoutCols && k.row < layout.tileLayout.layoutRows)
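
The bounds check can also be tried in isolation. The following is only a minimal sketch, not the actual PyramidFactory code: `Key`, `GridLayout` and `BoundsFilter` are hypothetical stand-ins introduced here so the example runs without GeoTrellis or Spark, and a plain `Seq` takes the place of the RDD.

```scala
// Hypothetical stand-ins for SpaceTimeKey and TileLayout (assumptions for this sketch).
final case class Key(col: Int, row: Int)
final case class GridLayout(layoutCols: Int, layoutRows: Int)

object BoundsFilter {
  // True only when the key addresses a tile inside the layout grid:
  // 0 <= col < layoutCols and 0 <= row < layoutRows.
  def inBounds(layout: GridLayout)(k: Key): Boolean =
    k.col >= 0 && k.row >= 0 &&
      k.col < layout.layoutCols && k.row < layout.layoutRows

  // Drops every key that falls outside the grid, keeping the original order.
  def filterKeys(layout: GridLayout, keys: Seq[Key]): Seq[Key] =
    keys.filter(inBounds(layout))
}
```

With a grid of 2 columns and 1 row, `BoundsFilter.filterKeys(GridLayout(2, 1), keys)` would keep only keys whose row is 0 and whose column is 0 or 1; checking only the lower bound would let keys with row 1 through.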

Minimal process graph (PG):

https://gist.github.com/JeroenVerstraelen/0971bffc8936b604dd67d3194a3b276e
Here, the layout definition is 2x1, but SpatialKey(0,1) and SpatialKey(1,1) are still included in the requiredKeys. This causes an error in the partitioners further down the line that is difficult for users to understand:

OpenEO batch job failed: Exception during Spark execution: java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
@jdries
Contributor

jdries commented Apr 16, 2025

@JeroenVerstraelen do we understand why those keys are there in the first place?
Is this the issue that the evoland user reported? Do we need to get this through review quickly?

@JeroenVerstraelen
Contributor Author

JeroenVerstraelen commented Apr 18, 2025

@jdries It looks like the multiPolygonBuffered variable causes this.
In the screenshot you can see that the layout is constructed from the original multipolygons (passed to datacube_seq as 'polygons'). However, requiredSpatialKeysForFeatures is constructed from multiPolygonBuffered. This buffered polygon extends beyond the layout definition's extent, so it produces spatial keys that are either negative or larger than the layout allows.

Image

In green you can see the spatial keys it creates:
Image
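
The effect of the buffer can be reproduced with a small self-contained sketch. It is not the code from the repository: `Box` and `SimpleLayout` are hypothetical types, and the key computation assumes a simplified transform in which columns are counted from the left edge and rows from the top edge of the layout extent, with no special handling of coordinates that fall exactly on a tile boundary.

```scala
// Hypothetical axis-aligned bounding box (assumption for this sketch).
final case class Box(xmin: Double, ymin: Double, xmax: Double, ymax: Double) {
  // Grows the box by d on every side, like buffering a polygon's envelope.
  def buffer(d: Double): Box = Box(xmin - d, ymin - d, xmax + d, ymax + d)
}

// Hypothetical layout: an extent divided into layoutCols x layoutRows tiles.
final case class SimpleLayout(extent: Box, layoutCols: Int, layoutRows: Int) {
  private val tileWidth  = (extent.xmax - extent.xmin) / layoutCols
  private val tileHeight = (extent.ymax - extent.ymin) / layoutRows

  // Unclamped: coordinates outside the extent give negative or too-large indices.
  def col(x: Double): Int = math.floor((x - extent.xmin) / tileWidth).toInt
  def row(y: Double): Int = math.floor((extent.ymax - y) / tileHeight).toInt

  // All (col, row) pairs touched by the given box.
  def keysFor(b: Box): Seq[(Int, Int)] =
    for {
      c <- col(b.xmin) to col(b.xmax)
      r <- row(b.ymax) to row(b.ymin)
    } yield (c, r)

  def contains(key: (Int, Int)): Boolean =
    key._1 >= 0 && key._2 >= 0 && key._1 < layoutCols && key._2 < layoutRows
}
```

For a 2x1 layout over the extent (0, 0, 20, 10), buffering that extent by 1 unit and calling `keysFor` yields columns from -1 to 2 and rows from -1 to 1, which includes (0,1) and (1,1) as well as negative keys; only (0,0) and (1,0) satisfy `contains`.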

@JeroenVerstraelen
Contributor Author

@jdries is it okay to merge this PR and close this issue?
#418
