Open
Labels
OGC API - Features, bug
Description
The Postgres provider currently performs a full .count() on every request to populate numberMatched, which significantly slows down queries on large tables. API responses are therefore much slower than necessary, especially when a request selects only a small subset of the features in a large dataset.
Steps to Reproduce
- Set up pygeoapi with a large Postgres table as a data source.
- Make a request to an endpoint that queries the table.
- Observe that the response time is impacted by the .count() operation.
Expected behavior
Queries should return results faster by avoiding expensive .count() operations on large tables.
Potential workarounds are:
- Introduce an environment variable to control whether the .count() is always run.
- Implement an upper limit on .count() queries, with a fallback mechanism such as TABLESAMPLE or estimated counts from Postgres' pg_stat tables.
- Use an approximate count method when exact counts are not required. This is implemented in https://github.com/cgs-earth/pygeoapi-plugins/blob/master/pygeoapi_plugins/provider/postgresql.py
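To illustrate the upper-limit idea, here is a minimal sketch of a capped count: the count runs over a subquery with a LIMIT, so the database stops scanning once the cap is exceeded. This is not the provider's code. The function name `capped_count`, its parameters and the `features` table are hypothetical, and the standard-library `sqlite3` module stands in for Postgres only so the example runs without a server. In Postgres the same query shape applies, and a fallback estimate could be used when the cap is hit.

```python
import sqlite3


def capped_count(conn, table, where="1=1", params=(), cap=10000):
    """Return (count, is_exact); stop counting once more than `cap` rows match.

    Hypothetical helper. `table` and `where` are interpolated into the SQL,
    so they must come from trusted configuration, never from user input.
    """
    sql = (
        "SELECT count(*) FROM "
        f"(SELECT 1 FROM {table} WHERE {where} LIMIT ?) AS capped"
    )
    # Ask for cap + 1 rows so "exactly cap" and "more than cap" can be told apart.
    n = conn.execute(sql, (*params, cap + 1)).fetchone()[0]
    if n > cap:
        return cap, False  # lower bound only; the caller may fall back to an estimate
    return n, True


# Demo data: 1000 rows, half of kind 'a' and half of kind 'b'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO features (kind) VALUES (?)",
    [("a" if i % 2 else "b",) for i in range(1000)],
)

small = capped_count(conn, "features", "kind = ?", ("a",), cap=10000)
large = capped_count(conn, "features", cap=100)
print(small, large)
```

When the second element is False, the returned number is only a lower bound, so numberMatched would need to be omitted or replaced with an estimate.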
Environment
- OS: Mac
- Python version: 3.10
- pygeoapi version: 0.20.dev0
Additional context
Similar constraints exist in other providers, with some only counting features when resulttype=hits.
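A minimal sketch of how such a guard could look, combined with the environment-variable idea from the workarounds. The function `should_count` and the variable name `PYGEOAPI_SKIP_COUNT` are hypothetical and are not existing pygeoapi settings.

```python
import os


def should_count(resulttype, env=os.environ):
    """Decide whether to run an exact count for this request.

    Hypothetical helper: always count for resulttype=hits, otherwise
    count unless the (assumed) PYGEOAPI_SKIP_COUNT variable is truthy.
    """
    if resulttype == "hits":
        return True
    flag = env.get("PYGEOAPI_SKIP_COUNT", "false").strip().lower()
    return flag not in ("1", "true", "yes")
```

With this shape, the default behavior is unchanged (the count still runs) and operators of large tables can opt out explicitly.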