Commit 807ac20

Authored by Frank Greguska (frankinspace) and co-author

Add regression test for net2cog (#94)

* Add regression test for net2cog

* Add regression test for net2cog

* updates after review

---------

Co-authored-by: Frank Greguska <Francis.Greguska@jpl.nasa.gov>

1 parent: 2acef0b

File tree

10 files changed (+362, -3 lines changed)


.github/workflows/build-all-images.yml

Lines changed: 3 additions & 0 deletions

@@ -45,6 +45,9 @@ jobs:
           - image: "geoloco"
             notebook: "Geoloco_Regression.ipynb"
 
+          - image: "net2cog"
+            notebook: "net2cog_Regression.ipynb"
+
     uses: ./.github/workflows/build-target-image.yml
     with:
       image-short-name: ${{ matrix.targets.image }}

CHANGELOG.md

Lines changed: 3 additions & 0 deletions

@@ -5,6 +5,9 @@ versioning. Rather than a static releases, this repository contains of a number
 of regression tests that are each semi-independent. This CHANGELOG file should be used
 to document pull requests to this repository.
 
+## 2024-08-30
+
+Add regression test for net2cog
 
 ## 2024-08-05 ([#86](https://github.com/nasa/harmony-regression-tests/pull/86))

script/test-in-bamboo.sh

Lines changed: 1 addition & 1 deletion

@@ -42,7 +42,7 @@ echo "harmony host url: ${harmony_host_url}"
 ## e.g. if REGRESSION_TESTS_N2Z_IMAGE environment was set, the value would be used instead of the default.
 
 image_names=()
-all_tests=(harmony harmony-regression hoss hga n2z swath-projector trajectory-subsetter variable-subsetter regridder hybig geoloco)
+all_tests=(harmony harmony-regression hoss hga n2z swath-projector trajectory-subsetter variable-subsetter regridder hybig geoloco net2cog)
 for image in "${all_tests[@]}"; do
   image_names+=($(image_name "$image" true))
 done

test/Makefile

Lines changed: 4 additions & 1 deletion

@@ -31,4 +31,7 @@ variable-subsetter-image: Dockerfile variable-subsetter/environment.yaml
 geoloco-image: Dockerfile geoloco/environment.yaml
 	docker build -t ghcr.io/nasa/regression-tests-geoloco:latest -f ./Dockerfile --build-arg notebook=Geoloco_Regression.ipynb --build-arg sub_dir=geoloco .
 
-images: harmony-image harmony-regression-image hga-image hoss-image hybig-image n2z-image regridder-image swath-projector-image trajectory-subsetter-image variable-subsetter-image geoloco-image
+net2cog-image: Dockerfile net2cog/environment.yaml
+	docker build -t ghcr.io/nasa/regression-tests-net2cog:latest -f ./Dockerfile --build-arg notebook=net2cog_Regression.ipynb --build-arg sub_dir=net2cog .
+
+images: harmony-image harmony-regression-image hga-image hoss-image hybig-image n2z-image regridder-image swath-projector-image trajectory-subsetter-image variable-subsetter-image geoloco-image net2cog-image

test/net2cog/environment.yaml

Lines changed: 14 additions & 0 deletions (new file)

name: papermill-net2cog
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python=3.10
  - notebook=6.5.4
  - papermill=2.3.4
  - rasterio=1.3.7
  - rio-cogeo=5.3.4
  - numpy=1.24.3
  - pip=23.1.2
  - pip:
      - harmony-py==0.4.8

test/net2cog/net2cog_Regression.ipynb

Lines changed: 288 additions & 0 deletions (new file)

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "32af63de",
   "metadata": {},
   "source": [
    "# net2cog regression tests\n",
    "\n",
    "This Jupyter notebook runs a suite of regression tests against the net2cog Harmony Service.\n",
    "\n",
    "These tests use [SMAP_RSS_L3_SSS_SMI_8DAY-RUNNINGMEAN_V4](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1234410736-POCLOUD) as netcdf input data to test the net2cog service for the smap_sss variable.\n",
    "\n",
    "## Set the Harmony environment:\n",
    "\n",
    "The cell below sets the `harmony_host_url` to one of the following valid values:\n",
    "\n",
    "* Production: <https://harmony.earthdata.nasa.gov>\n",
    "* UAT: <https://harmony.uat.earthdata.nasa.gov>\n",
    "* SIT: <https://harmony.sit.earthdata.nasa.gov>\n",
    "* Local: <http://localhost:3000>\n",
    "\n",
    "The default value is for the UAT environment. When using this notebook there are two ways to use the non-default environment:\n",
    "\n",
    "* Run this notebook in a local Jupyter notebook server and change the value of `harmony_host_url` in the cell below to the value for the environment you require from the above list.\n",
    "\n",
    "* Use the `run_notebooks.sh` script, which requires you to declare an environment variable `HARMONY_HOST_URL`. Set that environment variable to the value above that corresponds to the environment you want to test. That environment variable will take precedence over the default value in the cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dec3bc66",
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e969d81",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "\n",
    "The dependencies for this notebook are listed in the [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:\n",
    "\n",
    "`conda env create -f ./environment.yaml && conda activate papermill-net2cog`\n",
    "\n",
    "A `.netrc` file must also be located in the `test` directory of this repository."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "802241b5",
   "metadata": {},
   "source": [
    "### Import required packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "295b8341",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from tempfile import TemporaryDirectory\n",
    "\n",
    "from harmony import BBox, Collection, Environment, Client, Request\n",
    "from harmony.harmony import ProcessingFailedException\n",
    "from numpy.testing import assert_array_almost_equal\n",
    "import rasterio\n",
    "from rasterio.transform import Affine\n",
    "from rasterio.crs import CRS\n",
    "\n",
    "import utility\n",
    "\n",
    "\n",
    "reference_dir = Path('./reference_data')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b644811",
   "metadata": {},
   "source": [
    "### Set up environment dependent variables:\n",
    "\n",
    "This includes the Harmony `Client` object and `Collection` objects for each of the collections for which there are regression tests. The local, SIT and UAT Harmony instances all utilise resources from CMR UAT, meaning any non-production environment will use the same resources.\n",
    "\n",
    "When adding a production entry to the dictionary below, the collection instances can be included directly in the production dictionary entry, as they do not need to be shared."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "437af5f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "non_production_collection = {\n",
    "    'smap_collection': Collection(id='C1234410736-POCLOUD'),\n",
    "}\n",
    "\n",
    "non_prod_granule_data = {\n",
    "    'smap_granules': ['G1234601650-POCLOUD'],\n",
    "}\n",
    "\n",
    "collection_data = {\n",
    "    'https://harmony.uat.earthdata.nasa.gov': {\n",
    "        **non_production_collection,\n",
    "        **non_prod_granule_data,\n",
    "        'env': Environment.UAT,\n",
    "    },\n",
    "    'https://harmony.sit.earthdata.nasa.gov': {\n",
    "        **non_production_collection,\n",
    "        **non_prod_granule_data,\n",
    "        'env': Environment.SIT,\n",
    "    },\n",
    "    'http://localhost:3000': {\n",
    "        **non_production_collection,\n",
    "        **non_prod_granule_data,\n",
    "        'env': Environment.LOCAL,\n",
    "    },\n",
    "}\n",
    "\n",
    "environment_information = collection_data.get(harmony_host_url)\n",
    "\n",
    "if environment_information is not None:\n",
    "    harmony_client = Client(env=environment_information['env'])\n",
    "    endpoint_url = environment_information.get('endpoint_url', None)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd8d6cb3",
   "metadata": {},
   "source": [
    "## Test conversion of sss_smap variable\n",
    "\n",
    "Use SMAP data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc7d75c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "if environment_information is not None:\n",
    "\n",
    "    smap_request = Request(\n",
    "        collection=environment_information['smap_collection'],\n",
    "        granule_id=environment_information['smap_granules'][0],\n",
    "        variables=['sss_smap'],\n",
    "        max_results=1,\n",
    "        format='image/tiff',\n",
    "    )\n",
    "    print(harmony_client.request_as_curl(smap_request))\n",
    "\n",
    "    smap_job_id = harmony_client.submit(smap_request)\n",
    "    harmony_client.wait_for_processing(smap_job_id, show_progress=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "22348be6",
   "metadata": {},
   "outputs": [],
   "source": [
    "with TemporaryDirectory() as temp_dir:\n",
    "    downloaded_cog_outputs = [\n",
    "        file_future.result()\n",
    "        for file_future in harmony_client.download_all(\n",
    "            smap_job_id, overwrite=True, directory=temp_dir\n",
    "        )\n",
    "    ]\n",
    "\n",
    "    for cog_file in downloaded_cog_outputs:\n",
    "        utility.validate_cog(cog_file)\n",
    "\n",
    "    expected_metadata = {\n",
    "        'driver': 'GTiff',\n",
    "        'dtype': 'float32',\n",
    "        'nodata': -9999.0,\n",
    "        'width': 1440,\n",
    "        'height': 720,\n",
    "        'count': 1,\n",
    "        'crs': CRS.from_epsg(4326),\n",
    "        'transform': Affine(0.25, 0.0, 0.0, 0.0, 0.25, -90.0),\n",
    "    }\n",
    "    reference_file = Path(\n",
    "        './reference_data',\n",
    "        'RSS_smap_SSS_L3_8day_running_2020_005_FNL_v04.0_converted_sss_smap.tiff',\n",
    "    )\n",
    "\n",
    "    utility.assert_dataset_produced_correct_results(\n",
    "        cog_file, expected_metadata, reference_file\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1edc6d53",
   "metadata": {},
   "source": [
    "## Test conversion of ALL variables FAILS\n",
    "\n",
    "net2cog only supports conversion of a single variable within a netcdf file. This tests that an appropriate error message is shown if more than one variable is specified as input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa3ec02c",
   "metadata": {},
   "outputs": [],
   "source": [
    "if environment_information is not None:\n",
    "\n",
    "    smap_request = Request(\n",
    "        collection=environment_information['smap_collection'],\n",
    "        granule_id=environment_information['smap_granules'][0],\n",
    "        variables=['all'],\n",
    "        max_results=1,\n",
    "        format='image/tiff',\n",
    "    )\n",
    "    print(harmony_client.request_as_curl(smap_request))\n",
    "\n",
    "    smap_job_id = harmony_client.submit(smap_request)\n",
    "    raised_expected_error = False\n",
    "    try:\n",
    "        harmony_client.wait_for_processing(smap_job_id, show_progress=True)\n",
    "    except ProcessingFailedException as error:\n",
    "        assert (\n",
    "            \"net2cog harmony adapter currently only supports processing one variable at a time\"\n",
    "            in str(error)\n",
    "        )\n",
    "        raised_expected_error = True\n",
    "        print(error)\n",
    "\n",
    "    assert (\n",
    "        raised_expected_error\n",
    "    ), 'Expected request to raise an exception but it did not.'\n",
    "    utility.print_success('All variables raised expected error')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0caf714",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Tags",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
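
The first code cell of the notebook above is tagged `parameters`, which is what allows the regression suite (for example, `run_notebooks.sh` with `HARMONY_HOST_URL`) to inject a different `harmony_host_url` when the notebook is executed non-interactively. As a rough illustration only (not part of this commit), an equivalent call through papermill's Python API, with an assumed output filename, could look like:

import papermill as pm  # papermill is pinned in test/net2cog/environment.yaml

# Execute the regression notebook, overriding the `parameters`-tagged cell.
pm.execute_notebook(
    'net2cog_Regression.ipynb',
    'net2cog_Regression_output.ipynb',  # assumed output filename
    parameters={'harmony_host_url': 'https://harmony.uat.earthdata.nasa.gov'},
)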

Binary file (1.98 MB) not shown.

test/net2cog/utility.py

Lines changed: 47 additions & 0 deletions (new file)

"""Simple utility functions used in the net2cog test notebook."""

from pathlib import Path
import rasterio
import subprocess
from numpy.testing import assert_array_almost_equal


def print_error(error_string: str) -> str:
    """Print an error, with formatting for red text."""
    print(f'\033[91m{error_string}\033[0m')


def print_success(success_string: str) -> str:
    """Print a success message, with formatting for green text."""
    print(f'\033[92mSuccess: {success_string}\033[0m')


def assert_dataset_produced_correct_results(
    generated_file: Path, expected_metadata: dict, reference_file: Path
) -> None:
    """Check that the generated data matches the expected data."""
    with rasterio.open(generated_file) as test_dataset:
        assert (
            test_dataset.meta == expected_metadata
        ), f'output has incorrect metadata: {test_dataset.meta}'
        print_success('Generated image has correct metadata.')

        with rasterio.open(reference_file) as reference_dataset:
            ref_image = reference_dataset.read()
            test_image = test_dataset.read()
            assert_array_almost_equal(ref_image, test_image)

    print_success('Generated image contains correct data.')


def validate_cog(path: Path) -> None:
    cogtif_val = ['rio', 'cogeo', 'validate', f'{path}']

    process = subprocess.run(
        cogtif_val, check=True, stdout=subprocess.PIPE, universal_newlines=True
    )
    cog_test = process.stdout
    cog_test = cog_test.replace("\n", "")

    valid_cog = f"{path} is a valid cloud optimized GeoTIFF"
    assert cog_test == valid_cog
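
`validate_cog` above shells out to the `rio cogeo validate` CLI and compares its stdout against an expected string. For reference only (not part of this commit), rio-cogeo, which is pinned in environment.yaml, also exposes the same check as a Python API; a minimal sketch of an equivalent assertion, assuming rio-cogeo 5.x, could be:

from pathlib import Path

from rio_cogeo.cogeo import cog_validate


def validate_cog_via_api(path: Path) -> None:
    # cog_validate returns (is_valid, errors, warnings) for the given file.
    is_valid, errors, _warnings = cog_validate(str(path))
    assert is_valid, f'{path} is not a valid Cloud Optimized GeoTIFF: {errors}'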

test/net2cog/version.txt

Lines changed: 1 addition & 0 deletions (new file)

0.1.0
