Issue #761 better diff for apex reference check #765
Conversation
openeo/testing/results.py
Outdated
:param atol: absolute tolerance
:raises AssertionError: if not equal within the given tolerance

.. versionadded:: 0.31.0
next version will probably be 0.41.0
openeo/testing/results.py
Outdated
data_max = diff_data.max().item()
if data_max == 0:
    data_max = 1
grayscale_characters = "$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\|()1{}[]?-_+~<>i!lI;:,\"^`'. "
just curious: where does this "gradient" come from?
also note that with the (github.com) font I'm currently looking at, it doesn't seem to be monotonically increasing in "lightness": e.g. "+" comes after "_" but it looks "darker"
I also noticed this warning when running tests:
.../openeo-python-client/openeo/testing/results.py:99: SyntaxWarning: invalid escape sequence '\|'
grayscale_characters = "$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\|()1{}[]?-_+~<>i!lI;:,\"^`'. "
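A minimal sketch of silencing that warning without changing the string's content, assuming the `\|` was meant to produce a literal backslash followed by a pipe:

```python
# Sketch of a fix for the SyntaxWarning: "\|" is not a recognized escape
# sequence, so newer Python versions warn (while still keeping the
# backslash literally). Doubling the backslash ("\\|") yields the exact
# same string, warning-free.
grayscale_characters = "$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\\|()1{}[]?-_+~<>i!lI;:,\"^`'. "

# 70 characters, matching the `int(v * 69 / data_max)` index range of 0..69.
assert len(grayscale_characters) == 70
```

A raw string (`r"..."`) would not be equivalent here, because it would also turn the `\"` escape into two characters.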
openeo/testing/results.py
Outdated
def pixelChar(v) -> str:
    if np.isnan(v):
        return " "
    i = int(v * 69 / data_max)
this probably only works for positive values.
note that it's not uncommon to have EO data that is mixed negative-positive (e.g. NDVI) or strictly negative (e.g. decibel-scaled radar data)
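A hedged sketch of how the index mapping could be made sign-robust (this is not the PR's implementation): normalize against both the minimum and the maximum, so NDVI-style mixed-sign data and strictly negative dB values both land in the valid character range.

```python
import numpy as np

def value_to_index(v: float, data_min: float, data_max: float, levels: int = 70) -> int:
    """Map a value to a character index 0..levels-1, robust to negative data."""
    if np.isnan(v):
        return levels - 1  # render NaN with the lightest character
    span = data_max - data_min
    if span == 0:
        return 0
    i = int((v - data_min) / span * (levels - 1))
    # clamp against floating-point edge cases
    return min(max(i, 0), levels - 1)
```

For an absolute diff (as used in this PR) the minimum is 0 and this reduces to the original formula.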
also: if you are going to use this for difference data, you are probably working with mixed negative-positive data anyway
it's an absolute diff
- Compare actual and expected data with `xarray.testing.assert_allclose` and specified tolerances.

:return: list of issues (empty if no issues)
"""
can you document what the difference is with the existing _compare_xarray_dataarray?
or isn't it possible to integrate this feature in the existing _compare_xarray_dataarray instead of duplicating most of it?
tests/testing/test_results.py
Outdated
],
),
(
    xarray.DataArray([[1], [2], [3]]),
    [
        "Dimension mismatch: ('dim_0', 'dim_1') != ('dim_0',)",
        "Shape mismatch: (3, 1) != (3,)",
        dirty_equals.IsStr(regex="Left and right DataArray objects are not close.*", regex_flags=re.DOTALL),
I don't completely understand at the moment why these issues are not triggered anymore with the new implementation
tests/testing/test_results.py
Outdated
dirty_equals.IsStr(
    regex=r"Left and right DataArray objects are not close.*Differing dimensions:.*\(y: 2, x: 3\) != \(x: 2, y: 3\)",
    regex_flags=re.DOTALL,
),
This looks like a regression: there is interesting feedback being removed here
tests/testing/test_results.py
Outdated
],
),
(
    xarray.DataArray([[1, 2, 3], [4, 5, 6]], dims=["x", "z"]),
    [
        "Dimension mismatch: ('x', 'z') != ('x', 'y')",
        dirty_equals.IsStr(
            regex=r"Left and right DataArray objects are not close.*Differing dimensions:.*\(x: 2, z: 3\) != \(x: 2, y: 3\)",
same
    r"t 2: differing pixels: 4/20 \(20.0%\), spread over 8.3% of the area"
):
    assert_job_results_allclose(actual=actual_dir, expected=expected_dir, tmp_path=tmp_path)
Can you also add a simple test for ascii_art (e.g. with a simple 10 by 5 use case or something like that)?
It's currently a public function, so people/projects might start depending on it.
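A self-contained sketch of the kind of smoke test suggested above. `toy_ascii_art` is a hypothetical stand-in for the real `ascii_art` (which renders with a 70-character ramp and works on a DataArray); only structural properties are asserted, since the exact characters are an implementation detail.

```python
import numpy as np

def toy_ascii_art(diff_data: np.ndarray) -> str:
    """Hypothetical stand-in: render a 2D array with a tiny light-to-dark ramp."""
    chars = " .:*#"
    vmax = diff_data.max() or 1.0
    return "\n".join(
        "".join(chars[int(v / vmax * (len(chars) - 1))] for v in row)
        for row in diff_data
    )

def test_ascii_art_structure():
    data = np.zeros((5, 10))  # the simple 10 by 5 use case suggested above
    data[2, 3] = 100.0
    lines = toy_ascii_art(data).split("\n")
    assert len(lines) == 5                         # one text row per y value
    assert all(len(line) == 10 for line in lines)  # one character per x value
    assert lines[2][3] != " "                      # the differing pixel stands out
```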
openeo/testing/results.py
Outdated
art = ascii_art(diff_data)
print(f"Difference ascii art for {key}")
print(art)
I also have my doubts about using a raw print here, because that might not play well with pytest-based reporting (e.g. we also try to produce an HTML report in APEx).
I guess it's a matter of experimenting on what currently happens here, or what alternatives are possible.
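One hedged alternative sketch (names here are assumptions, not the PR's API): append the rendering to the list of issues that ends up in the assertion message, so the ascii art travels with the failure itself instead of relying on stdout capture.

```python
def report_difference(key: str, art: str, issues: list) -> None:
    # Hypothetical helper: `issues` is assumed to be the list of
    # human-readable problems that gets joined into the AssertionError
    # message. Appending the art there keeps it attached to the failure,
    # which pytest reporting plugins then pick up automatically.
    issues.append(f"Difference ascii art for {key!r}:\n{art}")
```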
I started experimenting with this, and got in trouble with this new scipy dependency
openeo/testing/results.py
Outdated
import xarray
import xarray.testing
from scipy.spatial import ConvexHull
This makes scipy a runtime dependency (now it's just a test dependency). Scipy is quite a heavy dependency (in terms of compiled extensions and additional transitive dependencies), which we generally want to avoid.
If we keep it as an optional dependency, this should be documented, e.g. here:
openeo-python-client/docs/installation.rst
Lines 82 to 95 in c6bc1c9

Optional dependencies
=====================

Depending on your use case, you might also want to install some additional libraries.
For example:

- ``netCDF4`` or ``h5netcdf`` for loading and writing NetCDF files (e.g. integrated in ``xarray.load_dataset()``)
- ``matplotlib`` for visualisation (e.g. integrated plot functionality in ``xarray``)
- ``pyarrow`` for (read/write) support of Parquet files
  (e.g. with :py:class:`~openeo.extra.job_management.MultiBackendJobManager`)
- ``rioxarray`` for GeoTIFF support in the assert helpers from ``openeo.testing.results``
- ``geopandas`` for working with dataframes with geospatial support
  (e.g. with :py:class:`~openeo.extra.job_management.MultiBackendJobManager`)
- ``pystac_client`` for creating a STAC API Job Database (e.g. with :py:class:`~openeo.extra.job_management.stac_job_db.STACAPIJobDatabase`)
Or, is it actually necessary to have scipy as hard dependency here? Is that convex hull feature vital for the reporting quality? I think there could be a simpler fallback fraction-of-area calculation if ConvexHull is not available
for example: you use the convex hull to estimate the impacted area of coordinates with differences, but a much simpler/coarser approximation is just taking the min/max values of these coordinates to get the bounding box. Which does not require scipy
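A sketch of that scipy-free approximation: the fraction of the raster covered by the bounding box of the differing pixels. Coarser than a convex hull, but dependency-free (function name and mask-based interface are assumptions for illustration).

```python
import numpy as np

def bbox_area_fraction(diff_mask: np.ndarray) -> float:
    """Fraction of the total area covered by the bounding box of True pixels."""
    ys, xs = np.nonzero(diff_mask)
    if ys.size == 0:
        return 0.0
    # min/max of the differing coordinates give the bounding box extent
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    return float(height * width) / diff_mask.size
```

By construction the bounding box contains the convex hull, so this only ever overestimates the impacted area.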
openeo/testing/results.py
Outdated
@@ -88,6 +91,129 @@ def _as_xarray_dataarray(data: Union[str, Path, xarray.DataArray]) -> xarray.Dat
    return data


def ascii_art(diff_data: DataArray) -> str:
    scale: int = max(1, int(diff_data.sizes["x"] / 100))
as noted in #761 (comment) I think a size of 100 is too much.
I would limit it to e.g. 40x40 or something like that.
And while at it: just make this max size/width a function argument (with a reasonable default).
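A sketch of the suggested signature change (the `max_size` parameter name and the helper are assumptions): derive the integer downscale factor from a configurable limit instead of the hard-coded 100, defaulting to something like 40, and consider both dimensions.

```python
def compute_scale(width: int, height: int, max_size: int = 40) -> int:
    """Integer downscale factor so that both dimensions fit within max_size."""
    return max(1, width // max_size, height // max_size)
```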
… with bbox, _compare_xarray_dataarray_xy now only adds to the original xarray implementation)