https://explorer.dea.ga.gov.au/products/ga_ls9c_ard_3/datasets/abe45d2b-a54f-468d-98ea-2ffb30656260
Perhaps a wrapper could be written for NCI that handles these /g/data files and translates them to the corresponding THREDDS location? I'm not sure the paths always map consistently 1:1 between /g/data and THREDDS, though. 🤷
| Location | URL |
| --- | --- |
| NCI location | file:///g/data/xu18/ga/ga_ls9c_ard_3/105/068/2022/10/06/ga_ls9c_nbart_3-2-1_105068_2022-10-06_final_thumbnail.jpg |
| THREDDS location | https://dapds00.nci.org.au/thredds/fileServer/xu18/ga_ls9c_ard_3/105/068/2022/10/06/ga_ls9c_nbart_3-2-1_105068_2022-10-06_final_thumbnail.jpg |
| Catalog | https://dapds00.nci.org.au/thredds/catalog/xu18/ga_ls9c_ard_3/105/068/2022/10/06/catalog.html |
This seems at least possible in principle, because an equivalent mapping is already done for AWS: s3://dea-public-data/... -> https://dea-public-data.s3.ap-southeast-2.amazonaws.com/...
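A minimal sketch of the wrapper idea, assuming a hand-maintained prefix table rather than a strict 1:1 path copy. In the example locations above, the `ga/` directory under `/g/data/xu18/` does not appear in the THREDDS path, so a plain scheme swap would not be enough. The function name `nci_file_to_thredds` and the `NCI_THREDDS_PREFIXES` table are hypothetical and not part of Explorer; the single prefix entry is inferred from that one example only.

```python
from typing import Mapping, Optional
from urllib.parse import urlparse

# Hypothetical prefix table (an assumption, not existing Explorer config).
# Inferred from the single example above, where "ga/" is dropped on THREDDS.
NCI_THREDDS_PREFIXES = {
    "/g/data/xu18/ga/": "https://dapds00.nci.org.au/thredds/fileServer/xu18/",
}


def nci_file_to_thredds(
    url: str, prefixes: Mapping[str, str] = NCI_THREDDS_PREFIXES
) -> Optional[str]:
    """Return a THREDDS URL for a file:// URL under a known prefix, else None."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return None
    # Try the longest prefixes first so that more specific entries win.
    for local_prefix in sorted(prefixes, key=len, reverse=True):
        if parsed.path.startswith(local_prefix):
            return prefixes[local_prefix] + parsed.path[len(local_prefix):]
    # Unknown location: let the caller fall back to the original URL.
    return None
```

Returning `None` for unknown paths keeps the current behaviour available as a fallback, which matters if the /g/data to THREDDS mapping turns out not to be consistent across projects.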
Perhaps, while we're at it, we could replace the location link with the THREDDS one too.
Looks like it's handled here in the code:
datacube-explorer/cubedash/_utils.py
Lines 148 to 182 in e09884a
def as_external_url(
    url: str, s3_region: str = None, is_base: bool = False
) -> Optional[str]:
    """
    Convert a URL to an externally-visible one.

    >>> import pytest; pytest.skip()  # doctests aren't working outside flask context :(
    >>> # Converts s3 to http
    >>> as_external_url('s3://some-data/L2/S2A_OPER_MSI_ARD__A030100_T56LNQ_N02.09/ARD-METADATA.yaml', "ap-southeast-2")
    'https://some-data.s3.ap-southeast-2.amazonaws.com/L2/S2A_OPER_MSI_ARD__A030100_T56LNQ_N02.09/ARD-METADATA.yaml'
    >>> # Other URLs are left as-is
    >>> unconvertable_url = 'file:///g/data/xu18/ga_ls8c_ard_3-1-0_095073_2019-03-22_final.odc-metadata.yaml'
    >>> unconvertable_url == as_external_url(unconvertable_url)
    True
    >>> as_external_url('some/relative/path.txt')
    'some/relative/path.txt'
    >>> # if base uri was none, we may want to return the s3 location instead of the metadata yaml
    """
    parsed = urlparse(url)
    if s3_region and parsed.scheme == "s3":
        # get buckets for which link should be to data location instead of s3 link
        data_location = flask.current_app.config.get("SHOW_DATA_LOCATION", {})
        if parsed.netloc in data_location:
            # remove the first '/'
            path = parsed.path[1:]
            if is_base:
                # if it's the folder url, get the directory path
                path = path[: path.rindex("/") + 1]
                path = f"?prefix={path}"
            return f"https://{data_location.get(parsed.netloc)}/{path}"
        return f"https://{parsed.netloc}.s3.{s3_region}.amazonaws.com{parsed.path}"
    return url
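For S3, the `is_base` branch above links to a folder listing instead of a single file. A rough THREDDS analogue, sketched here under the assumption that the catalog page always lives at the same relative path as the `fileServer` file (as it does in the one example in this issue), would swap `/thredds/fileServer/` for `/thredds/catalog/` and point at `catalog.html`. The helper name `thredds_catalog_url` is hypothetical and not part of Explorer.

```python
import posixpath
from urllib.parse import urlparse, urlunparse

FILE_SERVER_PREFIX = "/thredds/fileServer/"


def thredds_catalog_url(file_server_url: str) -> str:
    """Derive the folder's catalog.html URL from a THREDDS fileServer URL."""
    parsed = urlparse(file_server_url)
    if not parsed.path.startswith(FILE_SERVER_PREFIX):
        raise ValueError(f"Not a THREDDS fileServer URL: {file_server_url!r}")
    # Path of the file relative to the fileServer root.
    relative = parsed.path[len(FILE_SERVER_PREFIX):]
    # Drop the file name to get the containing folder.
    folder = posixpath.dirname(relative)
    catalog_path = posixpath.join("/thredds/catalog", folder, "catalog.html")
    # Keep scheme and host, replace only the path.
    return urlunparse(parsed._replace(path=catalog_path))
```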