+ "markdown": "---\ntitle: \"How do I find data with NASA's CMR-STAC API in R\"\n---\n\n\n\n**This tutorial demonstrates how to interact with CMR-STAC in R.**\n\nSearching NASA Earthdata is powered by the NASA [Common Metadata Repository (CMR)](https://cmr.earthdata.nasa.gov/search), which is a metadata system that catalogs NASA Earth Science data. The CMR allows users to search and discover data collections through various means, including [Earthdata Search](https://search.earthdata.nasa.gov/search), an [Application Programming Interface (API)](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html), and a [SpatioTemporal Asset Catalog (STAC)](https://www.earthdata.nasa.gov/about/esdis/esco/standards-practices/stac).\n\nThis tutorial will teach you how to navigate and explore NASA's [SpatioTemporal Asset Catalog (STAC)](https://www.earthdata.nasa.gov/about/esdis/esco/standards-practices/stac) using R, to find and learn about the datasets available through NASA's different cloud archives. We'll demonstrate by using it to search for [ASTER Global DEM data](https://www.earthdata.nasa.gov/data/catalog/lpcloud-astgtm-003) available in the LP DAAC's Cumulus cloud archive.\n\n------------------------------------------------------------------------\n\n### Topics Covered in this Tutorial\n\n1. **Introduction to STAC and the CMR-STAC API**\\\n 1a. What is STAC?\\\n 1b. Why STAC?\\\n 1c. What is the CMR-STAC API?\\\n2. **Get started & searching with CMR-STAC**\\\n 2a. CMR-STAC API\\\n 2b. STAC Collection\\\n 2c. STAC Item\\\n 2d. Assets\n3. 
**Visualize a STAC Item**\n\n------------------------------------------------------------------------\n\n### Required packages\n\n- **Required packages:**\n\n - `dplyr`: manipulate data frames of stac search results\n - `DT`: make returned information more readable\n - `rstac`: interact with STAC catalogs\n - `devtools`: install development version of packages\n - `terra`: open DEM data once it has been discovered via STAC search\n\nRun the cells below to install (or update) the necessary packages, and then load all of the required packages.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(c('dplyr', 'DT', 'rstac', 'devtools', 'terra'))\n```\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(dplyr)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\n\nAttaching package: 'dplyr'\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nThe following objects are masked from 'package:stats':\n\n filter, lag\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nThe following objects are masked from 'package:base':\n\n intersect, setdiff, setequal, union\n```\n\n\n:::\n\n```{.r .cell-code}\nlibrary(DT)\nlibrary(rstac)\nlibrary(terra)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nterra 1.8.60\n```\n\n\n:::\n:::\n\n\nFor listing the various STAC catalogs associated with NASA CMR, we'll install the development version of the `earthdatalogin` package in the following cell. The development version contains functions that will be utilized in section 2a.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndevtools::install_github(\"boettiger-lab/earthdatalogin\")\nlibrary(\"earthdatalogin\")\n```\n:::\n\n\n------------------------------------------------------------------------\n\n# 1. Introduction to STAC and the CMR-STAC API\n\n## 1a. 
What is STAC?\n\nSTAC is short for [Spatiotemporal Asset Catalog](http://stacspec.org/), a series of specifications that standardize indexing and discovery of `spatiotemporal assets` (files containing information about the Earth across space and time).\n\nThere are four specifications that work both independently and together:\n\n1) [STAC Catalog](https://github.com/radiantearth/stac-spec/blob/master/catalog-spec/catalog-spec.md): a simple, flexible JSON file of links that provides a structure to organize and browse STAC Items.\n2) [STAC Collection](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md): an extension of the STAC Catalog with additional information, such as the extents, license, keywords, and providers, that describes the STAC Items that fall within the Collection.\n3) [STAC Item](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md): the core atomic unit, representing a single spatiotemporal asset as a GeoJSON feature plus datetime and links.\n4) [STAC API](https://github.com/radiantearth/stac-api-spec): a RESTful endpoint that enables search of STAC Items, specified in OpenAPI, following OGC's WFS 3. \n\nSource: [stacspec.org](http://stacspec.org/)\n\n## 1b. Why STAC?\n\nSTAC is commonly used in cloud environments to catalog and index large datasets, making them more accessible for analysis and visualization. Many other organizations, such as the [US Geological Survey (USGS)](https://www.usgs.gov/landsat-missions/spatiotemporal-asset-catalog-stac), [Microsoft](https://planetarycomputer.microsoft.com/), and [the European Space Agency](https://browser.apex.esa.int/), have adopted STAC as a standard for organizing and sharing geospatial data.\n\n------------------------------------------------------------------------\n\n## 1c. 
What is the CMR-STAC API?\n\nThe Common Metadata Repository (CMR) is a metadata system that catalogs Earth Science data and associated metadata records. NASA's CMR-STAC Application Programming Interface (API) is a translation API for STAC users who want to access and search through CMR's vast metadata holdings using STAC keywords.\n\n------------------------------------------------------------------------\n\n# 2. Get started with CMR-STAC\n\n## 2a. CMR-STAC API\n\nThe CMR-STAC API contains endpoints that enable the querying of STAC items.\n\nHere, we will use the function `list_nasa_stacs()`, which connects to the CMR-STAC landing page (https://cmr.earthdata.nasa.gov/cloudstac/) for cloud datasets. The landing page lists all the available cloud data providers and their STAC endpoints. \n\n\n::: {.cell}\n\n```{.r .cell-code}\ncmr_cat_links <- earthdatalogin::list_nasa_stacs()\ncmr_cat_links\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n title href\n5 SCIOPS https://cmr.earthdata.nasa.gov/cloudstac/SCIOPS\n6 ASF https://cmr.earthdata.nasa.gov/cloudstac/ASF\n7 GHRC_DAAC https://cmr.earthdata.nasa.gov/cloudstac/GHRC_DAAC\n8 NSIDC_CPRD https://cmr.earthdata.nasa.gov/cloudstac/NSIDC_CPRD\n9 CSDA https://cmr.earthdata.nasa.gov/cloudstac/CSDA\n10 LAADS https://cmr.earthdata.nasa.gov/cloudstac/LAADS\n11 GES_DISC https://cmr.earthdata.nasa.gov/cloudstac/GES_DISC\n12 OB_CLOUD https://cmr.earthdata.nasa.gov/cloudstac/OB_CLOUD\n13 LARC_CLOUD https://cmr.earthdata.nasa.gov/cloudstac/LARC_CLOUD\n14 LPCLOUD https://cmr.earthdata.nasa.gov/cloudstac/LPCLOUD\n15 POCLOUD https://cmr.earthdata.nasa.gov/cloudstac/POCLOUD\n16 ORNL_CLOUD https://cmr.earthdata.nasa.gov/cloudstac/ORNL_CLOUD\n17 ALL https://cmr.earthdata.nasa.gov/cloudstac/ALL\n```\n\n\n:::\n:::\n\n\nThe data frame above shows all the data providers with their associated STAC catalog endpoints. 
Notice that the CMR-STAC API exposes many endpoints, not only for NASA's LP DAAC but also for the other NASA ESDIS DAACs. Use the `title` field to identify the data provider you are interested in. The data product used in this tutorial is hosted in the LP DAAC Cumulus Cloud space (LPCLOUD).\n\nLet's get the associated endpoint for LPCLOUD.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nprovider <- 'LPCLOUD'\nlpcloud_cat_link <- earthdatalogin::get_nasa_stac_url(provider)\nlpcloud_cat_link\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"https://cmr.earthdata.nasa.gov/cloudstac/LPCLOUD\"\n```\n\n\n:::\n:::\n\n\n------------------------------------------------------------------------\n\n## 2b. STAC Collection\n\nA STAC Collection is an extension of the STAC Catalog containing additional information that describes the STAC Items in that Collection.\n\nLet's begin using the `rstac` package. `rstac` is a library that allows users to interact with STAC Catalogs and their associated data. Using the LPCLOUD link, we'll use `rstac` to query the LPCLOUD catalog and retrieve the content describing associated collections. Important information, such as the collection ID and title, is provided here.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlpcloud_collections <- stac(lpcloud_cat_link) |>\n collections() |>\n get_request()\nlpcloud_collections\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n###Collections\n- collections (20 item(s)):\n - HLSS30_2.0\n - HLSL30_2.0\n - SRTMGL1_003\n - MCD43A3_061\n - MYD09GQ_061\n - MCD43A2_061\n - MCD43A4_061\n - MOD11A1_061\n - MYD11A1_061\n - MOD09GQ_061\n - ... 
with 10 more collection(s).\n- field(s): description, links, collections\n```\n\n\n:::\n\n```{.r .cell-code}\ncollection_info <- lapply(lpcloud_collections$collections, function(x) {\n data.frame(id = x$id, title = x$title)\n}) |>\n dplyr::bind_rows()\ncollection_info\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n id\n1 HLSS30_2.0\n2 HLSL30_2.0\n3 SRTMGL1_003\n4 MCD43A3_061\n5 MYD09GQ_061\n6 MCD43A2_061\n7 MCD43A4_061\n8 MOD11A1_061\n9 MYD11A1_061\n10 MOD09GQ_061\n11 MOD11_L2_061\n12 MOD14_061\n13 MOD11B1_061\n14 MYD14_061\n15 MCD19A2_061\n16 MCD19A1_061\n17 MCD19A3D_061\n18 MCD43A1_061\n19 GEDI02_A_002\n20 MOD21A1D_061\n title\n1 HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0\n2 HLS Landsat Operational Land Imager Surface Reflectance and TOA Brightness Daily Global 30m v2.0\n3 NASA Shuttle Radar Topography Mission Global 1 arc second V003\n4 MODIS/Terra+Aqua BRDF/Albedo Albedo Daily L3 Global - 500m V061\n5 MODIS/Aqua Surface Reflectance Daily L2G Global 250m SIN Grid V061\n6 MODIS/Terra+Aqua BRDF/Albedo Quality Daily L3 Global - 500m V061\n7 MODIS/Terra+Aqua BRDF/Albedo Nadir BRDF-Adjusted Ref Daily L3 Global - 500m V061\n8 MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V061\n9 MODIS/Aqua Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V061\n10 MODIS/Terra Surface Reflectance Daily L2G Global 250m SIN Grid V061\n11 MODIS/Terra Land Surface Temperature/Emissivity 5-Min L2 Swath 1km V061\n12 MODIS/Terra Thermal Anomalies/Fire 5-Min L2 Swath 1km V061\n13 MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 6km SIN Grid V061\n14 MODIS/Aqua Thermal Anomalies/Fire 5-Min L2 Swath 1km V061\n15 MODIS/Terra+Aqua Land Aerosol Optical Depth Daily L2G Global 1km SIN Grid V061\n16 MODIS/Terra+Aqua Land Surface BRF Daily L2G Global 500m and 1km SIN Grid V061\n17 MODIS/Terra+Aqua BRDF Model Parameters Daily L3 Global 1km SIN Grid V061\n18 MODIS/Terra+Aqua BRDF/Albedo Model 
Parameters Daily L3 Global - 500m V061\n19 GEDI L2A Elevation and Height Metrics Data Global Footprint Level V002\n20 MODIS/Terra Land Surface Temperature/3-Band Emissivity Daily L3 Global 1km SIN Grid Day V061\n```\n\n\n:::\n:::\n\n\nIn CMR, the Collection ID is used to query for a specific product, so be sure to save the ID for a collection you are interested in. For instance, the Collection ID for ASTER Global Digital Elevation Model V003 is `ASTGTM_003`. Note that the \"id\" shortname is in the format: productshortname_VVV (where VVV = product version).\n\nHere, we use the short name `ASTGTM_003` to query the STAC Collection. If you are interested in querying a different LPCLOUD product, assign its short name to the `collection` variable below.\n\nUsers can also define other parameters such as a temporal and spatial extent. Notice the `limit` parameter in the `stac_search()` call. This parameter allows us to adjust the number of records returned during a request (default = 10).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# define search parameters\ncollection <- 'ASTGTM_003'\ndatetime <- '2000-01-01T00:00:00Z/2001-01-31T23:59:59Z' #YYYY-MM-DDTHH:MM:SSZ/YYYY-MM-DDTHH:MM:SSZ\nbbox <- c(\n -122.0622682571411,\n 39.897234301806,\n -122.04918980598451,\n 39.91309383703065\n) # LL and UR Coordinates\n\n# search\ncollection_search <- stac(\n lpcloud_cat_link\n) |>\n stac_search(\n limit = 100,\n collections = collection,\n bbox = bbox,\n datetime = datetime\n ) |>\n get_request()\ncollection_search\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n###Items\n- matched feature(s): 1\n- features (1 item(s) / 0 not fetched):\n - ASTGTMV003_N39W123\n- assets: \n003/ASTGTMV003_N39W123_dem, 003/ASTGTMV003_N39W123_num, browse, metadata, s3_003/ASTGTMV003_N39W123_dem, s3_003/ASTGTMV003_N39W123_num, thumbnail_0, thumbnail_1\n- item's fields: \nassets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type\n```\n\n\n:::\n:::\n\n\nWe can see 
that the output includes the number of items that fall within the search criteria, the assets, and fields.\n\n------------------------------------------------------------------------\n\n## 2c. STAC Item\n\nSTAC Items represent data and metadata assets that are spatiotemporally coincident. Using our STAC Query, let's get the first item from our STAC collection search in the above cell.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfirst_item <- collection_search$features[[1]]\nfirst_item\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n###Item\n- id: ASTGTMV003_N39W123\n- collection: ASTGTM_003\n- bbox: xmin: -123.00014, ymin: 38.99986, xmax: -121.99986, ymax: 40.00014\n- datetime: 2000-03-01T00:00:00.000Z\n- assets: \nbrowse, thumbnail_0, thumbnail_1, 003/ASTGTMV003_N39W123_dem, 003/ASTGTMV003_N39W123_num, s3_003/ASTGTMV003_N39W123_dem, s3_003/ASTGTMV003_N39W123_num, metadata\n- item's fields: \nassets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type\n```\n\n\n:::\n:::\n\n\n------------------------------------------------------------------------\n\n## 2d. Assets\n\nThe STAC Item ID (CMR Granule ID) is the unique identifier assigned to each granule within a data collection. Within each STAC Item are assets, which include the downloadable and streamable URL to data files along with other asset objects. 
Below, we use the first item (granule) to list its associated files, which are found in each asset's `href` field.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nitem_assets <- first_item$assets\n\nfor (asset_name in names(item_assets)) {\n cat(\"Asset:\", asset_name, \"\\n\")\n cat(\"URL:\", item_assets[[asset_name]]$href, \"\\n\\n\")\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nAsset: browse \nURL: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/ASTGTM.003/ASTGTMV003_N39W123.1.jpg \n\nAsset: thumbnail_0 \nURL: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/ASTGTM.003/ASTGTMV003_N39W123.1.jpg \n\nAsset: thumbnail_1 \nURL: s3://lp-prod-public/ASTGTM.003/ASTGTMV003_N39W123.1.jpg \n\nAsset: 003/ASTGTMV003_N39W123_dem \nURL: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ASTGTM.003/ASTGTMV003_N39W123_dem.tif \n\nAsset: 003/ASTGTMV003_N39W123_num \nURL: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ASTGTM.003/ASTGTMV003_N39W123_num.tif \n\nAsset: s3_003/ASTGTMV003_N39W123_dem \nURL: s3://lp-prod-protected/ASTGTM.003/ASTGTMV003_N39W123_dem.tif \n\nAsset: s3_003/ASTGTMV003_N39W123_num \nURL: s3://lp-prod-protected/ASTGTM.003/ASTGTMV003_N39W123_num.tif \n\nAsset: metadata \nURL: https://cmr.earthdata.nasa.gov/search/concepts/G1726750594-LPCLOUD.xml \n```\n\n\n:::\n:::\n\n\n------------------------------------------------------------------------\n\n# 3. Visualize a STAC Item\n\nNow that we have successfully navigated the CMR-STAC catalog and found the asset URLs, we can open and visualize the ASTER DEM data using the `terra` package. The `vsi = TRUE` argument allows us to read the data directly from the cloud (i.e., stream it) without downloading it locally.\nWe first need to authenticate with our Earthdata Login credentials using the [`edl_netrc()`](https://boettiger-lab.github.io/earthdatalogin/reference/edl_netrc.html) function. 
If you do not have an Earthdata Login, you can create one [here](https://urs.earthdata.nasa.gov/users/new).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Authenticate with Earthdata Login\nearthdatalogin::edl_netrc()\n\ndem_url <- item_assets$`003/ASTGTMV003_N39W123_dem`$href\ndem <- rast(dem_url, vsi = TRUE)\n\nplot(dem)\n```\n\n::: {.cell-output-display}\n{width=672}\n:::\n:::\n\n\n------------------------------------------------------------------------\n\n### Summary\nIn this tutorial, you learned how to navigate and explore the CMR-STAC Catalog using the CMR-STAC Search endpoint, which allows users to quickly search for STAC Items that meet their specific spatial, temporal, and data product requirements.\n\nAdditional Resources:\n\n- [Getting Started with Cloud-Native Harmonized Landsat Sentinel-2 (HLS) Data in R](https://github.com/nasa/HLS-Data-Resources/blob/main/r/HLS_Tutorial.Rmd) \n- [Accessing the MAAP STAC with R](https://docs.maap-project.org/en/latest/technical_tutorials/working_with_r/maap_stac_r.html)\n\n------------------------------------------------------------------------\n\n### Contact Information\n\nThis tutorial was updated on September 8, 2025 by the following NASA MAAP Team members: Sheyenne Kirkland^1^, Alex Mandel^2^, and Henry Rodman^2^\\\n\n^1^ The University of Alabama-Huntsville (UAH)\\\n^2^ Development Seed\\\n\nThe original material was written by Mahsa Jami and Aaron Friesz at LP DAAC\\\n",