import numpy as np
import earthaccess
from shapely.geometry import Polygon, polygon
from shapely.ops import unary_union
import folium
import copy
from branca.element import Figure
Finding Coincident NASA Airborne and Orbital Data
Summary
Novel remote sensing research often requires data from multiple instruments. This Jupyter notebook provides an example of finding coincident data from NASA Earth Science missions. Users will learn how to submit multiple queries to NASA’s Common Metadata Repository (CMR) using the earthaccess Python library to accomplish this goal. These queries can search the NASA Earthdata Archive of over 120 petabytes using keywords, spatial constraints (point, bounding box, or polygon), and temporal constraints. Starting from a basic search for AVIRIS-3, we take the metadata returned by the query and use it to define spatial and temporal constraints for a second and third search to find orbital data from the ECOSTRESS and EMIT collections.
Background
The AVIRIS-3 instrument is an airborne imaging spectrometer that measures light in visible and infrared wavelengths. These measurements display unique spectral signatures that correspond to the chemical composition of the Earth’s surface and of the atmospheric column above it. Applications for these data range from surface mineral exploration to agriculture. Recently, the instrument was flown over wildfires in Alabama and provided real-time support to firefighters. More specifics about the AVIRIS-3 mission can be found on the AVIRIS-3 website and AVIRIS-3 dataset landing pages.
The ECOSTRESS instrument is a multispectral thermal imaging radiometer designed to answer three overarching science questions:
- How is the terrestrial biosphere responding to changes in water availability?
- How do changes in diurnal vegetation water stress impact the global carbon cycle?
- Can agricultural vulnerability be reduced through advanced monitoring of agricultural water consumptive use and improved drought estimation?
The ECOSTRESS mission is answering these questions by accurately measuring the temperature of plants. Plants regulate their temperature by releasing water through tiny pores on their leaves called stomata. If they have sufficient water they can maintain their temperature, but if water is insufficient their temperature rises, and ECOSTRESS can measure this rise. The images acquired by ECOSTRESS are the most detailed temperature images of the surface ever acquired from space and can be used to measure the temperature of an individual farmer’s field.
More details about ECOSTRESS and its associated products can be found on the ECOSTRESS website and ECOSTRESS product pages hosted by the Land Processes Distributed Active Archive Center (LP DAAC).
The EMIT instrument is also an imaging spectrometer, located on the International Space Station. The EMIT mission focuses specifically on mapping the composition of minerals to better understand the effects of mineral dust on the Earth system and on human populations, now and in the future. In addition, the EMIT instrument can be used in other applications, such as mapping of greenhouse gases, snow properties, and water resources.
More details about EMIT and its associated products can be found on the EMIT website and EMIT product pages hosted by the LP DAAC.
Requirements
- NASA Earthdata Account
- No Python setup requirements if connected to the workshop cloud instance!
- Local Only - Set up Python Environment - See setup_instructions.md in the /setup/ folder to set up a local compatible Python environment
Tutorial Outline
- Searching for Data
- Wrangling UMM Metadata
- Visualizing Results
- Second Search (ECOSTRESS)
- Third Search (EMIT)
- Selecting Assets
- Downloading and Streaming Data
1. Searching for Data
To search for data, we’ll use the earthaccess Python library. earthaccess reduces the amount of code required to search for data and handles authentication. You do not need to authenticate to search, but you do need to authenticate to download or stream data. We’ll go ahead and use the login function to add our credentials to this session. This function retrieves your login info from a .netrc file if one exists; otherwise it prompts you for a username and password, and creates a .netrc file if you use the persist argument.
auth = earthaccess.login(persist=True)
For the first step of our initial search for AVIRIS-3 data, we can use earthaccess to query the keyword “AVIRIS-3” and find information about the data collections (often referred to as data products or datasets) associated with that keyword. From this search, we will get the unique concept-id associated with the data collection we want to use, AVIRIS-3 L2A Orthocorrected Surface Reflectance, Facility Instrument Collection. A concept-id is a unique identifier for a collection, granule, or service provided by NASA. These types are indicated by a leading C, G, or S, respectively. Using the collection concept-id for AVIRIS-3, we can find granules (scenes) from that collection.
This query returns a list of dictionaries. We request 3 fields: “Shortname”, “EntryTitle”, and “Version”. These fields provide information that will help us select the concept-id we want to use in our search.
# AVIRIS Collection Query
aviris_collection_query = earthaccess.collection_query().keyword('AVIRIS-3')
# Retrieve Relevant Information
aviris_collection_query.fields(['Shortname', 'EntryTitle','Version']).get()
[{
"meta": {
"concept-id": "C3369603199-ORNL_CLOUD",
"granule-count": 511,
"provider-id": "ORNL_CLOUD"
},
"umm": {
"EntryTitle": "AVIRIS-3 L2A Orthocorrected Surface Reflectance, Facility Instrument Collection",
"Version": "1"
}
},
{
"meta": {
"concept-id": "C3236537512-ORNL_CLOUD",
"granule-count": 13239,
"provider-id": "ORNL_CLOUD"
},
"umm": {
"EntryTitle": "AVIRIS-3 L2B Greenhouse Gas Enhancements, Facility Instrument Collection",
"Version": "1"
}
},
{
"meta": {
"concept-id": "C3236537162-ORNL_CLOUD",
"granule-count": 13464,
"provider-id": "ORNL_CLOUD"
},
"umm": {
"EntryTitle": "AVIRIS-3 L1B Calibrated Radiance, Facility Instrument Collection",
"Version": "1"
}
},
{
"meta": {
"concept-id": "C3441567508-ORNL_CLOUD",
"granule-count": 358,
"provider-id": "ORNL_CLOUD"
},
"umm": {
"EntryTitle": "ABoVE: AVIRIS-3 Imaging Spectroscopy for Alaska and Canada, 2023",
"Version": "1"
}
},
{
"meta": {
"concept-id": "C3438248196-LARC_ASDC",
"granule-count": 16,
"provider-id": "LARC_ASDC"
},
"umm": {
"EntryTitle": "SCOAPE-II R/V Point Sur Data",
"Version": "1"
}
},
{
"meta": {
"concept-id": "C3436761049-LARC_ASDC",
"granule-count": 15,
"provider-id": "LARC_ASDC"
},
"umm": {
"EntryTitle": "SCOAPE-II Sondes Data",
"Version": "1"
}
}]
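Picking the right concept-id out of a response like the one above can also be done programmatically. The following is a minimal sketch, assuming the response keeps the list-of-dictionaries shape shown in the output; find_concept_id and concept_type are hypothetical helper names introduced here for illustration and are not part of earthaccess.

```python
# Sample records copied (and trimmed) from the collection query output above.
collections = [
    {
        "meta": {"concept-id": "C3369603199-ORNL_CLOUD", "granule-count": 511},
        "umm": {
            "EntryTitle": "AVIRIS-3 L2A Orthocorrected Surface Reflectance, "
                          "Facility Instrument Collection",
            "Version": "1",
        },
    },
    {
        "meta": {"concept-id": "C3236537162-ORNL_CLOUD", "granule-count": 13464},
        "umm": {
            "EntryTitle": "AVIRIS-3 L1B Calibrated Radiance, "
                          "Facility Instrument Collection",
            "Version": "1",
        },
    },
]

# Leading letter of a concept-id -> record type, as described in the text.
CONCEPT_TYPES = {"C": "collection", "G": "granule", "S": "service"}


def concept_type(concept_id):
    """Return the record type implied by the first letter of a concept-id."""
    return CONCEPT_TYPES.get(concept_id[0], "unknown")


def find_concept_id(records, title_fragment):
    """Return the concept-id of the single record whose EntryTitle contains the fragment."""
    matches = [
        rec["meta"]["concept-id"]
        for rec in records
        if title_fragment.lower() in rec["umm"]["EntryTitle"].lower()
    ]
    if len(matches) != 1:
        raise ValueError(f"Expected exactly one match, found {len(matches)}")
    return matches[0]


l2a_id = find_concept_id(collections, "L2A Orthocorrected Surface Reflectance")
print(l2a_id, concept_type(l2a_id))
```

Raising an error when the fragment matches zero or several titles is a deliberate choice here: a silent wrong match would send every later search to the wrong collection.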
Now that we know the concept-id for the AVIRIS-3 L2A Orthocorrected Surface Reflectance collection, we can provide it as an argument for our data search.
# Conduct Search
results_airborne = earthaccess.search_data(concept_id="C3369603199-ORNL_CLOUD", count=1000)
print(f"Granules Found: {len(results_airborne)}")
Granules Found: 511
2. Wrangling UMM Metadata
The results from our search for AVIRIS-3 data are returned as a list. By selecting an item in the list and using the umm key, we can see a nested dictionary of all the metadata for that specific granule.
results_airborne[0]['umm']
{'TemporalExtent': {'RangeDateTime': {'BeginningDateTime': '2024-09-05T17:58:21Z',
'EndingDateTime': '2024-09-05T17:58:33Z'}},
'GranuleUR': 'AV320240905t175818_000_L2A_RFL_1',
'AdditionalAttributes': [{'Name': 'SOFTWARE_BUILD_VERSION',
'Values': ['010200']},
{'Name': 'Identifier_product_doi_authority', 'Values': ['https://doi.org']},
{'Name': 'FLIGHTLINE', 'Values': ['AV320240905t175818']},
{'Name': 'SCENE', 'Values': ['000']},
{'Name': 'Campaign', 'Values': ['GHG']},
{'Name': 'SOLAR_ZENITH', 'Values': ['38.28']},
{'Name': 'SOLAR_AZIMUTH', 'Values': ['130.17']}],
'SpatialExtent': {'HorizontalSpatialDomain': {'Geometry': {'GPolygons': [{'Boundary': {'Points': [{'Longitude': -118.48373323690792,
'Latitude': 34.29866517224145},
{'Longitude': -118.4833935787755, 'Latitude': 34.279343233082216},
{'Longitude': -118.4604556995513, 'Latitude': 34.27961880197877},
{'Longitude': -118.46079011010045, 'Latitude': 34.29894093970338},
{'Longitude': -118.48373323690792,
'Latitude': 34.29866517224145}]}}]}}},
'ProviderDates': [{'Date': '2025-01-17T04:10:19Z', 'Type': 'Create'}],
'CollectionReference': {'ShortName': 'AV3_L2A_RFL_2357', 'Version': '1'},
'PGEVersionClass': {'PGEName': 'L2A_OE', 'PGEVersion': 'v1.2.0'},
'RelatedUrls': [{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/public/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Type': 'GET RELATED VISUALIZATION'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-public/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET RELATED VISUALIZATION'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',
'Description': 'api endpoint to retrieve temporary credentials valid for same-region direct s3 access',
'Type': 'VIEW RELATED INFORMATION'}],
'DataGranule': {'DayNightFlag': 'Day',
'ArchiveAndDistributionInformation': [{'Name': 'AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'SizeInBytes': 1000324914,
'Format': 'netCDF-4',
'Checksum': {'Value': 'ccea4014b5884aca62d3d826b7e757060c1a3599da6aaef9285035c66bf9c7f91452118654f067f0e24ff129e213161266861356e5ce3d99266ab3972c23d383',
'Algorithm': 'SHA-512'}},
{'Name': 'AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'SizeInBytes': 991189710,
'Format': 'netCDF-4',
'Checksum': {'Value': '0ca7df8fcbc9620b3b5df9d835edce02f5044144ecfd55092f20e51470ca63bc5e6aa81a8a0afddc6aacde005c670fdaf87942c986f6207306d354ee394bbab9',
'Algorithm': 'SHA-512'}},
{'Name': 'AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'SizeInBytes': 4335896,
'Format': 'GeoTIFF',
'Checksum': {'Value': '4e09c66226bda80056abada717c1f5d58287fde5e2cdf8cfc24918a21d9fd06204b834b9b96281800cda6bf9f21aac5b849a5be746898000700886a62975e336',
'Algorithm': 'SHA-512'}},
{'Name': 'AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'SizeInBytes': 296912,
'Format': 'JPEG',
'Checksum': {'Value': 'de07d8f550c8206847b516cda7bdb096cfac0be46c81fb64a1d2657fdf18c822a0bf650a8ce4a728932c18cf8adcb637adce3e0f4538e630976a6b2ef4cf4e97',
'Algorithm': 'SHA-512'}},
{'Name': 'AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'SizeInBytes': 9748,
'Format': 'YAML',
'Checksum': {'Value': '29d95cc290087815358506a34555e56b8cacc73424bd95102c63c1025d34f3a7928a4c2d1d33f694dcbfbd633ae311c3c325e27ee09544abac40f417f381a689',
'Algorithm': 'SHA-512'}}],
'ProductionDateTime': '2025-01-17T03:03:07Z'},
'Platforms': [{'ShortName': 'B-200',
'Instruments': [{'ShortName': 'AVIRIS-3'}]}],
'MetadataSpecification': {'URL': 'https://cdn.earthdata.nasa.gov/umm/granule/v1.6.6',
'Name': 'UMM-G',
'Version': '1.6.6'}}
We can pull information like temporal and spatial extent from the metadata. Granules in CMR have a temporal extent, which can be either a RangeDateTime or a SingleDateTime. For most airborne and orbital data it is a RangeDateTime. Based on this knowledge and the above metadata, we can build a function to return the BeginningDateTime of each scene. We will then use that function to build a unique set of dates for the airborne campaign. With this set of dates we can plan additional searches.
# Get Unique Dates
def get_beginning_dt(result):
    return result['umm']['TemporalExtent']['RangeDateTime']['BeginningDateTime']

timestamps = [get_beginning_dt(result) for result in results_airborne]
unique_dates = set([timestamp.split("T")[0] for timestamp in timestamps])
print(unique_dates)
{'2025-01-23', '2024-09-05', '2025-01-16', '2025-01-11'}
After creating this set of dates, we can use it to group our results. This will help us visualize the data, and set up our strategy for finding orbital data within a specified time window of the flights.
# Group Results By Date
results_by_date = {
    date: [
        result
        for result in results_airborne
        if date in get_beginning_dt(result)
    ]
    for date in unique_dates
}
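The grouping above scans the full result list once per date. For larger campaigns, a single pass may be preferable. The following is a minimal sketch of that alternative using only the standard library; the mock granules and the helper names beginning_date and group_by_date are illustrative assumptions, and the timestamps are made up rather than taken from a real search.

```python
from collections import defaultdict

# Mock granules with the same nesting as the UMM metadata shown earlier.
# The timestamps are illustrative, not real search results.
mock_results = [
    {"umm": {"TemporalExtent": {"RangeDateTime": {"BeginningDateTime": "2024-09-05T17:58:21Z"}}}},
    {"umm": {"TemporalExtent": {"RangeDateTime": {"BeginningDateTime": "2024-09-05T18:10:02Z"}}}},
    {"umm": {"TemporalExtent": {"RangeDateTime": {"BeginningDateTime": "2025-01-11T20:01:45Z"}}}},
]


def beginning_date(result):
    """Return the YYYY-MM-DD part of a granule's BeginningDateTime."""
    stamp = result["umm"]["TemporalExtent"]["RangeDateTime"]["BeginningDateTime"]
    return stamp.split("T")[0]


def group_by_date(results):
    """Group granules by acquisition date in a single pass over the list."""
    grouped = defaultdict(list)
    for result in results:
        grouped[beginning_date(result)].append(result)
    return dict(grouped)


grouped = group_by_date(mock_results)
for date in sorted(grouped):
    print(date, len(grouped[date]))
```

Keying on the exact date string also avoids relying on a substring test against the full timestamp.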
As mentioned previously, the metadata for each granule also has the spatial extent associated with the acquisition. We can wrangle this into an appropriate format for additional searches, and visualizations.
3. Visualizing Results
For visualizing the results, we’ll import a get_vertices function from the search_util module contained in this repository. This function pulls the spatial information out of the results and formats it as a list of polygon vertices so it can be used for folium plots or searches with earthaccess.
from modules.search_util import get_vertices
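The source of get_vertices is not shown in this notebook. To make the idea concrete, here is a hypothetical sketch of how such a helper could read the GPolygons boundary from the UMM metadata shown earlier. The name get_vertices_sketch and its behavior are assumptions for illustration only; the repository's function may handle more cases (multiple polygons, bounding rectangles, and so on).

```python
def get_vertices_sketch(result, lat_lon=False):
    """Return the vertices of a granule's first GPolygon boundary.

    Hypothetical stand-in for illustration; the repository's get_vertices
    may differ. Returns (lat, lon) pairs when lat_lon is True (the order
    folium expects), otherwise (lon, lat) pairs (the order shapely expects).
    """
    geometry = result["umm"]["SpatialExtent"]["HorizontalSpatialDomain"]["Geometry"]
    points = geometry["GPolygons"][0]["Boundary"]["Points"]
    if lat_lon:
        return [(p["Latitude"], p["Longitude"]) for p in points]
    return [(p["Longitude"], p["Latitude"]) for p in points]


# Trimmed copy of the spatial metadata shown earlier (coordinates rounded).
granule = {"umm": {"SpatialExtent": {"HorizontalSpatialDomain": {"Geometry": {
    "GPolygons": [{"Boundary": {"Points": [
        {"Longitude": -118.4837, "Latitude": 34.2987},
        {"Longitude": -118.4834, "Latitude": 34.2793},
        {"Longitude": -118.4605, "Latitude": 34.2796},
        {"Longitude": -118.4608, "Latitude": 34.2989},
        {"Longitude": -118.4837, "Latitude": 34.2987},
    ]}}]
}}}}}

print(get_vertices_sketch(granule, lat_lon=True)[:2])
print(get_vertices_sketch(granule, lat_lon=False)[:2])
```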
Use folium and our get_vertices function to plot the geometry of each acquisition, grouped by date.
# Visualize our Airborne Campaign Location
fig = Figure(width="750px", height="375px")
# Create Map
m = folium.Map(tiles=None)
fig.add_child(m)
# Add Basemap
folium.TileLayer(
    tiles=(
        "https://server.arcgisonline.com/ArcGIS/rest/"
        "services/World_Imagery/MapServer/tile/{z}/{y}/{x}"
    ),
    name="ESRI Satellite",
    attr="Esri",
    overlay=False,
    control=True
).add_to(m)
# Define Colormap List (hex) - (https://colorbrewer2.org/#type=qualitative&scheme=Set3&n=12)
cmap = ['#8dd3c7','#ffffb3','#bebada','#fb8072','#80b1d3','#fdb462','#b3de69','#fccde5','#d9d9d9','#bc80bd','#ccebc5','#ffed6f']

# Add Flightline Polygons by Date
for i, date in enumerate(sorted(results_by_date.keys())):
    # Create FeatureGroup for a Date
    fg = folium.FeatureGroup(name=f"{date} AVIRIS-3 Flightlines")
    color = cmap[i % len(cmap)]
    for record in results_by_date[date]:
        # Define Tooltip
        tooltip_dict = {"Granule ID": record['umm']['GranuleUR'], **record['umm']['TemporalExtent']['RangeDateTime']}
        html = "<br>".join(f"<b>{k}</b>: {v}" for k, v in tooltip_dict.items())
        # Add Polygon
        folium.Polygon(
            locations=get_vertices(record, lat_lon=True),
            color=color,
            weight=2,
            fill_opacity=.2,
            tooltip=html
        ).add_to(fg)
    fg.add_to(m)
# Set Bounds and Add Layer Control Widget
m.fit_bounds(bounds=fg.get_bounds())
folium.LayerControl().add_to(m)
fig
We can pick a date here, for example 2024-09-05, and then search for data collections that fall within the same spatial area during a desired timeframe. Let’s see what ECOSTRESS data fall within a chosen time window around the airborne overpass on 2024-09-05.
First, let’s simplify our search region. Instead of searching with every flightline polygon on that date, we can merge intersecting ones to conduct fewer queries. If you have a very large number of polygons (100s-1000s), the queries can start to take a long time. We loop over our results on the selected date, retrieve the vertices, put them into shapely polygons, and simplify them using a unary union. This process merges all touching polygons and provides us with the merged geometries.
Note: The spatial extent could be further simplified into a convex hull if that makes more sense for your application, or you have a resulting polygon with too many vertices. There is a limit of ~16,000 vertices.
selected_date = '2024-09-05'
polys = []
for result in results_by_date[selected_date]:
    coords = get_vertices(result, lat_lon=False)
    poly = polygon.orient(Polygon(coords), sign=1.0)
    polys.append(poly)
merged = unary_union(polys)
When searching with a polygon, earthaccess expects a list of coordinates that form a ring in counter-clockwise order. To accomplish this we’ll convert each merged geometry into a list of coordinates using a list comprehension and the orient function from shapely.geometry.polygon.
merged_rois = [list(polygon.orient(poly,sign=1.0).exterior.coords) for poly in list(merged.geoms)]
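To see what "counter-clockwise order" means in practice, the ring orientation can be checked with the shoelace formula. The following is a minimal, standard-library sketch and not how shapely implements orient; signed_area and ensure_ccw are hypothetical helper names. It treats longitude and latitude as planar x and y, which is a reasonable assumption only for small rings that do not cross the antimeridian or a pole.

```python
def signed_area(ring):
    """Shoelace formula for a closed ring (first vertex repeated at the end).

    Positive for counter-clockwise rings, negative for clockwise rings.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_ccw(ring):
    """Return the ring in counter-clockwise order, reversing it if needed."""
    return list(ring) if signed_area(ring) > 0 else list(reversed(ring))


# A unit square listed clockwise as (x, y) pairs, closed on its first vertex.
clockwise = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]

print(signed_area(clockwise))
print(ensure_ccw(clockwise))
```

A check like this can be handy when polygon coordinates come from a source other than shapely and a search unexpectedly returns results for the wrong area.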
After merging, we’ve reduced the geometries down to 2 features, meaning only 2 search queries are required. Now add these merged polygons to our existing map.
# Add Merged Flightlines to existing figure
for roi in merged_rois:
    folium.Polygon(
        name="Merged AVIRIS-3 Granules",
        locations=[(lat,lon) for lon, lat in roi],
        color='black',
        tooltip=f'Merged AVIRIS-3 Granules Footprint: {selected_date}'
    ).add_to(m)
fig
4. Second Search (ECOSTRESS)
Now that we have our spatial constraints for our orbital data search, we just need to retrieve the concept-id for the collection we want, and then select a temporal range that we want our ECOSTRESS results to fall within.
First, search for the ECOSTRESS L2 Tiled collection and retrieve its concept-id.
# ECOSTRESS Collection Query
eco_collection_query = earthaccess.collection_query().keyword('ECOSTRESS L2T')
eco_collection_query.fields(['Shortname', 'EntryTitle','Version']).get()
[{
"meta": {
"concept-id": "C2076090826-LPCLOUD",
"granule-count": 12900646,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Tiled Land Surface Temperature and Emissivity Instantaneous L2 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2076114664-LPCLOUD",
"granule-count": 686021,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Swath Land Surface Temperature and Emissivity Instantaneous L2 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2090073749-LPCLOUD",
"granule-count": 1748054,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Tiled Ancillary NDVI and Albedo L2 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2074855428-LPCLOUD",
"granule-count": 64112,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Gridded Surface Energy Balance Instantaneous L3 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2074852168-LPCLOUD",
"granule-count": 1319520,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Tiled Surface Energy Balance Instantaneous L3 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2076113037-LPCLOUD",
"granule-count": 390598,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Gridded Land Surface Temperature and Emissivity Instantaneous L2 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2074842795-LPCLOUD",
"granule-count": 1540413,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "ECOSTRESS Tiled Top of Atmosphere Calibrated Radiance Instantaneous L2 Global 70 m V002",
"Version": "002"
}
},
{
"meta": {
"concept-id": "C2204555942-LPDAAC_ECS",
"granule-count": 2,
"provider-id": "LPDAAC_ECS"
},
"umm": {
"EntryTitle": "ECOSTRESS Tiled Top of Atmosphere Calibrated Radiance Instantaneous L2 Global 70 m V002",
"Version": "002"
}
}]
Choose a temporal range to search for orbital data. We’ll select roughly +/- 10 days around our first AVIRIS-3 campaign date. We can provide this range to our search as a tuple of datetime strings. These can be provided in the format below, or truncated to just the date in the format YYYY-MM-DD.
temporal_range = ('2024-08-25T00:00:00Z','2024-09-15T23:59:59Z')
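Rather than typing the range by hand, the window can be derived from the flight date. The following is a small sketch using only the standard library; temporal_window is a hypothetical helper name. It computes a strictly symmetric window, so its start date can differ by a day from the hand-written range above.

```python
from datetime import datetime, timedelta


def temporal_window(date_str, days):
    """Return (start, end) datetime strings spanning +/- `days` around a date."""
    center = datetime.strptime(date_str, "%Y-%m-%d")
    start = center - timedelta(days=days)
    # End at the last second of the final day in the window.
    end = center + timedelta(days=days, hours=23, minutes=59, seconds=59)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (start.strftime(fmt), end.strftime(fmt))


window = temporal_window("2024-09-05", 10)
print(window)
```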
Now we can define our parameters and search for ECOSTRESS data. To do this, we’ll search for each ROI and combine the results. One thing to note here is that since the ROIs we have provided are small relative to ECOSTRESS scenes, the same scene will likely be a result for each search. To avoid having duplicates in our results, we’ll omit granules that are already present in our results.
# Orbital Search Queries
eco_results = []
seen = set()
# Check each ROI
for roi in merged_rois:
    search_params = {
        "concept_id":"C2076090826-LPCLOUD",
        "temporal":temporal_range,
        "polygon": roi
    }
    results_part = earthaccess.search_data(**search_params)
    # Ensure duplicates are not added
    for item in results_part:
        if item not in seen:
            seen.add(item)
            eco_results.append(item)
print(f"Granules Found: {len(eco_results)}")
Granules Found: 34
This process can be slow. For better performance, it can be implemented in parallel or using asynchronous requests. Just make sure you’re not overwhelming the service with the quantity of requests.
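One way to run the per-ROI searches in parallel is a small thread pool. The following is a minimal sketch under stated assumptions: search_rois_concurrently is a hypothetical helper, the search itself is passed in as a function so the example runs offline with a stand-in, and granules are de-duplicated by their GranuleUR, which is assumed to be unique per granule.

```python
from concurrent.futures import ThreadPoolExecutor


def search_rois_concurrently(rois, search_fn, max_workers=2):
    """Run search_fn(roi) for each ROI in a thread pool and merge unique granules."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `rois`.
        parts = list(pool.map(search_fn, rois))
    merged, seen = [], set()
    for part in parts:
        for granule in part:
            key = granule["umm"]["GranuleUR"]  # assumed unique per granule
            if key not in seen:
                seen.add(key)
                merged.append(granule)
    return merged


# Stand-in for a real search so the sketch runs without network access.
def fake_search(roi):
    names = {"roi_a": ["scene_1", "scene_2"], "roi_b": ["scene_2", "scene_3"]}
    return [{"umm": {"GranuleUR": name}} for name in names[roi]]


results = search_rois_concurrently(["roi_a", "roi_b"], fake_search)
print([g["umm"]["GranuleUR"] for g in results])
```

With earthaccess, search_fn could be a small wrapper around earthaccess.search_data that fixes concept_id and temporal and passes each ROI as polygon. Keeping max_workers small is one way to avoid overwhelming the service; whether concurrent calls behave well in your environment is worth verifying before relying on this.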
Import the get_asset_urls helper function from the search_util module to get URLs associated with results. It can be used to retrieve links to data or browse imagery (quicklooks).
from modules.search_util import get_asset_urls
Now visualize our ECOSTRESS results. We’ll use the above function in our folium plot to visualize the browse imagery for ECOSTRESS scenes.
Note: Not all data collections have browse imagery for each granule, and some that do are not designed to be plotted using the extent geometry, like we do below.
# Visualize ECOSTRESS Results over the Airborne Campaign Location
eco_fig = Figure(width="750px", height="375px")

# Create Map
eco_map = folium.Map(tiles=None)
eco_fig.add_child(eco_map)
# Add Basemap
folium.TileLayer(
    tiles=(
        "https://server.arcgisonline.com/ArcGIS/rest/"
        "services/World_Imagery/MapServer/tile/{z}/{y}/{x}"
    ),
    name="ESRI Satellite",
    attr="Esri",
    overlay=False,
    control=True
).add_to(eco_map)
# Define Colormap List (hex) - (https://colorbrewer2.org/#type=qualitative&scheme=Set3&n=12)
cmap = ['#8dd3c7','#ffffb3','#bebada','#fb8072','#80b1d3','#fdb462','#b3de69','#fccde5','#d9d9d9','#bc80bd','#ccebc5','#ffed6f']

# Add ECOSTRESS Granule Polygons
for i, result in enumerate(eco_results):
    # Create tooltip for each feature
    tooltip_meta = {"Granule ID": result['umm']['GranuleUR'], **result['umm']['TemporalExtent']['RangeDateTime']}

    # Convert tooltip to html
    html = "<br>".join(f"<b>{k}</b>: {v}" for k, v in tooltip_meta.items())

    # Create FeatureGroup
    fg = folium.FeatureGroup(name=tooltip_meta['Granule ID'])

    # Get Coordinates and Color
    coords = get_vertices(result, lat_lon=True)
    color = cmap[i % len(cmap)]

    # Add Polygons
    result_poly = folium.Polygon(
        locations=coords,
        color=color,
        weight=2,
        fill=True,
        fill_opacity=0,
        tooltip=html
    ).add_to(fg)
    fg.add_to(eco_map)
    # Add Browse Image
    bounds = result_poly.get_bounds()
    folium.raster_layers.ImageOverlay(
        image=get_asset_urls(result, extension=".png", first_only=True),
        bounds=bounds,
        opacity=0.75,
        zindex=1,
    ).add_to(fg)
# Add Merged ROIs from Search
for roi in merged_rois:
    folium.Polygon(
        locations=[(lat,lon) for lon, lat in roi],
        color='black'
    ).add_to(eco_map)

# Set Bounds and Add Layer Control Widget
eco_map.fit_bounds(bounds=fg.get_bounds())
folium.LayerControl().add_to(eco_map)
eco_fig
5. Third Search (EMIT)
As with our ECOSTRESS search, we can do the same for EMIT L2A Reflectance. Conduct a collection query to retrieve the concept-id, then search for data as we previously did for ECOSTRESS.
# EMIT Collection Query
emit_collection_query = earthaccess.collection_query().keyword('EMIT L2A')
emit_collection_query.fields(['Shortname', 'EntryTitle','Version']).get()
[{
"meta": {
"concept-id": "C2408750690-LPCLOUD",
"granule-count": 151308,
"provider-id": "LPCLOUD"
},
"umm": {
"EntryTitle": "EMIT L2A Estimated Surface Reflectance and Uncertainty and Masks 60 m V001",
"Version": "001"
}
}]
# Orbital Search Queries
emit_results = []
seen = set()
# Check each ROI
for roi in merged_rois:
    search_params = {
        "concept_id":"C2408750690-LPCLOUD",
        "temporal":temporal_range,
        "polygon": roi
    }
    results_part = earthaccess.search_data(**search_params)
    # Ensure duplicates are not added
    for item in results_part:
        if item not in seen:
            seen.add(item)
            emit_results.append(item)
print(f"Granules Found: {len(emit_results)}")
Granules Found: 3
Build the same type of folium figure.
# Visualize EMIT Results on a Copy of the ECOSTRESS Map
emit_eco_fig = Figure(width="750px", height="375px")
emit_eco_map = copy.deepcopy(eco_map)

# Remove old Layer Control
for key, child in list(emit_eco_map._children.items()):
    if isinstance(child, folium.LayerControl):
        emit_eco_map._children.pop(key)
# Add New Map to Figure
emit_eco_fig.add_child(emit_eco_map)
# Define Colormap List (hex) - (https://colorbrewer2.org/#type=qualitative&scheme=Paired&n=12)
cmap_emit = ['#a6cee3','#1f78b4','#b2df8a','#33a02c','#fb9a99','#e31a1c','#fdbf6f','#ff7f00','#cab2d6','#6a3d9a','#ffff99','#b15928']

# Add EMIT Granule Polygons
for i, result in enumerate(emit_results):
    # Create tooltip for each feature
    tooltip_meta = {"Granule ID": result['umm']['GranuleUR'], **result['umm']['TemporalExtent']['RangeDateTime']}

    # Convert tooltip to html
    html = "<br>".join(f"<b>{k}</b>: {v}" for k, v in tooltip_meta.items())

    # Create FeatureGroup
    fg = folium.FeatureGroup(name=tooltip_meta['Granule ID'])

    # Get Coordinates and Color
    coords = get_vertices(result, lat_lon=True)
    color = cmap_emit[i % len(cmap_emit)]

    # Add Polygons
    result_poly = folium.Polygon(
        locations=coords,
        color=color,
        weight=2,
        fill=True,
        fill_opacity=0,
        tooltip=html
    ).add_to(fg)
    fg.add_to(emit_eco_map)
    # No Browse Image for EMIT since it's not orthorectified
# Set Bounds and Add Layer Control Widget
emit_eco_map.fit_bounds(bounds=fg.get_bounds())
folium.LayerControl().add_to(emit_eco_map)
emit_eco_fig
6. Selecting Assets
We can view what types of files are available by looking at the RelatedUrls list in the granule metadata.
results_by_date['2024-09-05'][0]['umm']['RelatedUrls']
[{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Type': 'GET DATA'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/public/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Description': 'Download AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Type': 'GET RELATED VISUALIZATION'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET DATA VIA DIRECT ACCESS'},
{'URL': 's3://ornl-cumulus-prod-public/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'Description': 'This link provides direct download access via S3 to the granule',
'Type': 'GET RELATED VISUALIZATION'},
{'URL': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',
'Description': 'api endpoint to retrieve temporary credentials valid for same-region direct s3 access',
'Type': 'VIEW RELATED INFORMATION'}]
The get_asset_urls function retrieves just the URLs from this list, using string matching to filter for the assets we want. For example, we can list all of the HTTPS links for a granule:
get_asset_urls(results_by_date['2024-09-05'][0])
['https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d.yaml',
'https://data.ornldaac.earthdata.nasa.gov/public/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_BROWSE.jpg',
'https://data.ornldaac.earthdata.nasa.gov/s3credentials']
Or we can grab just the reflectance .nc files.
get_asset_urls(results_by_date['2024-09-05'][0], extension=".nc", contains="_RFL_ORT")
['https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc']
Note that since our extension and contains arguments rely on simple string matching, any string that appears in several of the filenames (for example RFL_ORT without a leading _, rather than _RFL_ORT) will return several asset URLs.
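The effect of plain string matching can be illustrated with a small stand-alone filter. This is a hypothetical sketch, not the repository's get_asset_urls: filter_urls is an assumed name, and it accepts only single strings, whereas the helper used in this notebook also accepts lists.

```python
def filter_urls(urls, extension=None, contains=None):
    """Keep entries that end with `extension` and contain `contains` (plain substring tests)."""
    kept = []
    for url in urls:
        if extension is not None and not url.endswith(extension):
            continue
        if contains is not None and contains not in url:
            continue
        kept.append(url)
    return kept


# File names taken from the RelatedUrls listing above, shortened to the base name.
names = [
    "AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc",
    "AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc",
    "AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT_QL.tif",
    "AV320240905t175818_000_L2A_OE_f576f24d.yaml",
]

# A specific substring selects a single file.
print(filter_urls(names, extension=".nc", contains="_RFL_ORT"))
# A looser substring matches both netCDF files.
print(filter_urls(names, extension=".nc", contains="ORT"))
```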
We can use this function on our lists of results to build a simple list of urls for the assets we want.
# Get Results for a specific Date
aviris_links = [
    url
    for result in results_by_date['2024-09-05']
    for url in get_asset_urls(result, extension=[".nc"])
]
len(aviris_links)
574
aviris_links[:5]
['https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_RFL_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_000_L2A_OE_f576f24d_UNC_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_001_L2A_OE_f576f24d_RFL_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_001_L2A_OE_f576f24d_UNC_ORT.nc',
'https://data.ornldaac.earthdata.nasa.gov/protected/aviris/AV3_L2A_RFL/data/AV320240905t175818_002_L2A_OE_f576f24d_RFL_ORT.nc']
Repeat this selection process for ECOSTRESS, retrieving the urls for the LST and cloud assets.
eco_links = [
    url
    for result in eco_results
    for url in get_asset_urls(result, contains=["_LST.", "_cloud"], extension=".tif")
]
len(eco_links)
68
eco_links[:5]
['https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34798_012_11SMT_20240825T173046_0712_01/ECOv002_L2T_LSTE_34798_012_11SMT_20240825T173046_0712_01_cloud.tif',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34798_012_11SMT_20240825T173046_0712_01/ECOv002_L2T_LSTE_34798_012_11SMT_20240825T173046_0712_01_LST.tif',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34798_012_11SLT_20240825T173046_0712_01/ECOv002_L2T_LSTE_34798_012_11SLT_20240825T173046_0712_01_cloud.tif',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34798_012_11SLT_20240825T173046_0712_01/ECOv002_L2T_LSTE_34798_012_11SLT_20240825T173046_0712_01_LST.tif',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34885_005_11SLT_20240831T074811_0712_01/ECOv002_L2T_LSTE_34885_005_11SLT_20240831T074811_0712_01_cloud.tif']
Now repeat this process again for EMIT, selecting the reflectance and mask file urls.
emit_links = [
    url for result in emit_results
    for url in get_asset_urls(result, contains=["_RFL_", "MASK"], extension=".nc")
]
len(emit_links)
6
emit_links
['https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173115_2423811_008/EMIT_L2A_RFL_001_20240825T173115_2423811_008.nc',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173115_2423811_008/EMIT_L2A_MASK_001_20240825T173115_2423811_008.nc',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173127_2423811_009/EMIT_L2A_RFL_001_20240825T173127_2423811_009.nc',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173127_2423811_009/EMIT_L2A_MASK_001_20240825T173127_2423811_009.nc',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173104_2423811_007/EMIT_L2A_RFL_001_20240825T173104_2423811_007.nc',
'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240825T173104_2423811_007/EMIT_L2A_MASK_001_20240825T173104_2423811_007.nc']
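Because the EMIT list interleaves reflectance and mask URLs, it can be handy to pair them by granule. The sketch below is one possible approach, not part of the original workflow: it assumes each granule's files share a parent directory in the URL (as in the output above) and that mask filenames contain _MASK_. The pairs dictionary and the "reflectance"/"mask" keys are names chosen for this illustration.

```python
from collections import defaultdict

base = "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/"
# Two granules taken from the output above
emit_links = [
    base + "EMIT_L2A_RFL_001_20240825T173115_2423811_008/EMIT_L2A_RFL_001_20240825T173115_2423811_008.nc",
    base + "EMIT_L2A_RFL_001_20240825T173115_2423811_008/EMIT_L2A_MASK_001_20240825T173115_2423811_008.nc",
    base + "EMIT_L2A_RFL_001_20240825T173127_2423811_009/EMIT_L2A_RFL_001_20240825T173127_2423811_009.nc",
    base + "EMIT_L2A_RFL_001_20240825T173127_2423811_009/EMIT_L2A_MASK_001_20240825T173127_2423811_009.nc",
]

# Group URLs by the granule directory (second-to-last path component)
pairs = defaultdict(dict)
for url in emit_links:
    granule_dir, filename = url.rsplit("/", 2)[-2:]
    kind = "mask" if "_MASK_" in filename else "reflectance"
    pairs[granule_dir][kind] = url

for granule, files in pairs.items():
    print(granule, sorted(files))
```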
7. Downloading and Streaming Data
Depending on the objectives of your workflow and your internet speed, you can choose to download or stream the data. In this notebook we will show how to work with the data locally, outside of the Earthdata Cloud (AWS us-west-2). For more about working with these datasets in the cloud, please reference How to: Directly Access ECOSTRESS Data (S3) for an example of opening cloud-optimized GeoTIFFs, or How to: Use Direct S3 Access to work with EMIT Data for an example of opening netCDF4 files.
Next, we’ll stream (directly access) data from the AVIRIS-3 and ECOSTRESS datasets, then we’ll show how to download EMIT data. A download-based workflow typically fits better for larger files or older formats when running analyses outside of the cloud.
First, import the libraries we’ll use to open the data. For EMIT we will use a module included in the repository to open and orthorectify the data.
import xarray as xr
import rasterio as rio
import rioxarray as rxr
import hvplot.xarray
from modules.emit_tools import emit_xarray
To access data from NASA, you’ll need to provide credentials. When streaming, this is best done using the token or cookies set up by the earthaccess library. Since we’ve already logged in, we can start an fsspec session to manage our connection to a remote file, including sending credentials. This allows other libraries to work with a URL as if it were a local file.
fs = earthaccess.get_fsspec_https_session()
From our results, we’ll first open an AVIRIS-3 file. These files are in hierarchical NetCDF4 format, meaning they have multiple groups with datasets within each. To open all of the groups, we’ll use the open_datatree function from xarray.
aviris_file = fs.open(aviris_links[0])
aviris_ds = xr.open_datatree(aviris_file)
aviris_ds
This opens the file lazily, meaning we haven’t loaded any data yet, just metadata. Before we load the data, we can subset it or select only the groups we want data from. For example, below we grab the reflectance variable from the reflectance group, select a wavelength, and plot using hvplot.
aviris_850 = aviris_ds['reflectance']['reflectance'].sel(wavelength=850, method='nearest').compute()
aviris_850.hvplot.image(x="easting", y="northing", cmap='viridis', frame_width=750, aspect='equal')
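For intuition about what the nearest-neighbor selection does, the sketch below reproduces the idea with plain NumPy: find the band whose center wavelength has the smallest absolute difference from the target. The wavelength values are made up for illustration and are not the instrument's actual band centers.

```python
import numpy as np

# Illustrative band centers in nm (made-up values, not the real AVIRIS-3 grid)
wavelengths = np.array([835.2, 842.6, 850.1, 857.5, 864.9])
target = 850

# Index of the band with the smallest absolute distance to the target
nearest_index = int(np.abs(wavelengths - target).argmin())
nearest_wavelength = float(wavelengths[nearest_index])
print(nearest_index, nearest_wavelength)
```

Because the match is approximate, it is worth checking the wavelength coordinate of the selected band to confirm which band was actually returned.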
Now, let’s open an ECOSTRESS L2T LSTE file. These are provided as cloud-optimized GeoTIFFs. To open them, we can similarly use a session to manage our remote connection to a URL. For GeoTIFF files, the best performance is achieved by using a session from rasterio. It is often best to include a retry count and retry delay, which prevents your workflow from breaking if there are connectivity issues.
# Cookies
env = rio.Env(
    GDAL_HTTP_COOKIEFILE="~/cookies.txt",
    GDAL_HTTP_COOKIEJAR="~/cookies.txt",
    GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR",
    GDAL_HTTP_MAX_RETRY="10",
    GDAL_HTTP_RETRY_DELAY="0.5",
)
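Conceptually, the retry settings ask GDAL to try a failed HTTP request again after a short pause instead of raising an error immediately. The pure-Python sketch below illustrates that general pattern only; GDAL's internal retry logic is its own implementation and may differ in which errors it retries and how it spaces attempts. The with_retries and flaky_read names are hypothetical helpers written for this example.

```python
import time

def with_retries(func, max_retry=10, retry_delay=0.5):
    """Call func(); on OSError, wait retry_delay seconds and retry, up to max_retry retries."""
    for attempt in range(max_retry + 1):
        try:
            return func()
        except OSError:
            if attempt == max_retry:
                raise  # out of retries, surface the error
            time.sleep(retry_delay)

# Simulated flaky read that fails twice before succeeding
calls = {"count": 0}

def flaky_read():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("simulated connection drop")
    return "data"

result = with_retries(flaky_read, max_retry=10, retry_delay=0.01)
print(result, calls["count"])
```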
Enter the rasterio session.
# Enter our Rasterio Session for the rest of the notebook
env.__enter__()
Open the file, then squeeze to remove the ‘band’ dimension from the array since we only have one band.
eco_ds = rxr.open_rasterio(eco_links[1], mask_and_scale=True).squeeze('band', drop=True)
eco_ds
Visualize using hvplot.image.
eco_ds.hvplot.image(x='x', y='y', cmap='Spectral_r', aspect='equal', title="Surface Temperature (K)", frame_width=750)
# Exit rasterio session
env.__exit__(None, None, None)
Similar to the AVIRIS-3 data, EMIT data is in NetCDF4 format, so we can use an fsspec session to stream the data. Unlike the AVIRIS-3 data, however, EMIT L2A Reflectance Version 1 data is not orthorectified or chunked for streaming. This means it’s typically better to download this data unless you’re working in the cloud. First we’ll download an EMIT scene from our list, then we’ll open and orthorectify the data and visualize it.
Import the download_granules function from modules.search_util. Then create a list containing a single URL and use the function to download the scene.
from modules.search_util import download_granules
single_emit_url_list = [emit_links[0]]
download_granules(url_list=single_emit_url_list, output_directory='../data/')
Now get the filepath to the local copy of the EMIT L2A RFL scene.
emit_file = f'../data/{single_emit_url_list[0].split("/")[-1]}'
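As an alternative to splitting the URL string by hand, the standard library can parse the URL and extract the filename. This is a sketch of an equivalent approach, assuming (as the line above implies) that the downloaded file keeps the URL's base name; the local_path variable is a name chosen for this illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

url = (
    "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/"
    "EMIT_L2A_RFL_001_20240825T173115_2423811_008/"
    "EMIT_L2A_RFL_001_20240825T173115_2423811_008.nc"
)

# urlparse isolates the path, so any query string would not leak into the filename
filename = PurePosixPath(urlparse(url).path).name
local_path = PurePosixPath("../data") / filename
print(local_path)
```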
Open using the emit_xarray function, with the ortho=True argument to orthorectify the image.
emit_ds = emit_xarray(emit_file, ortho=True)
emit_ds
# Set fill values to np.nan
emit_ds['reflectance'].data[emit_ds['reflectance'].data==-9999] = np.nan
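The same fill-value masking can be tried on a small stand-alone array. The sketch below uses np.where, which returns a new array instead of modifying the original in place; the sample values are made up, and -9999 is the fill value used in the cell above. Note that NaN can only be stored in floating-point arrays.

```python
import numpy as np

# Small made-up reflectance array containing one fill value
reflectance = np.array([[0.12, -9999.0], [0.30, 0.45]], dtype=np.float32)

# Replace fill values with NaN; the input array is left unchanged
masked = np.where(reflectance == -9999, np.nan, reflectance)
print(masked)
```

Plotting and statistics functions that are NaN-aware (for example np.nanmean) will then ignore the masked pixels.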
Select the band closest to 850 nm and visualize using hvplot, as we did for the AVIRIS-3 data.
emit_850 = emit_ds['reflectance'].sel(wavelengths=850, method='nearest')
emit_850.hvplot.image(x="longitude", y="latitude", cmap='viridis', frame_width=750, aspect='equal')
Contact Info:
Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://www.earthdata.nasa.gov/centers/lp-daac
¹Work performed under USGS contract 140G0121D0001 for NASA contract NNG14HH33I.