import earthaccess
import os
import warnings
import csv
from osgeo import gdal
import numpy as np
import math
import rasterio as rio
import xarray as xr
import holoviews as hv
import hvplot.xarray
import netCDF4 as nc
# This will ignore some warnings caused by holoviews
'ignore') warnings.simplefilter(
This notebook is from EMIT-Data-Resources
Source: Exploring EMIT L2A Reflectance
Imported on: 2025-01-22
Exploring L2A Reflectance
Summary
In this notebook we will open a Level 2A (L2A) Reflectance product file (netcdf) from Earth Surface Mineral Dust Source Investigation (EMIT). We will inspect the structure and plot the spectra of individual pixels and spatial coverage of a single band. Next, we will orthorectify the imagery using the included geometry lookup table (GLT). Finally, we will take advantage of the holoviews streams
module to build an interactive plot.
Background
The EMIT instrument is an imaging spectrometer that measures light in visible and infrared wavelengths. These measurements display unique spectral signatures that correspond to the composition on the Earth’s surface. The EMIT mission focuses specifically on mapping the composition of minerals to better understand the effects of mineral dust throughout the Earth system and human populations now and in the future. More details about EMIT and its associated products can be found in the README.md and on the EMIT website.
The L2A Reflectance Product contains estimated surface reflectance. Surface reflectance is the fraction of incoming solar radiation reflected Earth’s surface. Materials reflect proportions of radiation differently based upon their chemical composition. This means that reflectance information can be used to determine the composition of a target. In this guide you will learn how to plot a band from the L2A reflectance spatially and look at the spectral curve associated with individual pixels.
References - Leith, Alex. 2023. Hyperspectral Notebooks. Jupyter Notebook. Auspatious. https://github.com/auspatious/hyperspectral-notebooks/tree/main
Requirements - Set up Python Environment - See setup_instructions.md in the /setup/
folder
Learning Objectives
- How to open an EMIT .nc
file as an xarray.Dataset
- Apply the Geometry Lookup Table (GLT) to orthorectify the image. - How to plot the spectra of pixels - How to plot specific bands as images - How to make an interactive plot to visualize spectra
Tutorial Outline
1.1 Setup
1.2 Opening The Data
1.3 Plotting Data - Non-Orthorectified
1.4 Orthorectification
1.5 Plotting Data - Orthorectified
1.6 Saving Orthorectified Data
1.7 Interactive Plots
1.1 Setup
Import the required Python libraries.
Login to your NASA Earthdata account and create a .netrc
file using the login
function from the earthaccess
library. If you do not have an Earthdata Account, you can create one here.
=True) earthaccess.login(persist
For this notebook we will download the files necessary using earthaccess
. You can also access the data in place or stream it, but this can slow due to the file sizes. Provide a URL for an EMIT L2A Reflectance granule.
= 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc' url
Get an HTTPS Session using your earthdata login, set a local path to save the file, and download the granule asset - This may take a while, the reflectance file is approximately 1.8 GB.
# Get requests https Session using Earthdata Login Info
= earthaccess.get_requests_https_session()
fs # Retrieve granule asset ID from URL (to maintain existing naming convention)
= url.split('/')[-1]
granule_asset_id # Define Local Filepath
= f'../../data/{granule_asset_id}'
fp # Download the Granule Asset if it doesn't exist
if not os.path.isfile(fp):
with fs.get(url,stream=True) as src:
with open(fp,'wb') as dst:
for chunk in src.iter_content(chunk_size=64*1024*1024):
dst.write(chunk)
1.2 Opening EMIT Data
EMIT L2A Reflectance Data are distributed in a non-orthocorrected spatially raw NetCDF4 (.nc) format consisting of the data and its associated metadata. Inside the L2A Reflectance .nc
file there are 3 groups. Groups can be thought of as containers to organize the data.
- The root group that can be considered the main dataset contains the reflectance data described by the downtrack, crosstrack, and bands dimensions.
- The
sensor_band_parameters
group containing the wavelength center and the full-width half maximum (FWHM) of each band.
- The
location
group contains latitude and longitude values at the center of each pixel described by the crosstrack and downtrack dimensions, as well as a geometry lookup table (GLT) described by the ortho_x and ortho_y dimensions. The GLT is an orthorectified image (EPSG:4326) consisting of 2 layers containing downtrack and crosstrack indices. These index positions allow us to quickly project the raw data onto this geographic grid.
To access the .nc
file we will use the netCDF4
and xarray
libraries. The netCDF4
library will be used to explore thee data structure, then we will use xarray
to work with the data. xarray
is a python package for working with labelled multi-dimensional arrays. It provides a data model where data, dimensions, and attributes together in an easily interpretable way.
= nc.Dataset(fp)
ds_nc ds_nc
'location'] ds_nc[
From this output, we can see the reflectance
variable, and the sensor_band_parameters
and location
groups. We can also see the dimensions, their sizes, and file metadata.
Now that we have a better understanding of the structure of the file, read the EMIT data as an xarray.Dataset
and preview it.
= xr.open_dataset(fp)
ds ds
This xarray
dataset only contains the reflectance
variable and attributes metadata, not the data from the other groups in the file. This is because xarray
only supports reading non-hierarchical (flat) datasets, meaning that when loading a NetCDF into an xarray.Dataset
, only the root group is added. The other groups will have to be read into xarray
separately. We can list them using the netCDF4
library to get the group names, then use that to add them to new xarray
datasets.
ds_nc.groups.keys()
Now that we know the other group names, read the sensor_band_parameters
and location
groups into their own xarray
datasets.
= xr.open_dataset(fp,group='sensor_band_parameters')
wvl wvl
= xr.open_dataset(fp,group='location')
loc loc
We could merge all 3 datasets, but since sensor_band_parameters
and location
describe various aspects of the reflectance
variable we can simply add them as coordinates, along with a downtrack and crosstrack dimension to describe the reflectance data array. This will allow us to utilize some additional features of xarray
.
# Create coordinates and an index for the downtrack and crosstrack dimensions, then unpack the variables from the wvl and loc datasets and set them as coordinates for ds
= ds.assign_coords({'downtrack':(['downtrack'], ds.downtrack.data),'crosstrack':(['crosstrack'],ds.crosstrack.data), **wvl.variables, **loc.variables})
ds ds
Another step we can take is to swap the ‘bands’ dimension with wavelengths. Doing this will allow us to index based on the wavelength of the band, and remove ‘bands’ as a dimension. We can do this since bands is just a 3rd dimension that will is defined based on the ‘sensor_band_parameters’ group (i.e. ‘wavelengths’ for reflectance, or ‘mask_bands’ for mask data).
= ds.swap_dims({'bands':'wavelengths'})
ds ds
Now we have an xarray.Dataset
containing all of the information from EMIT netCDF file. Since these datasets are large, we can go ahead and delete objects we won’t be using to conserve memory.
del wvl
del loc
1.3 Visualizing Spectra - Non-Orthorectified
Pick a random downtrack and crosstrack location. Here we chose 660, 370 (downtrack,crosstrack). Next use the sel()
function from xarray
and the hvplot.line()
functions to first select the spatial position and then plot a line showing the reflectance at that location.
= ds['reflectance'].sel(downtrack=660,crosstrack=370)
example ='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600) example.hvplot.line(y
We can see some flat regions in the spectral curve around 1320-1440 nm and 1770-1970 nm. These are where water absoption features in these regions were removed. Typically this data is noisy due to the moisture present in the atmosphere; therefore, these spectral regions offer little information about targets and can be excluded from calculations.
We can set reflectance values where the good_wavelenghts
is 0 (these will have a reflectance of -0.1) to np.nan
do mask them out and improve visualization.
'reflectance'].data[:,:,ds['good_wavelengths'].data==0] = np.nan ds[
Plot the filtered reflectance values using the same downtrack and crosstrack position as above.
'reflectance'].sel(downtrack=660,crosstrack=370).hvplot.line(y='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600) ds[
Without these data we can better interpret the spectral curve and hvplot
will do a better job automatically scaling our axes.
We can also plot the data spatially. Since we changed our dimension and index to wavelengths
we can use the sel()
function to spectrally subset to the wavelength nearest to 850nm in the NIR, then plot the data spatially using hvplot.image()
to view the reflectance at 850nm of each pixel across the acquired region.
= ds.sel(wavelengths=850, method='nearest') refl850
='viridis', aspect = 'equal', frame_width=720).opts(title=f"{refl850.wavelengths.values:.3f} {refl850.wavelengths.units}") refl850.hvplot.image(cmap
1.4 Orthorectification
The ‘real’ orthorectifation process has already been done for EMIT data. Here we are using the crosstrack and downtrack indices contained in the GLT to place our spatially raw reflectance data a into geographic grid with the ortho_x
and ortho_y
dimensions. As previously mentioned a Geometry Lookup Table (GLT) is included in the location
group of the netCDF4 file. Applying the GLT will orthorectify the data and give us Latitude and Longitude positional information.
Before using the GLT to orthorectify the data, examine the location
group from the dataset by reading it into xarray
.
= xr.open_dataset(fp,group='location')
loc loc
We can see that each downtrack and crosstrack position has a latitude, longitude, and elevation, and the ortho_x and ortho_y data make up glt_x and glt_y arrays with a different shape. These arrays contain crosstrack and downtrack index values to quickly reproject the data. We will use these indexes to build an array of 2009x2353x285 (lat,lon,bands), filling it with the data from the reflectance dataset.
Go ahead and remove this dataset. We will use a function in the provided emit_tools
module to orthorectify the data and place it into an xarray.Dataset
.
del loc
del example
Import the emit_tools
module and call use the help function to see how it can be used.
Note: This function currently works with L1B Radiance and L2A Reflectance Data.
import sys
'../modules/')
sys.path.append(from emit_tools import emit_xarray
help(emit_xarray)
We can see that the emit_xarray
function will automatically apply the GLT to orthorectify the data unless ortho = False
. The function will also apply masks if desired during construction of the output xarray.Dataset
. EMIT L2A Masks files provides a quality mask and a band_mask indicating if values were interpolated. For more about masking, see the How_to_use_EMIT_Quality_data.ipynb.
Use the emit_xarray
function to read in and orthorectify the L2A reflectance data.
For a detailed walkthrough of the orthorectification process using the GLT see section 2 of the How_to_Orthorectify.ipynb in the how-tos folder.
= emit_xarray(fp, ortho=True)
ds_geo ds_geo
= ds.sel(wavelengths=850, method='nearest').hvplot.image(cmap='Viridis', aspect = 'equal',frame_height=600).opts(
non_ortho_fig =f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units}") title
When we orthorectify the scene, locations in the grid without data will be filled with the default fill value of -9999. To improve visualizations we can assign these locations to np.nan
to mask them (make them transparent).
== -9999] = np.nan ds_geo.reflectance.data[ds_geo.reflectance.data
=850, method='nearest').hvplot.image(cmap='Viridis', geo=True, tiles='ESRI', alpha=0.8, frame_height=600).opts(
ds_geo.sel(wavelengths=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units} (Orthorectified)") title
1.5 Plotting Data - Orthorectified
Now that the data has been orthorectified, plot the georeferenced dataset using the same single wavelength (850nm) as above alongside the uncorrected image. We an also plot the orthorectified data against an imagery tile using the geo=True
and tiles=
parameters instead of aspect='equal'
. Any tile source available in geoviews
should work here. This will change the axis names, but that can be fixed by adding them manually in the opts
, like below.
=850, method='nearest').hvplot.image(cmap='Viridis', geo=True, tiles='ESRI', alpha=0.8, frame_height=600).opts(
(ds_geo.sel(wavelengths=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units} (Orthorectified)") +\
title=850, method='nearest').hvplot.image(cmap='Viridis', aspect = 'equal',frame_height=600).opts(
ds.sel(wavelengths=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units}")) title
We can see that the orthorectification step placed the data on a geogrpahic grid that matches pretty well with ESRI tiles. Now that we have a better idea of what the target area looks like, we can also plot the spectra using the georeferenced data. First, filter out the water absorption bands like we did earlier. By limiting the third dimension of the array to good_wavelengths
.
'reflectance'].data[:,:,ds_geo['good_wavelengths'].data==0] = np.nan ds_geo[
Now, plot the spectra at the Lat/Lon coordinates provided below.
= ds_geo.sel(longitude=-61.833,latitude=-39.710,method='nearest')
point ='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600).opts(
point.hvplot.line(y= f'Latitude = {point.latitude.values.round(3)}, Longitude = {point.longitude.values.round(3)}') title
1.6 Writing an Orthorectified Output
At this point, the ds_geo
orthorectified EMIT data can also be written as a flattened netCDF4 output that can be read using the xarray.open_dataset
function, if desired. Before doing this, we can transpose the dimension order so the bands are the first dimension so that the data is readable by software like QGIS. This file will be larger than the original EMIT granule since it has been orthorectified. If we do write an output file, the format will be such that it can be read in using xarray
.
Transpose the dimensions order and write an output.
# Transpose dimensions
# ds_geo = ds_geo.transpose('wavelengths','latitude','longitude')
# ds_geo.to_netcdf('../../data/geo_ds_out.nc')
# Example for Opening
# ds = xr.open_dataset('../data/geo_ds_out.nc')
1.7 Interactive Spatial and Spectral Plots
Combining the Spatial and Spectral information into a single visualization can be a powerful tool for exploring and inspecting imaging spectroscopy data. Using the streams
module from Holoviews
we can link a spatial map to a plot of spectra.
We could plot a single band image as we previously have, but using a multiband image, like an RGB may help infer what targets we’re examining. Build an RGB image following the steps below.
Select bands to represent red (650 nm), green (560 nm), and blue (470 nm) by finding the nearest to a wavelength chosen to represent that color.
Note that if subsetting by bands like this example, it is more memory efficient to subset before orthorectifying. Instead of using
ortho=True
in theemit_xarray
function, select bands first, then apply the orthorectification using theortho_xr
function fromemit_tools.py
(requires a separate import).
= ds_geo.sel(wavelengths=[650, 560, 470], method='nearest')
rgb rgb
Next, write a function to scale the values using a gamma correction. Without applying this scaling the majority of the image would be very dark, with the reflectance data being skewed by the few pixels with very high reflectance. > Note: This has no impact on analysis or data, just visualizing the RGB map.
# Function to adjust gamma across all bands - adjust brightness
def gamma_adjust(rgb_ds, bright=0.2, white_background=False):
= rgb_ds.reflectance.data
array = math.log(bright)/math.log(np.nanmean(array)) # Create exponent for gamma scaling - can be adjusted by changing 0.2
gamma = np.power(array,gamma).clip(0,1) # Apply scaling and clip to 0-1 range
scaled if white_background == True:
= np.nan_to_num(scaled, nan = 1) # Assign NA's to 1 so they appear white in plots
scaled = scaled
rgb_ds.reflectance.data return rgb_ds
= gamma_adjust(rgb, white_background=True) rgb
Now that we have an RGB dataset, use it to build our spatial plot.
map = rgb.hvplot.rgb(x='longitude', y='latitude', bands='wavelengths', aspect = 'equal', frame_height=500)
To visualize the spectral and spatial data side-by-side, we use the Point Draw
tool from the holoviews
library.
Define a limit to the quantity of points and spectra we will plot, a list of colors to cycle through, and an initial point. We use the input from the PointerXY
to show the spectra where our mouse cursor is. Then use the input from the Tap
function to provide clicked x and y positions on the map
. These retrieve spectra from the dataset at those coordinates.
Click in the RGB image to add spectra to the plot. You can also click and hold the mouse button then drag previously place points. To remove a point click and hold the mouse button down, then press the backspace key.
# Set Point Limit
= 10
POINT_LIMIT
# Set up Color Cycling
= hv.Cycle('Category20')
color_cycle = [color_cycle[i] for i in range(5)]
colors
# Get center coordinates of image
= ds_geo.longitude.values[int(len(ds_geo.longitude) / 2)]
xmid = ds_geo.latitude.values[int(len(ds_geo.latitude) / 2)]
ymid
#
= ([xmid], [ymid], [0])
first_point = hv.Points(first_point, vdims='id')
points = hv.streams.PointDraw(
points_stream =points.columns(),
data=points,
source=True,
drag=POINT_LIMIT,
num_objects={'fill_color': color_cycle.values[1:POINT_LIMIT+1], 'line_color': 'gray'}
styles
)
= hv.streams.PointerXY(source=map, x=xmid, y=ymid)
posxy = hv.streams.Tap(source=map, x=xmid, y=ymid)
clickxy
# Function to build spectral plot of clicked location to show on hover stream plot
def click_spectra(data):
= []
coordinates if data is None or not any(len(d) for d in data.values()):
0][0], clicked_points[1][0])
coordinates.append(clicked_points[else:
= [c for c in zip(data['x'], data['y'])]
coordinates
= []
plots for i, coords in enumerate(coordinates):
= coords
x, y = ds_geo.sel(longitude=x, latitude=y, method="nearest")
data
plots.append(
data.hvplot.line(="reflectance",
y="wavelengths",
x=color_cycle,
color=f"{i}"
label
)
)"id"][i] = i
points_stream.data[return hv.Overlay(plots)
def hover_spectra(x,y):
return ds_geo.sel(longitude=x,latitude=y,method='nearest').hvplot.line(y='reflectance',x='wavelengths',
='black', frame_width=400)
color# Define the Dynamic Maps
= hv.DynamicMap(click_spectra, streams=[points_stream])
click_dmap = hv.DynamicMap(hover_spectra, streams=[posxy])
hover_dmap
# Plot the Map and Dynamic Map side by side
*click_dmap + map * points).cols(2).opts(
hv.Layout(hover_dmap=['point_draw'], size=10, tools=['hover'], color='white', line_color='gray'),
hv.opts.Points(active_tools=False, show_title=False, fontscale=1.5, frame_height=480)
hv.opts.Overlay(show_legend )
After selecting a number of points we can build a dictionary of points and spectra, then export the spectra to a .csv
file.
= points_stream.data
data = ds_geo.wavelengths.values
wavelengths
= [["id", "x", "y"] + [str(i) for i in wavelengths]]
rows
for p in zip(data['x'], data['y'], data['id']):
= p
x, y, i = ds_geo.sel(longitude=x, latitude=y, method="nearest").reflectance.values
spectra = [i, x, y] + list(spectra)
row rows.append(row)
with open('../../data/interactive_plot_data.csv', 'w', newline='') as f:
= csv.writer(f)
writer writer.writerows(rows)
Contact Info:
Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 11-27-2023
¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.