This notebook is from EMIT-Data-Resources

Source: Exploring EMIT L2A Reflectance

Imported on: 2025-01-22

Exploring L2A Reflectance

Summary

In this notebook we will open a Level 2A (L2A) Reflectance product file (netcdf) from Earth Surface Mineral Dust Source Investigation (EMIT). We will inspect the structure and plot the spectra of individual pixels and spatial coverage of a single band. Next, we will orthorectify the imagery using the included geometry lookup table (GLT). Finally, we will take advantage of the holoviews streams module to build an interactive plot.

Background

The EMIT instrument is an imaging spectrometer that measures light in visible and infrared wavelengths. These measurements display unique spectral signatures that correspond to the composition on the Earth’s surface. The EMIT mission focuses specifically on mapping the composition of minerals to better understand the effects of mineral dust throughout the Earth system and human populations now and in the future. More details about EMIT and its associated products can be found in the README.md and on the EMIT website.

The L2A Reflectance Product contains estimated surface reflectance. Surface reflectance is the fraction of incoming solar radiation reflected Earth’s surface. Materials reflect proportions of radiation differently based upon their chemical composition. This means that reflectance information can be used to determine the composition of a target. In this guide you will learn how to plot a band from the L2A reflectance spatially and look at the spectral curve associated with individual pixels.

References - Leith, Alex. 2023. Hyperspectral Notebooks. Jupyter Notebook. Auspatious. https://github.com/auspatious/hyperspectral-notebooks/tree/main

Requirements - Set up Python Environment - See setup_instructions.md in the /setup/ folder

Learning Objectives
- How to open an EMIT .nc file as an xarray.Dataset - Apply the Geometry Lookup Table (GLT) to orthorectify the image. - How to plot the spectra of pixels - How to plot specific bands as images - How to make an interactive plot to visualize spectra

Tutorial Outline

1.1 Setup
1.2 Opening The Data
1.3 Plotting Data - Non-Orthorectified
1.4 Orthorectification
1.5 Plotting Data - Orthorectified
1.6 Saving Orthorectified Data
1.7 Interactive Plots

Interactive Plot

1.1 Setup

Import the required Python libraries.

import earthaccess
import os
import warnings
import csv
from osgeo import gdal
import numpy as np
import math
import rasterio as rio
import xarray as xr
import holoviews as hv
import hvplot.xarray
import netCDF4 as nc

# This will ignore some warnings caused by holoviews
warnings.simplefilter('ignore') 

Login to your NASA Earthdata account and create a .netrc file using the login function from the earthaccess library. If you do not have an Earthdata Account, you can create one here.

earthaccess.login(persist=True)

For this notebook we will download the files necessary using earthaccess. You can also access the data in place or stream it, but this can slow due to the file sizes. Provide a URL for an EMIT L2A Reflectance granule.

url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc'

Get an HTTPS Session using your earthdata login, set a local path to save the file, and download the granule asset - This may take a while, the reflectance file is approximately 1.8 GB.

# Get requests https Session using Earthdata Login Info
fs = earthaccess.get_requests_https_session()
# Retrieve granule asset ID from URL (to maintain existing naming convention)
granule_asset_id = url.split('/')[-1]
# Define Local Filepath
fp = f'../../data/{granule_asset_id}'
# Download the Granule Asset if it doesn't exist
if not os.path.isfile(fp):
    with fs.get(url,stream=True) as src:
        with open(fp,'wb') as dst:
            for chunk in src.iter_content(chunk_size=64*1024*1024):
                dst.write(chunk)

1.2 Opening EMIT Data

EMIT L2A Reflectance Data are distributed in a non-orthocorrected spatially raw NetCDF4 (.nc) format consisting of the data and its associated metadata. Inside the L2A Reflectance .nc file there are 3 groups. Groups can be thought of as containers to organize the data.

  1. The root group that can be considered the main dataset contains the reflectance data described by the downtrack, crosstrack, and bands dimensions.
  2. The sensor_band_parameters group containing the wavelength center and the full-width half maximum (FWHM) of each band.
  3. The location group contains latitude and longitude values at the center of each pixel described by the crosstrack and downtrack dimensions, as well as a geometry lookup table (GLT) described by the ortho_x and ortho_y dimensions. The GLT is an orthorectified image (EPSG:4326) consisting of 2 layers containing downtrack and crosstrack indices. These index positions allow us to quickly project the raw data onto this geographic grid.

To access the .nc file we will use the netCDF4 and xarray libraries. The netCDF4 library will be used to explore thee data structure, then we will use xarray to work with the data. xarray is a python package for working with labelled multi-dimensional arrays. It provides a data model where data, dimensions, and attributes together in an easily interpretable way.

ds_nc = nc.Dataset(fp)
ds_nc
ds_nc['location']

From this output, we can see the reflectance variable, and the sensor_band_parameters and location groups. We can also see the dimensions, their sizes, and file metadata.

Now that we have a better understanding of the structure of the file, read the EMIT data as an xarray.Dataset and preview it.

ds = xr.open_dataset(fp)
ds

This xarray dataset only contains the reflectance variable and attributes metadata, not the data from the other groups in the file. This is because xarray only supports reading non-hierarchical (flat) datasets, meaning that when loading a NetCDF into an xarray.Dataset, only the root group is added. The other groups will have to be read into xarray separately. We can list them using the netCDF4 library to get the group names, then use that to add them to new xarray datasets.

ds_nc.groups.keys()

Now that we know the other group names, read the sensor_band_parameters and location groups into their own xarray datasets.

wvl = xr.open_dataset(fp,group='sensor_band_parameters')
wvl
loc = xr.open_dataset(fp,group='location')
loc

We could merge all 3 datasets, but since sensor_band_parameters and location describe various aspects of the reflectance variable we can simply add them as coordinates, along with a downtrack and crosstrack dimension to describe the reflectance data array. This will allow us to utilize some additional features of xarray.

# Create coordinates and an index for the downtrack and crosstrack dimensions, then unpack the variables from the wvl and loc datasets and set them as coordinates for ds
ds = ds.assign_coords({'downtrack':(['downtrack'], ds.downtrack.data),'crosstrack':(['crosstrack'],ds.crosstrack.data), **wvl.variables, **loc.variables})
ds

Another step we can take is to swap the ‘bands’ dimension with wavelengths. Doing this will allow us to index based on the wavelength of the band, and remove ‘bands’ as a dimension. We can do this since bands is just a 3rd dimension that will is defined based on the ‘sensor_band_parameters’ group (i.e. ‘wavelengths’ for reflectance, or ‘mask_bands’ for mask data).

ds = ds.swap_dims({'bands':'wavelengths'})
ds

Now we have an xarray.Dataset containing all of the information from EMIT netCDF file. Since these datasets are large, we can go ahead and delete objects we won’t be using to conserve memory.

del wvl
del loc

1.3 Visualizing Spectra - Non-Orthorectified

Pick a random downtrack and crosstrack location. Here we chose 660, 370 (downtrack,crosstrack). Next use the sel() function from xarray and the hvplot.line() functions to first select the spatial position and then plot a line showing the reflectance at that location.

example = ds['reflectance'].sel(downtrack=660,crosstrack=370)
example.hvplot.line(y='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600)

We can see some flat regions in the spectral curve around 1320-1440 nm and 1770-1970 nm. These are where water absoption features in these regions were removed. Typically this data is noisy due to the moisture present in the atmosphere; therefore, these spectral regions offer little information about targets and can be excluded from calculations.

We can set reflectance values where the good_wavelenghts is 0 (these will have a reflectance of -0.1) to np.nan do mask them out and improve visualization.

ds['reflectance'].data[:,:,ds['good_wavelengths'].data==0] = np.nan

Plot the filtered reflectance values using the same downtrack and crosstrack position as above.

ds['reflectance'].sel(downtrack=660,crosstrack=370).hvplot.line(y='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600)

Without these data we can better interpret the spectral curve and hvplot will do a better job automatically scaling our axes.

We can also plot the data spatially. Since we changed our dimension and index to wavelengths we can use the sel() function to spectrally subset to the wavelength nearest to 850nm in the NIR, then plot the data spatially using hvplot.image() to view the reflectance at 850nm of each pixel across the acquired region.

refl850 = ds.sel(wavelengths=850, method='nearest')
refl850.hvplot.image(cmap='viridis', aspect = 'equal', frame_width=720).opts(title=f"{refl850.wavelengths.values:.3f} {refl850.wavelengths.units}")

1.4 Orthorectification

The ‘real’ orthorectifation process has already been done for EMIT data. Here we are using the crosstrack and downtrack indices contained in the GLT to place our spatially raw reflectance data a into geographic grid with the ortho_x and ortho_y dimensions. As previously mentioned a Geometry Lookup Table (GLT) is included in the location group of the netCDF4 file. Applying the GLT will orthorectify the data and give us Latitude and Longitude positional information.

Before using the GLT to orthorectify the data, examine the location group from the dataset by reading it into xarray.

loc = xr.open_dataset(fp,group='location')
loc

We can see that each downtrack and crosstrack position has a latitude, longitude, and elevation, and the ortho_x and ortho_y data make up glt_x and glt_y arrays with a different shape. These arrays contain crosstrack and downtrack index values to quickly reproject the data. We will use these indexes to build an array of 2009x2353x285 (lat,lon,bands), filling it with the data from the reflectance dataset.

Go ahead and remove this dataset. We will use a function in the provided emit_tools module to orthorectify the data and place it into an xarray.Dataset.

del loc
del example

Import the emit_tools module and call use the help function to see how it can be used.

Note: This function currently works with L1B Radiance and L2A Reflectance Data.

import sys
sys.path.append('../modules/')
from emit_tools import emit_xarray
help(emit_xarray)

We can see that the emit_xarray function will automatically apply the GLT to orthorectify the data unless ortho = False. The function will also apply masks if desired during construction of the output xarray.Dataset. EMIT L2A Masks files provides a quality mask and a band_mask indicating if values were interpolated. For more about masking, see the How_to_use_EMIT_Quality_data.ipynb.

Use the emit_xarray function to read in and orthorectify the L2A reflectance data.

For a detailed walkthrough of the orthorectification process using the GLT see section 2 of the How_to_Orthorectify.ipynb in the how-tos folder.

ds_geo = emit_xarray(fp, ortho=True)
ds_geo
non_ortho_fig = ds.sel(wavelengths=850, method='nearest').hvplot.image(cmap='Viridis', aspect = 'equal',frame_height=600).opts(
         title=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units}")

When we orthorectify the scene, locations in the grid without data will be filled with the default fill value of -9999. To improve visualizations we can assign these locations to np.nan to mask them (make them transparent).

ds_geo.reflectance.data[ds_geo.reflectance.data == -9999] = np.nan
ds_geo.sel(wavelengths=850, method='nearest').hvplot.image(cmap='Viridis', geo=True, tiles='ESRI', alpha=0.8, frame_height=600).opts(
    title=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units} (Orthorectified)")

1.5 Plotting Data - Orthorectified

Now that the data has been orthorectified, plot the georeferenced dataset using the same single wavelength (850nm) as above alongside the uncorrected image. We an also plot the orthorectified data against an imagery tile using the geo=True and tiles= parameters instead of aspect='equal'. Any tile source available in geoviews should work here. This will change the axis names, but that can be fixed by adding them manually in the opts, like below.

(ds_geo.sel(wavelengths=850, method='nearest').hvplot.image(cmap='Viridis', geo=True, tiles='ESRI', alpha=0.8, frame_height=600).opts(
    title=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units} (Orthorectified)") +\
     ds.sel(wavelengths=850, method='nearest').hvplot.image(cmap='Viridis', aspect = 'equal',frame_height=600).opts(
         title=f"Reflectance at {refl850.wavelengths.values:.3f} {refl850.wavelengths.units}"))

We can see that the orthorectification step placed the data on a geogrpahic grid that matches pretty well with ESRI tiles. Now that we have a better idea of what the target area looks like, we can also plot the spectra using the georeferenced data. First, filter out the water absorption bands like we did earlier. By limiting the third dimension of the array to good_wavelengths.

ds_geo['reflectance'].data[:,:,ds_geo['good_wavelengths'].data==0] = np.nan

Now, plot the spectra at the Lat/Lon coordinates provided below.

point = ds_geo.sel(longitude=-61.833,latitude=-39.710,method='nearest')
point.hvplot.line(y='reflectance',x='wavelengths', color='black', frame_height=400, frame_width=600).opts(
    title = f'Latitude = {point.latitude.values.round(3)}, Longitude = {point.longitude.values.round(3)}')

1.6 Writing an Orthorectified Output

At this point, the ds_geo orthorectified EMIT data can also be written as a flattened netCDF4 output that can be read using the xarray.open_dataset function, if desired. Before doing this, we can transpose the dimension order so the bands are the first dimension so that the data is readable by software like QGIS. This file will be larger than the original EMIT granule since it has been orthorectified. If we do write an output file, the format will be such that it can be read in using xarray.

Transpose the dimensions order and write an output.

# Transpose dimensions
# ds_geo = ds_geo.transpose('wavelengths','latitude','longitude')
# ds_geo.to_netcdf('../../data/geo_ds_out.nc')

# Example for Opening 
# ds = xr.open_dataset('../data/geo_ds_out.nc')

1.7 Interactive Spatial and Spectral Plots

Combining the Spatial and Spectral information into a single visualization can be a powerful tool for exploring and inspecting imaging spectroscopy data. Using the streams module from Holoviews we can link a spatial map to a plot of spectra.

We could plot a single band image as we previously have, but using a multiband image, like an RGB may help infer what targets we’re examining. Build an RGB image following the steps below.

Select bands to represent red (650 nm), green (560 nm), and blue (470 nm) by finding the nearest to a wavelength chosen to represent that color.

Note that if subsetting by bands like this example, it is more memory efficient to subset before orthorectifying. Instead of using ortho=True in the emit_xarray function, select bands first, then apply the orthorectification using the ortho_xr function from emit_tools.py (requires a separate import).

rgb = ds_geo.sel(wavelengths=[650, 560, 470], method='nearest')
rgb

Next, write a function to scale the values using a gamma correction. Without applying this scaling the majority of the image would be very dark, with the reflectance data being skewed by the few pixels with very high reflectance. > Note: This has no impact on analysis or data, just visualizing the RGB map.

# Function to adjust gamma across all bands - adjust brightness
def gamma_adjust(rgb_ds, bright=0.2, white_background=False):
    array = rgb_ds.reflectance.data
    gamma = math.log(bright)/math.log(np.nanmean(array)) # Create exponent for gamma scaling - can be adjusted by changing 0.2 
    scaled = np.power(array,gamma).clip(0,1) # Apply scaling and clip to 0-1 range
    if white_background == True:
        scaled = np.nan_to_num(scaled, nan = 1) # Assign NA's to 1 so they appear white in plots
    rgb_ds.reflectance.data = scaled
    return rgb_ds
rgb = gamma_adjust(rgb, white_background=True)

Now that we have an RGB dataset, use it to build our spatial plot.

map = rgb.hvplot.rgb(x='longitude', y='latitude', bands='wavelengths', aspect = 'equal', frame_height=500)

To visualize the spectral and spatial data side-by-side, we use the Point Draw tool from the holoviews library.

Define a limit to the quantity of points and spectra we will plot, a list of colors to cycle through, and an initial point. We use the input from the PointerXYto show the spectra where our mouse cursor is. Then use the input from the Tap function to provide clicked x and y positions on the map. These retrieve spectra from the dataset at those coordinates.

Click in the RGB image to add spectra to the plot. You can also click and hold the mouse button then drag previously place points. To remove a point click and hold the mouse button down, then press the backspace key.

# Set Point Limit
POINT_LIMIT = 10

# Set up  Color Cycling
color_cycle = hv.Cycle('Category20')
colors = [color_cycle[i] for i in range(5)]

# Get center coordinates of image
xmid = ds_geo.longitude.values[int(len(ds_geo.longitude) / 2)]
ymid = ds_geo.latitude.values[int(len(ds_geo.latitude) / 2)]

#
first_point = ([xmid], [ymid], [0])
points = hv.Points(first_point, vdims='id')
points_stream = hv.streams.PointDraw(
    data=points.columns(),
    source=points,
    drag=True,
    num_objects=POINT_LIMIT,
    styles={'fill_color': color_cycle.values[1:POINT_LIMIT+1], 'line_color': 'gray'}
)

posxy = hv.streams.PointerXY(source=map, x=xmid, y=ymid)
clickxy = hv.streams.Tap(source=map, x=xmid, y=ymid)

# Function to build spectral plot of clicked location to show on hover stream plot
def click_spectra(data):
    coordinates = []
    if data is None or not any(len(d) for d in data.values()):
        coordinates.append(clicked_points[0][0], clicked_points[1][0])
    else:
        coordinates = [c for c in zip(data['x'], data['y'])]
    
    plots = []
    for i, coords in enumerate(coordinates):
        x, y = coords
        data = ds_geo.sel(longitude=x, latitude=y, method="nearest")
        plots.append(
            data.hvplot.line(
                y="reflectance",
                x="wavelengths",
                color=color_cycle,
                label=f"{i}"
            )
        )
        points_stream.data["id"][i] = i
    return hv.Overlay(plots)

def hover_spectra(x,y):
    return ds_geo.sel(longitude=x,latitude=y,method='nearest').hvplot.line(y='reflectance',x='wavelengths',
                                                                           color='black', frame_width=400)
# Define the Dynamic Maps
click_dmap = hv.DynamicMap(click_spectra, streams=[points_stream])
hover_dmap = hv.DynamicMap(hover_spectra, streams=[posxy])

# Plot the Map and Dynamic Map side by side
hv.Layout(hover_dmap*click_dmap + map * points).cols(2).opts(
    hv.opts.Points(active_tools=['point_draw'], size=10, tools=['hover'], color='white', line_color='gray'),
    hv.opts.Overlay(show_legend=False, show_title=False, fontscale=1.5, frame_height=480)
)

After selecting a number of points we can build a dictionary of points and spectra, then export the spectra to a .csv file.

data = points_stream.data
wavelengths = ds_geo.wavelengths.values

rows = [["id", "x", "y"] + [str(i) for i in wavelengths]]
 
for p in zip(data['x'], data['y'], data['id']):
    x, y, i = p
    spectra = ds_geo.sel(longitude=x, latitude=y, method="nearest").reflectance.values
    row = [i, x, y] + list(spectra)
    rows.append(row)
with open('../../data/interactive_plot_data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)

Contact Info:

Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 11-27-2023

¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.