This notebook is from EMIT-Data-Resources

Source: How to Extract Area

Imported on: 2025-01-22

How to: Extracting EMIT Spectra using a Shapefile/GeoJSON

Summary

In this notebook we will open a netCDF4 file from the Earth Surface Minteral Dust Source Investigation (EMIT) as an xarray.Dataset. We will then extract extract or clip to an area using a .geojson file (will also work with shapefile). The workflows outlined here will work with reflectance L2A or radiance L1B data.

Requirements: + A NASA Earthdata Login account is required to download EMIT data
+ Selected the emit_tutorials environment as the kernel for this notebook. + For instructions on setting up the environment, follow the the setup_instructions.md included in the /setup/ folder of the repository.

Learning Objectives
- How to open and EMIT Dataset as an xarray.Dataset - How to extract values or clip an EMIT dataset to a region of interest - How to write a new netCDF4 output using the clipped data

Import the required Python libraries.

# Import Packages
import os
import earthaccess
import numpy as np
import xarray as xr
from osgeo import gdal
import rasterio as rio
import rioxarray as rxr
import hvplot.xarray
import hvplot.pandas
import holoviews as hv
import geopandas as gp
import sys
sys.path.append('../modules/')
from emit_tools import emit_xarray

Login to your NASA Earthdata account and create a .netrc file using the login function from the earthaccess library. If you do not have an Earthdata Account, you can create one here.

earthaccess.login(persist=True)

For this notebook we will download the files necessary using earthaccess. You can also access the data in place or stream it, but this can slow due to the file sizes. Provide a URL for an EMIT L2A Reflectance granule.

url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc'

Get an HTTPS Session using your earthdata login, set a local path to save the file, and download the granule asset - This may take a while, the reflectance file is approximately 1.8 GB.

# Get requests https Session using Earthdata Login Info
fs = earthaccess.get_requests_https_session()
# Retrieve granule asset ID from URL (to maintain existing naming convention)
granule_asset_id = url.split('/')[-1]
# Define Local Filepath
fp = f'../../data/{granule_asset_id}'
# Download the Granule Asset if it doesn't exist
if not os.path.isfile(fp):
    with fs.get(url,stream=True) as src:
        with open(fp,'wb') as dst:
            for chunk in src.iter_content(chunk_size=64*1024*1024):
                dst.write(chunk)

Open the file downloaded and defined as fp. To do this, we will use the emit_xarray function from the emit_tools module. This module contains a few helpful functions that can be used with EMIT data.

ds = emit_xarray(fp, ortho=True)
ds

Using the read_file() function from geopandas, read in the .geojson file containing the polygon you wish to extract.

shape = gp.read_file('../../data/isla_gaviota.geojson')
shape

First, mask out fill-values by setting them to np.nan, then plot the polygon we’ve loaded overlayed on a plot of the dataset.

# Set fill values to NaN
ds.reflectance.data[ds.reflectance.data == -9999] = np.nan
# Select band and plot with polygon
ds_850 = ds.sel(wavelengths=850,method='nearest')
ds_850.hvplot.image(cmap='viridis', frame_height=600, frame_width=600, geo=True, crs='EPSG:4326').opts(title=f"Reflectance at {ds_850.wavelengths.data:.3f} ({ds_850.wavelengths.units})")*shape.hvplot(color='#d95f02', alpha=0.5, geo=True, crs='EPSG:4326')

Use the clip function from rasterio to clip the dataset to polygons from the geopandas.geodataframe. Setting all_touched to True will include pixels that intersected with the edges of the polygon.

clipped = ds.rio.clip(shape.geometry.values,shape.crs, all_touched=True)
clipped

To view the clipped image, select a band from the clipped dataset and plot it spatially.

clipped_850 = clipped.sel(wavelengths=850,method='nearest')
clipped_850.hvplot.image(cmap='viridis', frame_height=600, geo=True, tiles='ESRI').opts(
    title=f'Reflectance at {clipped_850.wavelengths.data:.3f} ({clipped_850.wavelengths.units})',
    xlabel='Longitude', ylabel='Latitude')

Now we can save the clipped xarray.Dataset as a netCDF4 output that can be reopened using the xarray.open_dataset function.

clipped.to_netcdf('../../data/clipped_data.nc')
# Example for Opening 
#ds = xr.open_dataset('../../data/clipped_data.nc')

Contact Info:

Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 03-13-2024

¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.