# Import Packages
import sys
import os
import earthaccess
import numpy as np
import pandas as pd
import xarray as xr
import hvplot.pandas
import hvplot.xarray
import holoviews as hv
sys.path.append('../modules/')
from emit_tools import emit_xarray
# How to: Extracting EMIT Spectra at Specified Coordinates

**Summary**

In this notebook we will open a netCDF4 file from the Earth Surface Mineral Dust Source Investigation (EMIT) as an `xarray.Dataset`. We will then extract the spectra at point coordinates from a `.csv` as a dataframe, then save and plot the data.

**Requirements:**
+ A NASA Earthdata Login account is required to download EMIT data.
+ Select the `emit_tutorials` environment as the kernel for this notebook.
+ For instructions on setting up the environment, follow the `setup_instructions.md` included in the `/setup/` folder of the repository.

**Learning Objectives**
+ How to open an EMIT file as an `xarray.Dataset`
+ How to extract spectra at coordinates listed in a `.csv` file
The required Python libraries were imported at the top of this notebook.
Log in to your NASA Earthdata account and create a `.netrc` file using the `login` function from the `earthaccess` library. If you do not have an Earthdata account, you can create one at https://urs.earthdata.nasa.gov/.
earthaccess.login(persist=True)
For this notebook we will download the necessary files using `earthaccess`. You can also access the data in place or stream it, but this can be slow due to the file sizes. Provide a URL for an EMIT L2A Reflectance granule.
url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc'
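As mentioned above, the granule can also be opened in place rather than downloaded. A minimal sketch, assuming your `earthaccess` version accepts a list of raw URLs and that `xarray` can read the remote file with the `h5netcdf` engine; note this bypasses `emit_tools`, so no orthorectification is applied:

# Optional streaming alternative: open the granule over HTTPS without downloading
fobj = earthaccess.open([url])[0]                      # file-like object backed by HTTPS
ds_remote = xr.open_dataset(fobj, engine='h5netcdf')   # reads the root group only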
Get an HTTPS session using your Earthdata login, set a local path to save the file, and download the granule asset. This may take a while; the reflectance file is approximately 1.8 GB.
# Get requests https Session using Earthdata Login Info
fs = earthaccess.get_requests_https_session()
# Retrieve granule asset ID from URL (to maintain existing naming convention)
granule_asset_id = url.split('/')[-1]
# Define Local Filepath
fp = f'../../data/{granule_asset_id}'
# Download the Granule Asset if it doesn't exist
if not os.path.isfile(fp):
    with fs.get(url, stream=True) as src:
        with open(fp, 'wb') as dst:
            for chunk in src.iter_content(chunk_size=64*1024*1024):
                dst.write(chunk)
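A quick way to confirm the download finished is to check the size of the local file; it should be close to the ~1.8 GB noted above:

# Sanity check: confirm the file is present locally and report its size
print(f"{granule_asset_id}: {os.path.getsize(fp) / 1e9:.2f} GB")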
Open the downloaded file defined as `fp`. To do this, we will use the `emit_tools` module, which contains a few helpful functions that can be used with EMIT data. Use the `ortho=True` option to orthorectify the dataset.
ds = emit_xarray(fp, ortho=True)
ds
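Before extracting spectra, it can help to confirm the spectral axis looks as expected. A small sketch, assuming the `wavelengths` coordinate created by `emit_tools` (the same coordinate the band selection below relies on):

# Inspect the spectral axis: number of bands and wavelength range in nanometers
print(f"{ds.wavelengths.size} bands from "
      f"{ds.wavelengths.values.min():.1f} to {ds.wavelengths.values.max():.1f} nm")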
Now open the `.csv` included in the `/data/` directory as a `pandas.DataFrame`.

Note: The category values here are arbitrary and included as an example of an additional column users may want.
points = pd.read_csv('../../data/sample_coords.csv')
points
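If you are working outside the repository and do not have `sample_coords.csv`, a dataframe with the same column layout can be built by hand instead of the `read_csv` call above. This is a sketch only: the `ID`, coordinate, and `Category` values below are placeholders and must be replaced with points that fall inside the granule's footprint:

# Hypothetical stand-in for sample_coords.csv -- replace placeholder values
# with real coordinates inside the scene before using it
points = pd.DataFrame({
    'ID': [0, 1, 2],
    'Latitude': [0.0, 0.0, 0.0],    # placeholder latitudes
    'Longitude': [0.0, 0.0, 0.0],   # placeholder longitudes
    'Category': ['a', 'b', 'c'],    # arbitrary labels, as noted above
})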
Make a plot to visualize the points we're going to select on the dataset. Here we use the reflectance values at 850 nm as our basemap. To mask fill values, assign them to `np.nan`.
# Assign fill-values to NaN
ds.reflectance.data[ds.reflectance.data == -9999] = np.nan
# Select a single band and plot points on top
ds.sel(wavelengths=850, method='nearest').hvplot.image(cmap='greys', frame_height=600, frame_width=600, geo=True, crs='EPSG:4326').opts(title='Example Points')*\
points.hvplot.points(x='Longitude', y='Latitude', by='ID', color=hv.Cycle('Dark2'), geo=True, crs='EPSG:4326')
Set the `points` dataframe index to `ID` to utilize our existing point IDs as an index.
points = points.set_index(['ID'])
Convert the dataframe to an `xarray.Dataset`.
xp = points.to_xarray()
xp
Select the data from our EMIT dataset using the Latitude and Longitude coordinates from our point dataset, then convert the output to a `pandas.DataFrame`. Because `xp.Latitude` and `xp.Longitude` share the `ID` dimension, `sel` performs pointwise selection, returning one spectrum per point rather than every latitude/longitude combination.
extracted = ds.sel(latitude=xp.Latitude, longitude=xp.Longitude, method='nearest').to_dataframe()
extracted
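Because `method='nearest'` snaps each point to the closest pixel center, you may want to check how far the selected pixels are from the requested coordinates. A rough sketch in degrees, assuming the selected latitude/longitude values are carried through to the output as coordinates on the `ID` dimension:

# Offset (in degrees) between each requested point and the selected pixel center
nearest = ds.sel(latitude=xp.Latitude, longitude=xp.Longitude, method='nearest')
print(abs(nearest.latitude.values - xp.Latitude.values))
print(abs(nearest.longitude.values - xp.Longitude.values))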
The output is a longform dataframe using the `'ID'` field as an index. It is missing the `'Category'` column from our original dataframe. Use the dataframe's `join` method to add the `'Category'` column to our data, matching on `'ID'`.
df = extracted.join(points['Category'], on=['ID'])
df
Now we have a dataframe containing our initial data in addition to the extracted point data. This is a good place to save an output as a `.csv`. Go ahead and do that below.
df.to_csv('../../data/example_out.csv')
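The saved file keeps the longform layout. If a wide table is more convenient (one reflectance column per point), a pivot can be made; a sketch assuming the `ID`/`wavelengths` index structure shown above:

# Wide-format view: rows are wavelengths, one reflectance column per point ID
wide = df.reset_index().pivot(index='wavelengths', columns='ID', values='reflectance')
wide.head()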
We can use our dataframe to plot the reflectance data we extracted, but first, mask the reflectance values of `-0.01`, which represent deep water vapor absorption regions. To do this, assign values where the reflectance equals `-0.01` to `np.nan`.
# Data values are slightly different from -0.01 because of floating point precision, which requires a tolerance to be set
df['reflectance'] = df['reflectance'].where(abs(df['reflectance'] - (-0.01)) > 1e-8, np.nan)
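An equivalent way to express the same tolerance test is `np.isclose`, whose default absolute tolerance matches the `1e-8` used above:

# Same masking written with np.isclose (equivalent to the tolerance test above)
df['reflectance'] = df['reflectance'].mask(np.isclose(df['reflectance'], -0.01), np.nan)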
Plot the data using `hvplot`. We can use `by=` to separate the reflectances by their `ID`.
df.hvplot(x='wavelengths', y='reflectance', by=['ID'], color=hv.Cycle('Dark2'), frame_height=400, frame_width=600).opts(title='Example Points - Reflectance', xlabel='Wavelengths (nm)', ylabel='Reflectance')
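To keep the interactive figure, `holoviews` can write the plot object to a standalone HTML file. A minimal sketch; the variable name and output path are illustrative:

# Save the interactive spectra plot to HTML (hypothetical output path)
spectra_plot = df.hvplot(x='wavelengths', y='reflectance', by=['ID'], color=hv.Cycle('Dark2'))
hv.save(spectra_plot, '../../data/example_points_reflectance.html')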
Contact Info:
Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 03-13-2024
¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.