Kamodo Analysis Suite

Data Functionalization¶

This notebook shows how to use the Functionalize_Dataset function to functionalize a given dataset, either with the standard interpolation method or with a custom interpolator. The help output in the first cell describes the function's inputs and output.

How to functionalize data¶

The example below shows the creation of the inputs, including example datasets with 7 dimensions. Any number of dimensions can be functionalized.

In [1]:
from kamodo_ccmc.tools.functionalize import Functionalize_Dataset
help(Functionalize_Dataset)
Help on function Functionalize_Dataset in module kamodo_ccmc.tools.functionalize:

Functionalize_Dataset(coord_dict, data_dict, kamodo_object=None, coord_str='', func=None, func_default='data')
    Determine and call the correct functionalize routine.
    Inputs:
        coord_dict: a dictionary containing the coordinate information.
            {'name_of_coord1': {'units': 'coord1_units', 'data': coord1_data},
             'name_of_coord2': {'units': 'coord2_units', 'data': coord2_data},
             etc...}
            coordX_data should be a 1D array. All others should be strings.
        data_dict: a dictionary containing the data information.
            {'variable_name1': {'units': 'data1_units', 'data': data1_array},
             'variable_name2': {'units': 'data2_units', 'data': data2_array},
             etc...}
            dataX_array should have the same shape as
                (coord1, coord2, coord3, ..., coordN)
        Note:The datasets given in the data_dict dictionary should all have the
            same dimensions. Datasets with different dimensions can be
            functionalized by simply calling the function again with the other
            dataset and the associated coordinate arrays. The datasets must
            also EACH depend upon ALL of the coordinate arrays given.
        coord_str: a string indicating the coordinate system of the data
            (e.g. "SMcar" or "GEOsph").
        kamodo_object: the previously created kamodo object. If one is not
            given, then one will be created.
        func: the function to be used for interpolation through the given
            datasets. The function must accept values for interpolation in an
            identical call structure as SciPy's RegularGridInterpolator or
            interp1D. See SciPy's documentation for more information.
        func_default: a string indicating whether a custom interpolation
            method is desired. The default is 'data', indicating that the
            standard interpolation method will be used. Set this to 'custom' to
            indicate that func is a custom interpolator.
    
    Output: A kamodo object with the functionalized dataset.
    
    This is similar to RU.Functionalize_Dataset, except only the gridded
        interpolator is registered.
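
The shape requirement described in the help text can be checked before calling Functionalize_Dataset. The sketch below is a minimal pre-flight check that assumes NumPy arrays; `check_shapes` is a hypothetical helper written for this illustration and is not part of kamodo_ccmc.

```python
import numpy as np


def check_shapes(coord_dict, data_dict):
    """Return the names of variables whose shape does not match the grid.

    Hypothetical helper, not part of kamodo_ccmc. The expected shape is
    (len(coord1), len(coord2), ..., len(coordN)) in coord_dict order.
    """
    expected = tuple(len(c['data']) for c in coord_dict.values())
    return [name for name, var in data_dict.items()
            if np.shape(var['data']) != expected]


coord_dict = {'time': {'units': 'hr', 'data': np.linspace(0., 24., 25)},
              'lon': {'units': 'deg', 'data': np.linspace(-180., 180., 12)}}
data_dict = {'ok_2D': {'units': 'S', 'data': np.zeros((25, 12))},
             'bad_2D': {'units': 'S', 'data': np.zeros((12, 25))}}  # axes swapped

mismatched = check_shapes(coord_dict, data_dict)
print(mismatched)
```

Any variable listed in `mismatched` would need to be transposed or functionalized in a separate call with its own coordinate arrays.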

In [2]:
# Example of functionalizing a 7D array
import numpy as np
rng1 = np.random.RandomState(1)  # Seed the random generators differently
rng2 = np.random.RandomState(2)  # or the arrays created below will be identical.
coord_dict = {'time': {'units': 'hr', 'data': np.linspace(0., 24., 25)},
              'lon': {'units': 'deg', 'data': np.linspace(-180., 180., 12)},
              'lat': {'units': 'deg', 'data': np.linspace(-90., 90., 5)},
              'radius': {'units': 'R_E', 'data': np.linspace(0., 50., 10)},
              'nonsense': {'units': 'm/m', 'data': np.linspace(1., 15., 15)},
              'nope': {'units': 'm', 'data': np.linspace(1., 150., 25)},
              'nada': {'units': 'hPa', 'data': np.linspace(0.00005, 15000., 20)}}
var_dict = {'Test_7D': {'units': 'S', 'data': rng1.rand(25, 12, 5, 10, 15, 25, 20)},
            'Good_7D': {'units': 'mK', 'data': rng2.rand(25, 12, 5, 10, 15, 25, 20)}}
kamodo_object = Functionalize_Dataset(coord_dict, var_dict)
kamodo_object
Out[2]:
\begin{equation}\operatorname{Test_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[S] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}
\begin{equation}\operatorname{Good_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[mK] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}

Generating a generic 1D Plot¶

Plot a 1D slice of all the variables by choosing a slice value in all but one dimension.

kamodo_object.plot('Test_7D', 'Good_7D', plot_partial={
    'Test_7D': {'time': 12., 'lon': 0.5, 'lat': -20., 'radius': 15., 'nonsense': 11.5, 'nope': 5.},
    'Good_7D': {'time': 12., 'lon': 0.5, 'lat': -20., 'radius': 15., 'nonsense': 11.5, 'nope': 5.}})

Screenshot
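
The plot_partial dictionary fixes every coordinate except the ones left free for plotting. The sketch below builds such dictionaries in plain Python; `make_plot_partial` is a hypothetical helper introduced only for this illustration, and the slice values are the ones used in this section.

```python
def make_plot_partial(variables, coord_names, slice_values, free):
    """Build a plot_partial dictionary that fixes all coordinates not in `free`.

    Hypothetical helper, not part of Kamodo.
    """
    fixed = {name: slice_values[name]
             for name in coord_names if name not in free}
    # Give each variable its own copy of the fixed slice values.
    return {var: dict(fixed) for var in variables}


coord_names = ['time', 'lon', 'lat', 'radius', 'nonsense', 'nope', 'nada']
slice_values = {'time': 12., 'lon': 0.5, 'lat': -20., 'radius': 15.,
                'nonsense': 11.5, 'nope': 5., 'nada': 12.}

# One free coordinate gives a 1D slice; two free coordinates give a 2D slice.
partial_1d = make_plot_partial(['Test_7D', 'Good_7D'], coord_names,
                               slice_values, free=['nada'])
partial_2d = make_plot_partial(['Test_7D'], coord_names,
                               slice_values, free=['nope', 'nada'])
print(partial_1d['Test_7D'])
print(partial_2d['Test_7D'])
```

The resulting dictionaries have the same structure as the plot_partial arguments written out by hand in the plot calls of this section.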

Generating a generic 2D Plot¶

Plot a 2D slice of one variable by choosing a slice value in all but two dimensions.

kamodo_object.plot('Test_7D', plot_partial={
    'Test_7D': {'time': 12., 'lon': 0.5, 'lat': -20., 'radius': 15., 'nonsense': 11.5}})

Screenshot

Adding new functionalized datasets to a kamodo object¶

In [3]:
# You can add datasets of other dimensions to the same kamodo_object.
coord_dict = {'time': {'units': 'hr', 'data': np.linspace(0., 24., 25)}}
var_dict = {'Test_1D': {'units': 'S', 'data': rng1.rand(25)},
            'Good_1D': {'units': 'mK', 'data': rng2.rand(25)}}
kamodo_object = Functionalize_Dataset(coord_dict, var_dict, kamodo_object)
kamodo_object
Out[3]:
\begin{equation}\operatorname{Test_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[S] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}
\begin{equation}\operatorname{Good_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[mK] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}
\begin{equation}\operatorname{Test_{1D}}(time[hr])[S] = \lambda{\left(time \right)}\end{equation}
\begin{equation}\operatorname{Good_{1D}}(time[hr])[mK] = \lambda{\left(time \right)}\end{equation}

You can plot all of the functions on the same plot as long as the independent variable is the same (time in this example).

kamodo_object.plot('Test_1D', 'Good_1D', 'Test_7D', 'Good_7D', plot_partial={
    'Test_7D': {'lon': 0.5, 'lat': -20., 'radius': 15., 'nonsense': 11.5, 'nope': 5., 'nada': 12.},
    'Good_7D': {'lon': 0.5, 'lat': -20., 'radius': 15., 'nonsense': 11.5, 'nope': 5., 'nada': 12.}})

Screenshot

In [10]:
# You can even use a custom interpolator for a new dataset added to the same kamodo_object.
# The interpolator must be defined separately for each dataset.
coord_dict = {'time': {'units': 'hr', 'data': np.linspace(0., 24., 25)},
              'lon': {'units': 'deg', 'data': np.linspace(-180., 180., 12)},
              'lat': {'units': 'deg', 'data': np.linspace(-90., 90., 5)}}
var_dict = {'TestCustomA_3D': {'units': 'S', 'data': rng1.rand(25, 12, 5)},
            'TestCustomB_3D': {'units': 'm/s', 'data': rng2.rand(25, 12, 5)*-2.}}

# Define a custom interpolator (simple example)
# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.RegularGridInterpolator.html
from scipy.interpolate import RegularGridInterpolator as RGI
coord_list = [value['data'] for key, value in coord_dict.items()]


def make_interp(rgi):
    # Wrap the interpolator in a function and return the function. Binding rgi
    # here gives each variable its own interpolator rather than the last one
    # created in the loop.
    def interp(xvec):
        return rgi(xvec)
    return interp


for key in var_dict.keys():
    rgi = RGI(coord_list, var_dict[key]['data'], bounds_error=False,
              fill_value=-10., method='nearest')
    tmp_dict = {key: var_dict[key]}  # construct a separate dictionary for the current variable
    kamodo_object = Functionalize_Dataset(coord_dict, tmp_dict, kamodo_object,
                                          func=make_interp(rgi), func_default='custom')
kamodo_object
Out[10]:
\begin{equation}\operatorname{Test_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[S] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}
\begin{equation}\operatorname{Good_{7D}}(time[hr],lon[deg],lat[deg],radius[R_{E}],nonsense[1],nope[m],nada[hPa])[mK] = \lambda{\left(time,lon,lat,radius,nonsense,nope,nada \right)}\end{equation}
\begin{equation}\operatorname{Test_{1D}}(time[hr])[S] = \lambda{\left(time \right)}\end{equation}
\begin{equation}\operatorname{Good_{1D}}(time[hr])[mK] = \lambda{\left(time \right)}\end{equation}
\begin{equation}\operatorname{TestCustomA_{3D}}(time[hr],lon[deg],lat[deg])[S] = \lambda{\left(time,lon,lat \right)}\end{equation}
\begin{equation}\operatorname{TestCustomB_{3D}}(time[hr],lon[deg],lat[deg])[m / s] = \lambda{\left(time,lon,lat \right)}\end{equation}
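
Wrapper functions created inside a loop, as in the cell above, should each hold their own interpolator, because Python looks up a name used inside a function when the function is called, not when it is defined. The pure-Python sketch below illustrates this with stand-in functions; `make_scaler` and `bind` are hypothetical names that only mimic the structure of the loop and do not use Kamodo or SciPy.

```python
def make_scaler(factor):
    """Stand-in for an interpolator: multiplies its input by a fixed factor."""
    return lambda x: factor * x


# Late binding: every wrapper looks up `scaler` at call time, so all of them
# end up using the scaler left over from the final loop iteration.
late = {}
for name, factor in [('A', 1.0), ('B', -2.0)]:
    scaler = make_scaler(factor)

    def wrapper(x):
        return scaler(x)
    late[name] = wrapper


def bind(func):
    """Return a wrapper that keeps its own reference to `func`."""
    def wrapper(x):
        return func(x)
    return wrapper


# Factory: each wrapper is bound to the function that existed when it was made.
bound = {}
for name, factor in [('A', 1.0), ('B', -2.0)]:
    bound[name] = bind(make_scaler(factor))

print(late['A'](3.0), late['B'](3.0))    # -6.0 -6.0
print(bound['A'](3.0), bound['B'](3.0))  # 3.0 -6.0
```

The same reasoning applies to an interpolator wrapper: binding the interpolator through a factory function keeps each functionalized variable tied to its own data.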

Plot a 2D slice of one of the custom-interpolated variables by choosing a slice value for time, which leaves lon and lat as the plot axes.

kamodo_object.plot('TestCustomB_3D', plot_partial={'TestCustomB_3D':{'time': 12.56}})

Screenshot
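
The effect of the interpolator settings used above (method='nearest', bounds_error=False, fill_value=-10.) can be checked directly with SciPy, without Kamodo. The sketch below is a standalone illustration on a freshly seeded random array, so its values are not the ones stored in kamodo_object; it only shows what a slice at time = 12.56 hr and an out-of-range request return under those settings.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator as RGI

time = np.linspace(0., 24., 25)
lon = np.linspace(-180., 180., 12)
lat = np.linspace(-90., 90., 5)
data = np.random.RandomState(2).rand(25, 12, 5) * -2.  # illustrative data only

rgi = RGI([time, lon, lat], data, bounds_error=False,
          fill_value=-10., method='nearest')

# Evaluate every (lon, lat) grid point at time = 12.56 hr.
lon_grid, lat_grid = np.meshgrid(lon, lat, indexing='ij')
points = np.column_stack([np.full(lon_grid.size, 12.56),
                          lon_grid.ravel(), lat_grid.ravel()])
slice_2d = rgi(points).reshape(lon_grid.shape)

# A time outside the 0-24 hr range returns the fill value instead of an error.
outside = rgi(np.array([[30., 0., 0.]]))

print(slice_2d.shape)
print(outside)
```

With nearest-neighbour interpolation, the slice at 12.56 hr reproduces the data stored at the closest time step (13 hr) rather than blending the 12 hr and 13 hr values.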

Metadata functions¶

In [6]:
# Access the metadata
kamodo_object['Test_1D'].meta
Out[6]:
{'units': 'S',
 'arg_units': {'time': 'hr'},
 'citation': None,
 'equation': None,
 'hidden_args': []}
In [7]:
# Add to the metadata
kamodo_object['Test_1D'].meta['description'] = 'Testing the functionalize.py script'
kamodo_object['Test_1D'].meta['citation'] = 'Ringuette et al. 2022'
kamodo_object['Test_1D'].meta
Out[7]:
{'units': 'S',
 'arg_units': {'time': 'hr'},
 'citation': 'Ringuette et al. 2022',
 'equation': None,
 'hidden_args': [],
 'description': 'Testing the functionalize.py script'}
In [8]:
# See a pandas format output
kamodo_object.detail()
Out[8]:
| | symbol | units | lhs | rhs | arg_units |
|---|---|---|---|---|---|
| Test_7D | Test_7D(time, lon, lat, radius, nonsense, nope... | S | Test_7D | lambda(time, lon, lat, radius, nonsense, nope,... | {'time': 'hr', 'lon': 'deg', 'lat': 'deg', 'ra... |
| Good_7D | Good_7D(time, lon, lat, radius, nonsense, nope... | mK | Good_7D | lambda(time, lon, lat, radius, nonsense, nope,... | {'time': 'hr', 'lon': 'deg', 'lat': 'deg', 'ra... |
| Test_1D | Test_1D(time) | S | Test_1D | lambda(time) | {'time': 'hr'} |
| Good_1D | Good_1D(time) | mK | Good_1D | lambda(time) | {'time': 'hr'} |
| TestCustomA_3D | TestCustomA_3D(time, lon, lat) | S | TestCustomA_3D | lambda(time, lon, lat) | {'time': 'hr', 'lon': 'deg', 'lat': 'deg'} |
| TestCustomB_3D | TestCustomB_3D(time, lon, lat) | m/s | TestCustomB_3D | lambda(time, lon, lat) | {'time': 'hr', 'lon': 'deg', 'lat': 'deg'} |
In [9]:
# Determine the dependent coordinates and the coordinate ranges
import kamodo_ccmc.flythrough.model_wrapper as MW
MW.Coord_Range(kamodo_object, ['Test_7D'])
The minimum and maximum values for each variable and coordinate are:
Test_7D:
time: [0.0, 24.0, 'hr']
lon: [-180.0, 180.0, 'deg']
lat: [-90.0, 90.0, 'deg']
radius: [0.0, 50.0, 'R_E']
nonsense: [1.0, 15.0, 'm/m']
nope: [1.0, 150.0, 'm']
nada: [5e-05, 15000.0, 'hPa']
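
The ranges reported above match the end points of the coordinate arrays supplied in coord_dict. As a cross-check, the sketch below computes the same [minimum, maximum, units] triples directly from a coordinate dictionary; `coord_ranges` is a hypothetical helper for illustration and is not how MW.Coord_Range is implemented.

```python
import numpy as np


def coord_ranges(coord_dict):
    """Return {name: [min, max, units]} for each coordinate array.

    Hypothetical helper, not part of kamodo_ccmc.
    """
    return {name: [float(np.min(c['data'])), float(np.max(c['data'])),
                   c['units']]
            for name, c in coord_dict.items()}


# Two of the coordinates from the 7D example earlier in this notebook.
coord_dict = {'time': {'units': 'hr', 'data': np.linspace(0., 24., 25)},
              'nada': {'units': 'hPa',
                       'data': np.linspace(0.00005, 15000., 20)}}

ranges = coord_ranges(coord_dict)
for name, value in ranges.items():
    print(f'{name}: {value}')
```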