Uncertain Data#

The progpy.uncertain_data package includes classes for representing data with uncertainty. All UncertainData types can be operated on through the common interface described in Interface. The individual classes for representing different kinds of uncertain data are described in Implemented UncertainData Types.

Interface#

class progpy.uncertain_data.UncertainData(_type=<class 'dict'>)#

Abstract base class for data with uncertainty. Any new uncertainty type must implement this class.

abstract property cov#

The covariance matrix of the UncertainData distribution or samples, in order of keys (i.e., cov[1][1] is the variance for key keys()[1]).

Returns: Covariance matrix
Return type: np.array[np.array[float]]

Example

covariance_matrix = data.cov
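To make the key ordering concrete, here is a minimal sketch in plain Python of how a covariance matrix could be built from a set of samples. It is not progpy's implementation: the helper name `sample_covariance` and the sample values are hypothetical, and the unbiased (n - 1) normalization is an assumption. In this sketch the diagonal entries are the per-key sample variances and the off-diagonal entries pair two keys.

```python
def sample_covariance(samples, keys):
    """Return the sample covariance matrix as a list of lists, ordered by keys.

    Hypothetical helper for illustration only; assumes n - 1 normalization.
    """
    n = len(samples)
    means = {k: sum(s[k] for s in samples) / n for k in keys}
    return [
        [
            sum((s[a] - means[a]) * (s[b] - means[b]) for s in samples) / (n - 1)
            for b in keys
        ]
        for a in keys
    ]


# Made-up samples in the [{key: value, ...}, ...] form
samples = [
    {'key1': 1.0, 'key2': 2.0},
    {'key1': 2.0, 'key2': 4.0},
    {'key1': 3.0, 'key2': 6.0},
]
keys = ['key1', 'key2']
cov = sample_covariance(samples, keys)

# Row and column i both refer to keys[i]
print(cov[1][1])  # variance of keys[1], i.e. 'key2'
print(cov[0][1])  # covariance between 'key1' and 'key2'
```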

describe(title: str = 'UncertainData Metrics', print: bool = True) → collections.defaultdict#

Print and view basic statistical information about this UncertainData object in a text-based printed table.

Parameters

title (str, optional) – Title of the table, printed before data rows. Defaults to 'UncertainData Metrics'.
print (bool, optional) – Whether to print the table. Defaults to True.

Returns

defaultdict: Dictionary of lists used to print metrics.

Example

data.describe()

abstract keys()#

Get the keys for the property represented

Returns: keys
Return type: list[str]

Example

keys = data.keys()

abstract property mean#

The mean of the UncertainData distribution or samples

Returns: Mean value. e.g., {‘key1’: 23.2, …}
Return type: dict[str, float]

Example

mean_value = data.mean

abstract property median#

The median of the UncertainData distribution or samples

Returns: Median value. e.g., {‘key1’: 23.2, …}
Return type: dict[str, float]

Example

median_value = data.median
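As an illustration of the dict[str, float] shape returned by mean and median, the following sketch computes both per key from a list of samples using only the standard library. The sample values are made up, and this is an assumption about what a sample-based type would return, not progpy's own code.

```python
from statistics import mean, median

# Made-up samples in the [{key: value, ...}, ...] form
samples = [
    {'key1': 1.0, 'key2': 10.0},
    {'key1': 2.0, 'key2': 20.0},
    {'key1': 6.0, 'key2': 60.0},
]
keys = ['key1', 'key2']

# One scalar per key, mirroring the documented return type
mean_value = {k: mean(s[k] for s in samples) for k in keys}
median_value = {k: median(s[k] for s in samples) for k in keys}

print(mean_value)
print(median_value)
```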

metrics(**kwargs) → dict#

Calculate metrics for this distribution.

Keyword Arguments

ground_truth (int or dict, optional) – Ground truth value. Defaults to None.
n_samples (int, optional) – Number of samples to use for calculating metrics (if not UnweightedSamples)
keys (list[str], optional) – Keys to calculate metrics for. Defaults to all keys.

Returns

Dictionary of metrics

Return type

dict

Example

print(data.metrics())
m = data.metrics(ground_truth={'key1': 200, 'key2': 350})
m = data.metrics(keys=['key1', 'key3'])

percentage_in_bounds(bounds: tuple, keys: list = None, n_samples: int = 1000) → dict#

Calculate the percentage of the distribution that is within the specified bounds.

Parameters

bounds (tuple[float, float] or dict) – Lower and upper bounds. If tuple: (lower, upper). If dict: {key: (lower, upper), …}
keys (list[str], optional) – UncertainData keys to consider when calculating. Defaults to all keys.
n_samples (int, optional) – Number of samples to use when calculating. Defaults to 1000.

Returns

Percentage within bounds for each key in keys (where 0.5 = 50%). e.g., {‘key1’: 1, ‘key2’: 0.75}

Return type

dict

Example

data.percentage_in_bounds((1025, 1075))
data.percentage_in_bounds({'key1': (1025, 1075), 'key2': (2520, 2675)})
data.percentage_in_bounds((1025, 1075), keys=['key1', 'key3'])
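The two accepted forms of bounds can be sketched in plain Python as follows. This is an illustrative stand-in rather than progpy's implementation: the helper name `fraction_in_bounds` and the sample values are hypothetical, and treating the bounds as inclusive is an assumption.

```python
def fraction_in_bounds(samples, bounds, keys):
    """Fraction of samples per key with lower <= value <= upper.

    bounds is either one (lower, upper) tuple applied to every key, or a
    dict mapping key -> (lower, upper). Hypothetical helper; inclusive
    bounds are an assumption.
    """
    result = {}
    for k in keys:
        lower, upper = bounds[k] if isinstance(bounds, dict) else bounds
        inside = sum(1 for s in samples if lower <= s[k] <= upper)
        result[k] = inside / len(samples)
    return result


# Made-up samples in the [{key: value, ...}, ...] form
samples = [
    {'key1': 1030, 'key2': 2500},
    {'key1': 1050, 'key2': 2600},
    {'key1': 1060, 'key2': 2650},
    {'key1': 1070, 'key2': 2700},
]

# Same bounds for every requested key
same_bounds = fraction_in_bounds(samples, (1025, 1075), ['key1'])

# Separate bounds per key
per_key = fraction_in_bounds(
    samples,
    {'key1': (1025, 1075), 'key2': (2520, 2675)},
    ['key1', 'key2'],
)
print(same_bounds)
print(per_key)
```

As in the return description above, a value of 0.5 means that half of the samples for that key fall inside its bounds.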

plot_hist(fig=None, keys=None, num_samples=100, **kwargs)#

Create a histogram

Parameters

fig (MatPlotLib Figure, optional) – Existing histogram figure to be overwritten. Defaults to creating a new figure.
num_samples (int, optional) – Number of samples to plot. Defaults to 100.
keys (list[str], optional) – Keys to be plotted. Defaults to None.

Example

from progpy.uncertain_data import MultivariateNormalDist

m = [5, 7, 3]
c = [[0.3, 0.5, 0.1], [0.6, 0.7, 1e-9], [1e-9, 1e-10, 1]]
d = MultivariateNormalDist(['a', 'b', 'c'], m, c)
d.plot_hist() # With 100 samples
d.plot_hist(num_samples=20) # Specifying the number of samples to plot
d.plot_hist(keys=['a', 'b']) # Only plot those keys

plot_scatter(fig: matplotlib.figure.Figure = None, keys: list = None, num_samples: int = 100, **kwargs) → matplotlib.figure.Figure#

Produce a scatter plot

Parameters

fig (Figure, optional) – Existing figure previously used to plot states. If a figure is passed, the additional data is added to it. Defaults to creating a new figure.
keys (list[str], optional) – Keys to plot. Defaults to all keys.
num_samples (int, optional) – Number of samples to plot. Defaults to 100
**kwargs (optional) – Additional keyword arguments passed to scatter function.

Returns

Figure

Example

from progpy.uncertain_data import MultivariateNormalDist

m = [5, 7, 3]
c = [[0.3, 0.5, 0.1], [0.6, 0.7, 1e-9], [1e-9, 1e-10, 1]]
d = MultivariateNormalDist(['a', 'b', 'c'], m, c)
d.plot_scatter() # With 100 samples
d.plot_scatter(num_samples=5) # Specifying the number of samples to plot
d.plot_scatter(keys=['a', 'b']) # Only plot those keys

relative_accuracy(ground_truth: dict) → dict#

The relative accuracy is how close the mean of the distribution is to the ground truth, in relative terms.

$RA = 1 - \frac{|r - p|}{r}$

where r is the ground truth and p is the mean of the predicted distribution [0].

Returns: Relative accuracy for each event, where each value is in the range [0, 1]
Return type: dict[str, float]

Example

ra = data.relative_accuracy({'key1': 22, 'key2': 57})
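As a worked instance of the formula above, the sketch below evaluates the relative accuracy per key from a ground truth and a predicted mean. The helper name `relative_accuracy_sketch` and all values are hypothetical, and the code assumes a non-zero ground truth for every key. It is not progpy's implementation.

```python
def relative_accuracy_sketch(ground_truth, predicted_mean):
    """Evaluate RA = 1 - |r - p| / r for each key.

    Hypothetical helper; r is the ground truth and p is the predicted mean.
    Assumes r is non-zero for every key.
    """
    return {
        k: 1 - abs(r - predicted_mean[k]) / r
        for k, r in ground_truth.items()
    }


ground_truth = {'key1': 20.0, 'key2': 50.0}
predicted_mean = {'key1': 25.0, 'key2': 50.0}
ra = relative_accuracy_sketch(ground_truth, predicted_mean)

# key1: 1 - |20 - 25| / 20 = 0.75; key2: exact prediction gives 1.0
print(ra)
```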

References

[0] Prognostics: The Science of Making Predictions (Goebel et al., 239)

abstract sample(nSamples: int = 1)#

Generate samples from data

Parameters: nSamples (int, optional) – Number of samples to generate. Defaults to 1.
Returns: Array of nSamples samples
Return type: samples (UnweightedSamples)

Example

samples = data.sample(100)

Implemented UncertainData Types#

class progpy.uncertain_data.UnweightedSamples(samples: list = [], _type=<class 'dict'>)#

Uncertain Data represented by a set of samples. Objects of this class can be treated like a list where samples[n] returns the nth sample (Dict).

Parameters

samples (array, dict, or model.*Container, optional) – Array of samples. Defaults to empty array.

If dict, must be of the form of {key: [value, …], …}

If list, must be of the form of [{key: value, …}, …]

If InputContainer, OutputContainer, or StateContainer, must be of the form of *Container({‘key’: value, …})

key(key) → list#

Return samples for given key

Parameters: key (str) – key
Returns: list of values for given key
Return type: list
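The dict and list forms accepted for samples carry the same information in two layouts. The following plain-Python sketch converts the {key: [value, …], …} form into the [{key: value, …}, …] form and then collects the values for one key, which is the kind of result key(key) is described as returning. The helper name `dict_of_lists_to_samples` is hypothetical, and the code does not use or reproduce progpy's internals.

```python
def dict_of_lists_to_samples(data):
    """Convert {key: [value, ...], ...} into [{key: value, ...}, ...].

    Hypothetical helper; assumes every key has the same number of values.
    """
    keys = list(data)
    n = len(data[keys[0]]) if keys else 0
    return [{k: data[k][i] for k in keys} for i in range(n)]


by_key = {'a': [1, 2, 3], 'b': [4, 5, 6]}
samples = dict_of_lists_to_samples(by_key)

# Indexing gives one sample as a dict
second_sample = samples[1]

# Collecting a single key across all samples recovers its list of values
a_values = [s['a'] for s in samples]
print(second_sample, a_values)
```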
