Uncertain Data#
The progpy.uncertain_data package includes classes for representing data with uncertainty. All UncertainData types can be operated on through the common interface described in Interface. The individual classes for representing different kinds of uncertain data are described below, in Implemented UncertainData Types.
Interface#
- class progpy.uncertain_data.UncertainData(_type=<class 'dict'>)#
 Abstract base class for data with uncertainty. Any new uncertainty type must implement this interface.
- abstract property cov: numpy.array#
 The covariance matrix of the UncertainData distribution or samples, in order of keys (i.e., cov[1][1] is the variance for the key at index 1 of keys())
- Returns
 Covariance matrix
- Return type
 np.array[np.array[float]]
Example
covariance_matrix = data.cov
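To illustrate how a covariance matrix ordered by keys can be read, here is a minimal standard-library sketch. The helper `sample_covariance` is hypothetical and is not part of progpy; it assumes samples stored as a list of {key: value} dicts and uses the unbiased (n - 1) estimator. It shows that the diagonal entries are variances, so a standard deviation is the square root of a diagonal entry.

```python
from statistics import mean


def sample_covariance(samples):
    """Hypothetical helper (not a progpy function).

    Returns (keys, cov) for a list of {key: value} samples, where
    cov[i][j] is the sample covariance between keys[i] and keys[j].
    """
    keys = list(samples[0].keys())
    n = len(samples)
    means = {k: mean(s[k] for s in samples) for k in keys}
    cov = [
        [
            sum((s[ki] - means[ki]) * (s[kj] - means[kj]) for s in samples) / (n - 1)
            for kj in keys
        ]
        for ki in keys
    ]
    return keys, cov


samples = [{'a': 1.0, 'b': 2.0}, {'a': 2.0, 'b': 4.0}, {'a': 3.0, 'b': 9.0}]
keys, cov = sample_covariance(samples)

variance_b = cov[1][1]          # diagonal entry: variance of keys[1]
std_b = variance_b ** 0.5       # standard deviation is its square root
covariance_ab = cov[0][1]       # off-diagonal entry: covariance of 'a' and 'b'
```

The matrix is symmetric, so cov[0][1] and cov[1][0] hold the same value.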
- describe(title: str = 'UncertainData Metrics', print: bool = True) collections.defaultdict#
 Print and view basic statistical information about this UncertainData object in a text-based printed table.
- Parameters
 title – str Title of the table, printed before data rows.
print – bool = True Optional argument specifying whether to print or not; default true.
- Returns
 - defaultdict
 Dictionary of lists used to print metrics.
Example
data.describe()
- abstract keys()#
 Get the keys for the property represented
Example
keys = data.keys()
- abstract property mean: dict#
 The mean of the UncertainData distribution or samples
Example
mean_value = data.mean
- abstract property median: dict#
 The median of the UncertainData distribution or samples
Example
median_value = data.median
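As a rough illustration of what per-key mean and median dictionaries look like for sample-based data, the following sketch uses only the standard library. The helper `per_key_stats` is hypothetical and assumes samples stored as a list of {key: value} dicts; it is not how progpy necessarily computes these properties.

```python
from statistics import mean, median


def per_key_stats(samples):
    """Hypothetical helper: per-key mean and median of a list of dict samples."""
    keys = list(samples[0].keys())
    means = {k: mean(s[k] for s in samples) for k in keys}
    medians = {k: median(s[k] for s in samples) for k in keys}
    return means, medians


samples = [
    {'key1': 1.0, 'key2': 10.0},
    {'key1': 2.0, 'key2': 20.0},
    {'key1': 9.0, 'key2': 30.0},
]
mean_value, median_value = per_key_stats(samples)
```

For skewed data such as key1 above, the mean and the median differ, which is why the interface exposes both.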
- metrics(**kwargs) dict#
 Calculate metrics for this distribution
- Keyword Arguments
 - Returns
 Dictionary of metrics
- Return type
 
Example
print(data.metrics())
m = data.metrics(ground_truth={'key1': 200, 'key2': 350})
m = data.metrics(keys=['key1', 'key3'])
- percentage_in_bounds(bounds: tuple, keys: list = None, n_samples: int = 1000) dict#
 Calculate the percentage of the distribution that is within the specified bounds
- Parameters
 - Returns
 Percentage within bounds for each key in keys (where 0.5 = 50%), e.g., {'key1': 1, 'key2': 0.75}
- Return type
 
Example
data.percentage_in_bounds((1025, 1075))
data.percentage_in_bounds({'key1': (1025, 1075), 'key2': (2520, 2675)})
data.percentage_in_bounds((1025, 1075), keys=['key1', 'key3'])
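The two bound forms used in the example above (one tuple shared by all keys, or a dict of per-key tuples) can be sketched with a small standalone function. This is an illustrative stand-in and not progpy's implementation: the function below operates on a plain list of dict samples, and treating the bounds as inclusive is an assumption.

```python
def percentage_in_bounds(samples, bounds, keys=None):
    """Illustrative sketch: fraction of samples inside the bounds for each key.

    bounds is either a (lower, upper) tuple applied to every key, or a
    dict mapping key -> (lower, upper). Bounds are assumed inclusive.
    """
    if keys is None:
        keys = list(bounds.keys()) if isinstance(bounds, dict) else list(samples[0].keys())
    if not isinstance(bounds, dict):
        bounds = {k: bounds for k in keys}
    result = {}
    for k in keys:
        lower, upper = bounds[k]
        inside = sum(1 for s in samples if lower <= s[k] <= upper)
        result[k] = inside / len(samples)
    return result


samples = [
    {'key1': 1030, 'key2': 2500},
    {'key1': 1050, 'key2': 2600},
    {'key1': 1080, 'key2': 2650},
    {'key1': 1070, 'key2': 2700},
]
shared = percentage_in_bounds(samples, (1025, 1075))
per_key = percentage_in_bounds(samples, {'key1': (1025, 1075), 'key2': (2520, 2675)})
subset = percentage_in_bounds(samples, (1025, 1075), keys=['key1'])
```

Values are fractions, so 0.75 means 75% of the samples fall within the bounds for that key.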
- plot_hist(fig=None, keys=None, num_samples=100, **kwargs)#
 Create a histogram
- Parameters
 
Example
from progpy.uncertain_data import MultivariateNormalDist
m = [5, 7, 3]
c = [[0.3, 0.5, 0.1], [0.6, 0.7, 1e-9], [1e-9, 1e-10, 1]]
d = MultivariateNormalDist(['a', 'b', 'c'], m, c)
d.plot_hist()  # With 100 samples (the default)
d.plot_hist(num_samples=20)  # Specifying the number of samples to plot
d.plot_hist(keys=['a', 'b'])  # Only plot those keys
- plot_scatter(fig: matplotlib.figure.Figure = None, keys: list = None, num_samples: int = 100, **kwargs) matplotlib.figure.Figure#
 Produce a scatter plot
- Parameters
 fig (Figure, optional) – Existing figure previously used to plot states. If passed a figure argument additional data will be added to the plot. Defaults to creating new figure
keys (list[str], optional) – Keys to plot. Defaults to all keys.
num_samples (int, optional) – Number of samples to plot. Defaults to 100
**kwargs (optional) – Additional keyword arguments passed to scatter function.
- Returns
 Figure
Example
from progpy.uncertain_data import MultivariateNormalDist
m = [5, 7, 3]
c = [[0.3, 0.5, 0.1], [0.6, 0.7, 1e-9], [1e-9, 1e-10, 1]]
d = MultivariateNormalDist(['a', 'b', 'c'], m, c)
d.plot_scatter()  # With 100 samples (the default)
d.plot_scatter(num_samples=5)  # Specifying the number of samples to plot
d.plot_scatter(keys=['a', 'b'])  # Only plot those keys
- relative_accuracy(ground_truth: dict) dict#
 The relative accuracy is how close the mean of the distribution is to the ground truth, in relative terms
\(RA = 1 - \dfrac{\| r-p \|}{r}\)
where r is the ground truth and p is the mean of the predicted distribution [0]
- Returns
 Relative accuracy for each event, where each value is between [0,1]
- Return type
 
Example
ra = data.relative_accuracy({'key1': 22, 'key2': 57})
References
- 0
 Prognostics: The Science of Making Predictions (Goebel et al., 239)
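The relative accuracy formula above can be worked through per key with a short sketch. The standalone function below is illustrative and is not progpy's implementation; it takes the distribution mean as a plain dict and assumes every ground truth value r is positive, since the formula divides by r.

```python
def relative_accuracy(mean, ground_truth):
    """Illustrative sketch of RA = 1 - |r - p| / r, computed for each key.

    mean: dict of predicted means (p); ground_truth: dict of true values (r).
    Assumes each ground truth value is positive.
    """
    return {k: 1 - abs(r - mean[k]) / r for k, r in ground_truth.items()}


predicted_mean = {'key1': 20.0, 'key2': 57.0}
ra = relative_accuracy(predicted_mean, {'key1': 22, 'key2': 57})
# key1: 1 - |22 - 20| / 22 = 1 - 2/22
# key2: 1 - |57 - 57| / 57 = 1 (mean matches the ground truth exactly)
```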
- abstract sample(nSamples: int = 1)#
 Generate samples from data
- Parameters
 nSamples (int, optional) – Number of samples to generate. Defaults to 1.
- Returns
 Array of nSamples samples
- Return type
 samples (UnweightedSamples)
Example
samples = data.sample(100)
Implemented UncertainData Types#
- class progpy.uncertain_data.UnweightedSamples(samples: list = [], _type=<class 'dict'>)#
 Uncertain Data represented by a set of samples. Objects of this class can be treated like a list where samples[n] returns the nth sample (Dict).
- Parameters
 samples (array, dict, or model.*Container, optional) –
array of samples. Defaults to empty array.
If dict, must be of the form of {key: [value, …], …}
If list, must be of the form of [{key: value, …}, …]
If InputContainer, OutputContainer, or StateContainer, must be of the form of *Container({'key': value, …})
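The dict and list layouts described above carry the same information in two shapes. The following sketch converts between them using plain Python; both helper functions are hypothetical and are shown only to make the two layouts concrete, not to describe how UnweightedSamples stores data internally.

```python
def dict_of_lists_to_list_of_dicts(data):
    """Hypothetical helper: {key: [value, ...]} -> [{key: value}, ...]."""
    keys = list(data.keys())
    n = len(data[keys[0]]) if keys else 0
    return [{k: data[k][i] for k in keys} for i in range(n)]


def list_of_dicts_to_dict_of_lists(samples):
    """Hypothetical helper: [{key: value}, ...] -> {key: [value, ...]}."""
    keys = list(samples[0].keys()) if samples else []
    return {k: [s[k] for s in samples] for k in keys}


by_key = {'a': [1, 2, 3], 'b': [10, 20, 30]}
by_sample = dict_of_lists_to_list_of_dicts(by_key)
nth = by_sample[1]  # the sample at index 1, as a dict
round_trip = list_of_dicts_to_dict_of_lists(by_sample)
```

The list layout matches the indexing behaviour described for this class, where samples[n] returns the nth sample as a dict.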
- class progpy.uncertain_data.MultivariateNormalDist(labels, mean: numpy.array, covar: numpy.array, _type=<class 'dict'>)#
 Data represented by a multivariate normal distribution with a mean and a covariance matrix.