opera.util package

Submodules

opera.util.error_codes module

error_codes.py

Error codes for use with OPERA PGEs.

opera.util.error_codes.CODES_PER_RANGE = 1000: Number of error codes allocated to each range

opera.util.error_codes.CRITICAL_RANGE_START = 3000: Starting value for the Critical code range

opera.util.error_codes.DEBUG_RANGE_START = 1000: Starting value for the Debug code range

opera.util.error_codes.ERROR_CODE_PGE_OFFSET = 10000: Base offset used to distinguish error codes by PGE type

class opera.util.error_codes.ErrorCode(value)[source]

Bases: IntEnum

Error codes for OPERA PGEs.

Each code is combined with the designated error code offset defined by the RunConfig to determine the final, logged error code.

CATALOG_METADATA_CREATION_FAILED = 3003

CLOSING_LOG_FILE = 26

CONFIGURATION_DETAILS = 1000

CREATED_SAS_CONFIG = 16

CREATING_CATALOG_METADATA = 18

CREATING_ISO_METADATA = 19

CREATING_OUTPUT_FILE = 17

CREATING_WORKING_DIRECTORY = 5

DATE_RANGE_MISSING = 2000

DIRECTORY_CREATION_FAILED = 3001

DIRECTORY_SETUP_COMPLETE = 6

DYNAMIC_IMPORT_FAILED = 3022

FILENAME_VIOLATES_NAMING_CONVENTION = 3011

FILE_MOVE_FAILED = 3010

GRIB_TO_NETCDF_CONVERSION_FAILED = 3023

INPUT_FILE = 13

INPUT_NOT_FOUND = 3005

INVALID_CATALOG_METADATA = 3009

INVALID_INPUT = 3007

INVALID_OUTPUT = 3008

ISO_METADATA_CANT_RENDER_ONE_VARIABLE = 2002

ISO_METADATA_COULD_NOT_EXTRACT_METADATA = 3018

ISO_METADATA_DESCRIPTIONS_CONFIG_INVALID = 3025

ISO_METADATA_DESCRIPTIONS_CONFIG_NOT_FOUND = 3026

ISO_METADATA_GOT_SOME_RENDERING_ERRORS = 3017

ISO_METADATA_NO_DESCRIPTIONS = 2008

ISO_METADATA_NO_ENTRY_FOR_DESCRIPTION = 3027

ISO_METADATA_RENDER_FAILED = 3019

ISO_METADATA_TEMPLATE_NOT_FOUND = 3016

ISO_METADATA_TEMPLATE_NOT_PROVIDED_WHEN_NEEDED = 3024

LOADING_RUN_CONFIG_FILE = 2

LOGGED_CRITICAL_LINE = 3021

LOGGED_DEBUG_LINE = 1004

LOGGED_INFO_LINE = 27

LOGGED_WARNING_LINE = 2007

LOGGING_COULD_NOT_INCREMENT_SEVERITY = 2005

LOGGING_REQUESTED_SEVERITY_NOT_FOUND = 2003

LOGGING_RESYNC_FAILED = 2006

LOGGING_SOURCE_FILE_DOES_NOT_EXIST = 2004

LOG_FILE_CREATED = 1

LOG_FILE_CREATION_FAILED = 3004

LOG_FILE_INIT_COMPLETE = 4

MOVING_LOG_FILE = 7

MOVING_OUTPUT_FILE = 8

NO_ALGO_PARAM_SCHEMA_PATH = 29

NO_RENAME_FUNCTION_FOR_EXTENSION = 2001

OUTPUT_NOT_FOUND = 3006

OVERALL_SUCCESS = 0

PGE_NAME = 11

PROCESSING_DETAILS = 1001

PROCESSING_INPUT_FILE = 14

QA_SAS_PROGRAM_COMPLETED = 23

QA_SAS_PROGRAM_DISABLED = 24

QA_SAS_PROGRAM_FAILED = 3015

QA_SAS_PROGRAM_NOT_FOUND = 3014

QA_SAS_PROGRAM_STARTING = 22

RENDERING_ISO_METADATA = 25

RUN_CONFIG_FILENAME = 10

RUN_CONFIG_VALIDATION_FAILED = 3000

SAS_CONFIG_CREATION_FAILED = 3002

SAS_EXE_COMMAND_LINE = 1002

SAS_OUTPUT_FILE_HAS_MISSING_DATA = 3020

SAS_PROGRAM_COMPLETED = 21

SAS_PROGRAM_FAILED = 3013

SAS_PROGRAM_NOT_FOUND = 3012

SAS_PROGRAM_STARTING = 20

SAS_QA_COMMAND_LINE = 1003

SCHEMA_FILE = 12

SUMMARY_STATS_MESSAGE = 9

UPDATING_PRODUCT_METADATA = 28

USING_CONFIG_FILE = 15

VALIDATING_RUN_CONFIG_FILE = 3

classmethod describe()[source]: Provides a listing of the available error codes and their associated integer values.

opera.util.error_codes.INFO_RANGE_START = 0: Starting value for the Info code range

opera.util.error_codes.WARNING_RANGE_START = 2000: Starting value for the Warning code range
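As a minimal sketch of how a code from the listing above might be combined with a base offset to produce the logged value, the following copies a handful of ErrorCode members into a stand-in enum. The names SketchErrorCode, final_error_code, and EXAMPLE_BASE are hypothetical, and the base value of 100000 is an illustration only, not a value taken from any RunConfig.

```python
from enum import IntEnum


class SketchErrorCode(IntEnum):
    """Stand-in enum with a few members copied from the ErrorCode listing."""
    OVERALL_SUCCESS = 0
    CONFIGURATION_DETAILS = 1000
    DATE_RANGE_MISSING = 2000
    RUN_CONFIG_VALIDATION_FAILED = 3000


def final_error_code(error_code_base, code):
    """Add the enum value to the base offset to get the logged code."""
    return error_code_base + int(code)


# Hypothetical base value, chosen only for illustration.
EXAMPLE_BASE = 100000

logged = final_error_code(EXAMPLE_BASE, SketchErrorCode.RUN_CONFIG_VALIDATION_FAILED)
print(logged)  # 103000
```

Because the enum derives from IntEnum, its members can be used directly in integer arithmetic; the explicit int() call only makes the conversion visible.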

opera.util.logger module

logger.py

Logging utilities for use with OPERA PGEs.

This module is adapted for OPERA from the NISAR PGE R2.0.0 util/logger.py.

Original Authors: Alice Stanboli, David White
Adapted By: Scott Collins, Jim Hofman

opera.util.logger.CRITICAL = 'Critical': Constants for logging levels

class opera.util.logger.PgeLogger(workflow=None, error_code_base=None, log_filename=None)[source]

Bases: object

Class to help with the PGE logging.

Advantages over the standalone write() function:

* Opens and closes the log file for you.
* The class’s write() function requires fewer arguments.

LOGGER_CODE_BASE = 900000

QA_LOGGER_CODE_BASE = 800000

append(source)[source]

Appends text from another file to this log file.

Parameters:: source (str) – The source text to append. If the source refers to a file name, the contents of that file are appended. Otherwise, the provided text is appended as is.

close_log_stream()[source]: Writes the log summary to the log stream, writes the log stream to a log file and saves the file to disk, then closes the log stream.

critical(module, error_code_offset, description)[source]

Write a critical-level message to the log.

Since critical messages should be used for unrecoverable errors, any time this log level is invoked a RuntimeError is raised with the description provided to this function. The log file is closed and finalized before the exception is raised.

Parameters:

module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.

Raises:

RuntimeError – Raised whenever this method is called. The contents of the description parameter are used as the exception message.
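Since a critical message always ends in a RuntimeError, callers that need to clean up must catch the exception themselves. The sketch below uses a hypothetical stand-in class, SketchLogger, that mimics only the documented contract (write the message, finalize the log, then raise). The line format it writes is invented for the example and is not the OPERA log format.

```python
import io


class SketchLogger:
    """Hypothetical stand-in that mimics only the documented critical() contract."""

    def __init__(self):
        self.stream = io.StringIO()
        self.closed = False

    def critical(self, module, error_code_offset, description):
        # Invented line layout, used here only so there is something to write.
        self.stream.write(f"Critical|{module}|{error_code_offset}|{description}\n")
        # Stands in for closing and finalizing the log before raising.
        self.closed = True
        raise RuntimeError(description)


logger = SketchLogger()
try:
    logger.critical("my_module", 3005, "Input file not found")
except RuntimeError as err:
    # The exception string is the description passed to critical().
    message = str(err)

print(message)
```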

debug(module, error_code_offset, description)[source]

Write a debug-level message to the log.

Parameters:

module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.

property error_code_base: Return the error code base from error_codes.py

get_critical_count()[source]: Returns the number of messages logged at the critical level.

get_file_name()[source]: Return the file name for the current log.

get_log_count_by_severity(severity)[source]

Gets the number of messages logged for the specified severity

Parameters:: severity (str) – The severity level to get the log count of. Should be one of info, debug, warning, critical (case-insensitive).
Returns:: log_count – The number of messages logged at the provided severity level.
Return type:: int

get_log_count_by_severity_dict()[source]: Returns a copy of the dictionary of log counts by severity.

get_stream_object()[source]: Returns the StringIO object for the current log.

get_warning_count()[source]: Returns the number of messages logged at the warning level.

increment_log_count_by_severity(severity)[source]

Increments the logged message count of the provided severity level.

Parameters:: severity (str) – The severity level to increment the log count of. Should be one of info, debug, warning, critical (case-insensitive).

info(module, error_code_offset, description)[source]

Write an info-level message to the log.

Parameters:

module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.

log(module, error_code_offset, description, additional_back_frames=0)[source]

Logs any kind of message.

Determines the log level (Critical, Warning, Info, or Debug) based on the provided error code offset.

Parameters:

module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.
additional_back_frames (int, optional) – Number of call-stack frames to “back up” to in order to determine the calling function and line number.

log_one_metric(module, metric_name, metric_value, additional_back_frames=0)[source]

Writes one metric value to the log file.

Parameters:

module (str) – Name of the module where the logging took place.
metric_name (str) – Name of the metric being logged.
metric_value (object) – Value to associate to the logged metric.
additional_back_frames (int) – Number of call-stack frames to “back up” to in order to determine the calling function and line number.

move(new_filename)[source]

Renames this log file. This is useful when the log file has been given a default name and needs to be assigned a name that meets the PGE file naming conventions.

Parameters:: new_filename (str) – The new filename (including path) to assign to this log file.

parse_line(line)[source]

Parses the provided formatted log line into its component parts according to the log formatting style for OPERA.

Parameters:: line (str) – The log line to parse
Returns:: parsed_line – The provided log line parsed into its component parts.
Return type:: tuple
Raises:: ValueError – If the line cannot be parsed according to the OPERA log formatting style.

warning(module, error_code_offset, description)[source]

Write a warning-level message to the log.

Parameters:

module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.

property workflow: Returns the workflow name assigned to this logger.

write(severity, module, error_code_offset, description, additional_back_frames=0)[source]

Write a message to the log.

Parameters:

severity (str) – The severity level to log at. Should be one of info, debug, warning, critical (case-insensitive).
module (str) – Name of the module where the logging took place.
error_code_offset (int) – Error code offset to add to this logger’s error code base value to determine the final error code associated with the log message.
description (str) – Description message to write to the log.
additional_back_frames (int, optional) – Number of call-stack frames to “back up” to in order to determine the calling function and line number.

write_log_summary()[source]: Writes a summary at the end of the log file, which includes totals of each message logged for each severity level, OS-level metrics, and total elapsed run time (since logger creation).

opera.util.logger.default_log_file_name()[source]

Returns a path + filename that can be used for the log file right away.

To minimize the risk of errors opening a log file, the initial log filename does not rely on anything read from a run config file, SAS output file, etc. Therefore, this filename does not follow the file naming convention.

Later (elsewhere), after everything is known, the log file will be renamed.

Returns:: file_path – Path to the default log file name.
Return type:: str

opera.util.logger.get_severity_from_error_code(error_code)[source]

Determines the log level (Critical, Warning, Info, or Debug) based on the provided error code.

Parameters:: error_code (int or ErrorCode) – The error code to map to a severity level.
Returns:: severity – The severity level associated to the provided error code.
Return type:: str
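One way such a mapping could work is to test the code against the four documented range starts, each spanning CODES_PER_RANGE values. The sketch below makes several assumptions: the function name severity_from_code is hypothetical, the severity strings other than 'Critical' are guessed by analogy with the documented CRITICAL constant, and the handling of out-of-range codes (a ValueError here) may differ from the real function.

```python
# Range constants as documented for opera.util.error_codes.
INFO_RANGE_START = 0
DEBUG_RANGE_START = 1000
WARNING_RANGE_START = 2000
CRITICAL_RANGE_START = 3000
CODES_PER_RANGE = 1000

# Severity names paired with the start of their range (names partly assumed).
_RANGES = (
    ("Info", INFO_RANGE_START),
    ("Debug", DEBUG_RANGE_START),
    ("Warning", WARNING_RANGE_START),
    ("Critical", CRITICAL_RANGE_START),
)


def severity_from_code(error_code):
    """Map an integer (or IntEnum) code to the severity of its range."""
    code = int(error_code)
    for name, start in _RANGES:
        if start <= code < start + CODES_PER_RANGE:
            return name
    raise ValueError(f"Error code {code} is outside all known ranges")


print(severity_from_code(26))    # Info
print(severity_from_code(3021))  # Critical
```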

opera.util.logger.standardize_severity_string(severity)[source]

Returns the severity string in a consistent way.

Parameters:: severity (str) – The severity string to standardize.
Returns:: severity – The standardized severity string.
Return type:: str

opera.util.logger.write(log_stream, severity, workflow, module, error_code, error_location, description, time_tag=None)[source]

Low-level logging function. May be called directly in lieu of PgeLogger class.

Parameters:

log_stream (io.StringIO) – The log stream to write to.
severity (str) – The severity level of the log message.
workflow (str) – Name of the workflow where the logging took place.
module (str) – Name of the module where the logging took place.
error_code (int or ErrorCode) – The error code associated with the logged message.
error_location (str) – File name and line number where the logging took place.
description (str) – Description of the logged event.
time_tag (str, optional) – ISO format time tag to associate to the message. If not provided, the current time is used.

opera.util.run_utils module

run_utils.py

Contains utility functions for running executable processes within the OPERA PGE subsystem.

opera.util.run_utils.create_qa_command_line(qa_program_path, qa_program_options=None)[source]

Forms the appropriate command line for executing a SAS Quality Assurance (QA) application from parameters obtained from the RunConfig.

By default, this function assumes the QA program path corresponds to an executable file reachable within the current environment’s PATH. If this function cannot locate the executable, the QA program path is assumed to be a Python module path and treated accordingly.

Parameters:

qa_program_path (str) – The path to the QA executable to be invoked by the returned command line.
qa_program_options (list[str], optional) – List of options to include in the returned command line.

Returns:

command_line – The fully formed command line, returned in list format suitable for use with subprocess.run.

Return type:

list[str]

Raises:

OSError – If the QA executable exists within the current environment, but is not set with execute permissions for the current process.

opera.util.run_utils.create_sas_command_line(sas_program_path, sas_runconfig_path, sas_program_options=None)[source]

Forms the appropriate command line for executing a SAS from the parameters obtained from the RunConfig.

By default, this function assumes the SAS program path corresponds to an executable file reachable within the current environment’s PATH. If this function cannot locate the executable, the SAS program path is assumed to be a Python module path and treated accordingly.

Parameters:

sas_program_path (str) – The path to the SAS executable to be invoked by the returned command line.
sas_runconfig_path (str) – The path to the RunConfig to feed to the SAS executable in the returned command line.
sas_program_options (list[str], optional) – List of options to include in the returned command line.

Returns:

command_line – The fully formed command line, returned in list format suitable for use with subprocess.run.

Return type:

list[str]

Raises:

OSError – If the SAS executable exists within the current environment, but is not set with execute permissions for the current process.
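The executable-or-module fallback described for both command-line builders can be sketched with shutil.which. Everything beyond that idea is an assumption: the function name sketch_sas_command_line is hypothetical, the "python3 -m" invocation is one plausible way to run a module path, and the ordering of options relative to the RunConfig path is a guess.

```python
import os
import shutil


def sketch_sas_command_line(program_path, runconfig_path, options=None):
    """Build a subprocess-style argument list for an executable or module."""
    options = list(options or [])

    # shutil.which returns None when no executable match is found on PATH.
    executable = shutil.which(program_path)
    if executable is not None:
        return [executable, *options, runconfig_path]

    # The file exists but cannot be executed by the current process.
    if os.path.isfile(program_path) and not os.access(program_path, os.X_OK):
        raise OSError(f"{program_path} exists but is not executable")

    # Otherwise treat the path as a Python module path (assumed invocation).
    return ["python3", "-m", program_path, *options, runconfig_path]


command_line = sketch_sas_command_line(
    "hypothetical_pkg.workflow", "runconfig.yaml", ["--verbose"]
)
print(command_line)
```

Returning a list rather than a single string lets the result be passed to subprocess.run without shell quoting concerns.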

opera.util.run_utils.get_checksum(file_name)[source]

Generate the MD5 checksum of the provided file.

This function was adapted from swot_pge.util.BasePgeWrapper.get_checksum()

Parameters:: file_name (str) – Path to the file on disk to generate the checksum for.
Returns:: checksum – MD5 checksum of the provided file.
Return type:: str
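A common way to compute an MD5 checksum without loading the whole file into memory is to feed hashlib fixed-size chunks. This is a sketch of that technique, not the adapted SWOT implementation; the name md5_checksum and the chunk size are assumptions.

```python
import hashlib
import os
import tempfile


def md5_checksum(file_name, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(file_name, "rb") as infile:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: infile.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage with a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
checksum = md5_checksum(tmp.name)
os.remove(tmp.name)
print(checksum)
```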

opera.util.run_utils.get_extension(file_name)[source]: Returns the file extension (including the dot) of the provided file name.
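The standard library offers this behavior through os.path.splitext, which keeps the leading dot. The sketch below assumes, without confirmation from the source, that only the final suffix is returned for multi-part extensions; extension_of is a hypothetical name.

```python
import os.path


def extension_of(file_name):
    """Return the final extension of a file name, including the dot."""
    return os.path.splitext(file_name)[1]


print(extension_of("/data/output/product.h5"))  # .h5
print(extension_of("archive.tar.gz"))           # .gz
```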

opera.util.run_utils.get_traceback_from_log(log_contents)[source]

Utilizes a regular expression to parse and return a traceback stack from provided log contents.

Notes

The regular expression used with this function was derived from the following Stack Overflow answer: https://stackoverflow.com/a/53658873

Parameters:: log_contents (str) – The log contents to parse for a traceback stack.
Returns:: traceback_match – The result of the regex search for a traceback. If none could be found, None will be returned.
Return type:: re.Match
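To illustrate the general approach, the sketch below searches log text for a Python traceback using a simplified pattern written for this example. It is not the expression from the linked answer, and the names TRACEBACK_PATTERN and find_traceback are hypothetical.

```python
import re

# Header line, then one or more indented frame lines, then the exception line.
TRACEBACK_PATTERN = re.compile(
    r"Traceback \(most recent call last\):"
    r"(?:\n[ \t]+.*)+"
    r"\n\S.*"
)


def find_traceback(log_contents):
    """Return an re.Match for the first traceback found, or None."""
    return TRACEBACK_PATTERN.search(log_contents)


sample_log = (
    "INFO starting\n"
    "Traceback (most recent call last):\n"
    '  File "x.py", line 1, in <module>\n'
    "    foo()\n"
    "ValueError: bad\n"
    "INFO done\n"
)

match = find_traceback(sample_log)
print(match.group(0))
```

Callers should check for None before using the match, since log contents without a traceback produce no result.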

opera.util.run_utils.time_and_execute(command_line, logger, execute_via_shell=False)[source]

Executes the provided command line via subprocess while collecting the runtime of the execution.

Parameters:

command_line (Iterable[str]) – The command line program, including options/arguments, to execute. Each
logger (PgeLogger) – A logger object used to capture any error status returned from execution.
execute_via_shell (bool, optional) – If true, instruct subprocess.run to execute the command-line via system shell. Useful for running test commands but should generally not be used for production.

Returns:

elapsed_time – The time elapsed during execution, in seconds.

Return type:

float
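Timing a subprocess generally amounts to taking a monotonic clock reading on either side of subprocess.run. The sketch below shows that pattern under stated assumptions: timed_run is a hypothetical name, and it raises a RuntimeError on a non-zero exit status where the documented function reports errors through a PgeLogger.

```python
import subprocess
import sys
import time


def timed_run(command_line, execute_via_shell=False):
    """Run a command and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    result = subprocess.run(
        command_line,
        shell=execute_via_shell,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    elapsed = time.monotonic() - start

    # Assumed error handling; the real function logs through its logger.
    if result.returncode != 0:
        raise RuntimeError(
            f"Command exited with status {result.returncode}: {result.stdout}"
        )
    return elapsed


elapsed_time = timed_run([sys.executable, "-c", "print('hello')"])
print(f"Elapsed: {elapsed_time:.3f} s")
```

time.monotonic is used instead of time.time so that system clock adjustments during the run cannot produce a negative or skewed duration.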

opera.util.time module

time.py

Time-tag generation utilities for use with OPERA PGEs.

This module is adapted for OPERA from the NISAR PGE R2.0.0 util/time.py.

Original Author: Alice Stanboli
Adapted By: Scott Collins

opera.util.time.get_catalog_metadata_datetime_str(date_time)[source]

Converts the provided datetime object to a time-tag string suitable for use in catalog metadata.

Parameters:: date_time (datetime.datetime) – Datetime object to convert to a catalog metadata time-tag string.
Returns:: datetime_str – The provided time converted to ISO format, including nanosecond resolution.
Return type:: str

opera.util.time.get_current_iso_time()[source]

Returns current time in ISO format, including trailing “Z” to indicate Zulu (GMT) time.

Returns:: time_in_iso – Current time in ISO format: YYYY-MM-DDTHH:MM:SS.mmmmmmZ
Return type:: str

opera.util.time.get_iso_time(date_time)[source]

Converts the provided datetime object to an ISO-format time-tag.

Parameters:: date_time (datetime.datetime) – Datetime object to convert to ISO format.
Returns:: time_in_iso – Provided time in ISO format: YYYY-MM-DDTHH:MM:SS.mmmmmmZ
Return type:: str
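The documented YYYY-MM-DDTHH:MM:SS.mmmmmmZ layout can be produced with strftime, as in the sketch below. The name iso_time is hypothetical, and the sketch assumes the datetime is already in UTC, since it appends "Z" without performing any time zone conversion.

```python
from datetime import datetime


def iso_time(date_time):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS.mmmmmmZ."""
    # %f always emits six digits, unlike isoformat(), which drops
    # the fractional part when microseconds are zero.
    return date_time.strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"


print(iso_time(datetime(2023, 4, 5, 6, 7, 8, 9)))  # 2023-04-05T06:07:08.000009Z
```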

opera.util.time.get_time_for_filename(date_time)[source]

Converts the provided datetime object to a time-tag string suitable for use with output filenames.

Parameters:: date_time (datetime.datetime) – Datetime object to convert to a filename time-tag.
Returns:: datetime_str – The provided time converted to YYYYMMDDTHHmmss format.
Return type:: str
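The compact YYYYMMDDTHHmmss layout maps directly onto a strftime format string. In this sketch the name time_for_filename is hypothetical, and sub-second precision is simply discarded.

```python
from datetime import datetime


def time_for_filename(date_time):
    """Format a datetime as YYYYMMDDTHHmmss for use in file names."""
    return date_time.strftime("%Y%m%dT%H%M%S")


print(time_for_filename(datetime(2023, 4, 5, 6, 7, 8)))  # 20230405T060708
```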

opera.util.usage_metrics module

usage_metrics.py

OS-level metrics gathering functions for use with OPERA PGEs.

This module is adapted for OPERA from the NISAR PGE R2.0.0 util/usage_metrics.py.

Original Author: David White
Adapted By: Scott Collins

opera.util.usage_metrics.get_os_metrics()[source]

Gets metrics related to machine resource usage by the current process and all of its child processes.

Returns:

metrics –

Dictionary containing metrics mapped to the following keys:

os.cpu.seconds.sys: System CPU time, in seconds, consumed by the current process and its children
os.cpu.seconds.user: User CPU time, in seconds, consumed by the current process and its children
os.filesystem.reads: Number of file system reads performed by the current process and its children
os.filesystem.writes: Number of file system writes performed by the current process and its children
os.max_rss_kb.main_process: Maximum resident set size (physical memory consumption), in kilobytes, of the current process
os.max_rss_kb.largest_child_process: Maximum resident set size (physical memory consumption), in kilobytes, of the largest child process
os.peak_vm_kb.main_process: Peak virtual memory usage, in kilobytes, of the current process

Return type:

dict
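Most of these metrics are available on Unix-like systems through resource.getrusage. The sketch below is an assumption about how such a dictionary could be assembled, not the module's implementation: sketch_os_metrics is a hypothetical name, the peak virtual memory key is omitted, and ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource


def sketch_os_metrics():
    """Collect CPU, file system, and memory usage for this process and children."""
    own = resource.getrusage(resource.RUSAGE_SELF)
    children = resource.getrusage(resource.RUSAGE_CHILDREN)

    return {
        "os.cpu.seconds.sys": own.ru_stime + children.ru_stime,
        "os.cpu.seconds.user": own.ru_utime + children.ru_utime,
        "os.filesystem.reads": own.ru_inblock + children.ru_inblock,
        "os.filesystem.writes": own.ru_oublock + children.ru_oublock,
        "os.max_rss_kb.main_process": own.ru_maxrss,
        # For RUSAGE_CHILDREN, ru_maxrss is the size of the largest child.
        "os.max_rss_kb.largest_child_process": children.ru_maxrss,
    }


metrics = sketch_os_metrics()
for key, value in metrics.items():
    print(f"{key}: {value}")
```

RUSAGE_CHILDREN only covers child processes that have terminated and been waited for, so metrics gathered while a child is still running will not include it.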

opera.util.usage_metrics.get_self_peak_vmm_kb()[source]

Attempts to get the peak virtual memory usage by reading /proc/self/status.

Note that this accounts for the peak virtual memory of just the current process, not the sum of the current process and all its children.

Returns:: vm_peak_kb – Peak virtual memory usage, in kilobytes, of the current process. If this value cannot be obtained for any reason, -1 is returned instead.
Return type:: int
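On Linux, /proc/self/status contains a VmPeak line holding the peak virtual memory size in kilobytes. The sketch below reads that line and falls back to -1 on any failure, matching the documented return convention. The name peak_vm_kb and the status_path parameter are hypothetical additions that make the fallback easy to exercise.

```python
def peak_vm_kb(status_path="/proc/self/status"):
    """Return the VmPeak value in kilobytes, or -1 if it cannot be read."""
    try:
        with open(status_path) as status_file:
            for line in status_file:
                # Expected form: "VmPeak:    123456 kB"
                if line.startswith("VmPeak:"):
                    return int(line.split()[1])
    except (OSError, ValueError, IndexError):
        pass
    return -1


print(peak_vm_kb())
print(peak_vm_kb("/nonexistent/status"))  # -1
```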

Module contents

util

Contains utility modules for performing common operations, such as logging and metrics gathering, for use with the OPERA PGE Subsystem.