Skip to main content

Best Practices

Unit Testing

All code should reach minimum 80% coverage through Unit Tests.

Code Style

We use the Google Style Guide for style elements such as documentation, titling, and structure. We also recommend reviewing Clean Architecture

Stop on Failure

Failures within ORCA break through to the Cumulus workflow they are a part of. To this end, raising an error is preferred over catching the error and returning a null value or error message. The code examples below exemplify this idea by showing how to raise an error using python in different contexts.

try:
value = function(param)
except requests_db.DatabaseError as err:
logging.error(err)
raise
if not success:
logging.error(f"You may log additional information if desired. "
f"param: {param}")
raise DescriptiveErrorType(f'Error message to be raised info Cumulus workflow.')

Retries can then be configured in the workflow json if desired. See documentation and tutorials for more information. The following snippet from the copy_to_archive lambda demonstrates usage of retries for a lambda in an ingest workflow. MaxAttempts is set to 6, meaning that it will run the function a maximum of 7 times before transitioning to the WorkflowFailed state. IntervalSeconds determines how many seconds the workflow will sleep between retries. A BackOffRate of 2 means that the IntervalSeconds will be doubled on each failure beyond the first.

"CopyToArchive": {
...
"Type": "Task",
"Resource": "${copy_to_archive_task_arn}",
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.exception",
"Next": "WorkflowFailed"
}
],
"Next": "WorkflowSucceeded"
},

If the retries are exceeded and the error is caught, then the workflow will show that it jumped to the WorkflowFailed state.

Workflow Failed

If the 'WorkflowFailed' state was not triggered, then the workflow will move on to the step defined in Next.

Workflow Succeeded
note

In the event that an error may be transient, and failing would cause a large amount of redundant work for other objects, retrying a failing operation in code is acceptable with a strictly limited number of retries. You will likely want to log each individual error for analytics and debugging.