
ORCA API Reference

Overview

This page explains how to use the ORCA API, including the expected inputs, outputs, and paths. The API returns metadata about a granule or recovery job, as well as information on internal reconciliation reports. It accepts and responds with JSON payloads at various HTTPS endpoints. All ORCA APIs use the POST method, and all API endpoints use AWS IAM authorization.
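As a minimal sketch of the request shape described above, the following builds (but does not send) a JSON POST request using only the Python standard library. The `BASE_URL` value and the `build_orca_request` helper are hypothetical names chosen for illustration. Because the endpoints use AWS IAM authorization, a real request would also need to be signed with AWS Signature Version 4, which is not shown here.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the invoke URL of your own deployment.
BASE_URL = "https://example.execute-api.us-west-2.amazonaws.com/orca"


def build_orca_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an ORCA endpoint.

    The returned request is unsigned; IAM authorization requires adding
    AWS Signature Version 4 headers before sending.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_orca_request(
    "recovery/jobs",
    {"asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85"},
)
```

Keeping request construction separate from signing and sending makes it easier to test the payloads without network access.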

danger

If an Aurora Serverless database is used for data storage and has not been accessed for some time, it may take 30-40 seconds for the database to become available. Because AWS limits API invocations to 30 seconds, with no option to increase the limit, such invocations can fail with HTTP error code 504. Integrators should include appropriate handling and retry code.
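One possible way to handle the 504 case is a small retry loop with exponential backoff. The sketch below assumes a caller-supplied `send` function that performs a single API call and returns a `(status_code, body)` tuple. The function name, attempt count, and delays are illustrative choices, not values prescribed by ORCA.

```python
import time
from typing import Callable, Tuple


def post_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than 504.

    Returns the last (status_code, body) pair, which is still a 504
    if every attempt timed out.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 504:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: 5 s, 10 s, 20 s, ...
            sleep(base_delay_seconds * 2 ** attempt)
    return status, body


# Simulated usage: the first two calls time out, the third succeeds.
# The delays are recorded instead of slept so the example runs instantly.
responses = iter([(504, ""), (504, ""), (200, '{"anotherPage": false}')])
delays = []
status, body = post_with_retry(lambda: next(responses), sleep=delays.append)
```

Only 504 is retried here, on the assumption that other error codes reflect problems a retry would not fix.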

Catalog reporting API

The catalog/reconcile API call returns the current listing of the ORCA catalog, which can be used to reconcile granule and file information against a master catalog. For example, the Cumulus primary archive holdings can be compared against the ORCA holdings to find discrepancies. Catalog reporting API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/catalog/reconcile

Catalog reporting API input

An example of the API input body is shown below:

{
"pageIndex": 0,
"providerId": ["lpdaac"],
"collectionId": ["MOD14A1___061"],
"granuleId": ["MOD14A1.061.A23V45.2020235"],
"startTimestamp": 628021800000,
"endTimestamp": 628021900000
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
pageIndex | int | The 0-based index of the results page to return. | Yes
endTimestamp | int | Cumulus granule createdAt end time for the date range to compare data, in milliseconds since 1 January 1970 UTC. | Yes
providerId | Array[str] | The unique ID of the provider making the request. | No
collectionId | Array[str] | The unique ID of the collection to compare. | No
granuleId | Array[str] | The unique ID of the granule to compare. | No
startTimestamp | int | Cumulus granule createdAt start time for the date range to compare data, in milliseconds since 1 January 1970 UTC. | No
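The input fields above could be assembled as follows. This is a sketch under stated assumptions: `to_epoch_millis` and `build_catalog_request` are hypothetical helper names, and the optional filters are simply left out of the body when they are not supplied.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since 1 January 1970 UTC."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_catalog_request(
    page_index: int,
    end: datetime,
    start: Optional[datetime] = None,
    provider_ids: Optional[Sequence[str]] = None,
    collection_ids: Optional[Sequence[str]] = None,
    granule_ids: Optional[Sequence[str]] = None,
) -> dict:
    """Build the JSON body for catalog/reconcile from the documented fields."""
    if page_index < 0:
        raise ValueError("pageIndex is 0-based and cannot be negative")
    body = {"pageIndex": page_index, "endTimestamp": to_epoch_millis(end)}
    if start is not None:
        body["startTimestamp"] = to_epoch_millis(start)
    optional_filters = (
        ("providerId", provider_ids),
        ("collectionId", collection_ids),
        ("granuleId", granule_ids),
    )
    for key, values in optional_filters:
        if values:
            body[key] = list(values)
    return body


example_body = build_catalog_request(
    page_index=0,
    end=datetime(2020, 8, 27, tzinfo=timezone.utc),
    collection_ids=["MOD14A1___061"],
)
```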

Catalog reporting API output

An example of the API output is shown below:

{
"anotherPage": false,
"granules": [
{
"providerId": "lpdaac",
"collectionId": "MOD14A1___061",
"id": "MOD14A1.061.A23V45.2020235",
"createdAt": 628021850000,
"executionId": "u654-123-Yx679",
"ingestDate": 628021950000,
"lastUpdate": 628021970000,
"files": [
{
"name": "MOD14A1.061.A23V45.2020235.2020240145621.hdf",
"cumulusArchiveLocation": "cumulus-bucket",
"orcaArchiveLocation": "orca-archive",
"keyPath": "MOD14A1/061/032/MOD14A1.061.A23V45.2020235.2020240145621.hdf",
"sizeBytes": 100934568723,
"hash": "ACFH325128030192834127347",
"hashType": "SHA-256",
"storageClass": "GLACIER",
"version": "VXCDEG902"
}
]
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
anotherPage | Boolean | Indicates if more results can be retrieved on another page.
granules | Array[Object] | A list of objects representing individual granules.
providerId | str | The unique ID of the provider making the request.
collectionId | str | The unique ID of the collection to compare.
id | str | The unique ID of the granule.
createdAt | int | The time, in milliseconds since 1 January 1970 UTC, that the data was originally ingested into Cumulus.
executionId | str | Step function execution ID from AWS.
ingestDate | int | The time, in milliseconds since 1 January 1970 UTC, that the data was originally ingested into ORCA.
lastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, that information was updated.
files | Array[Object] | Description and status of the files within the given granule.
name | str | The name and extension of the file.
cumulusArchiveLocation | str | Cumulus bucket the file resides in.
orcaArchiveLocation | str | Archive bucket the file resides in.
keyPath | str | S3 path to the file including the file name and extension, but not the bucket.
sizeBytes | int | Size in bytes of the file. From Cumulus ingest.
hash | str | Checksum hash of the file provided by Cumulus.
hashType | str | Hash type used to calculate the hash value of the file.
storageClass | str | The class of storage containing the file.
version | str | AWS-provided version of the file.

The API returns status code 200 on success, 400 if pageIndex or endTimestamp is missing, and 500 if an error occurs when querying the database.
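Because results are paged, a client would typically increment pageIndex until anotherPage is false. The sketch below illustrates that loop with a caller-supplied `post` function that sends one request body and returns the parsed JSON response. Both `post` and `iter_catalog_granules` are hypothetical names, and the fake responder exists only so the example runs without network access.

```python
from typing import Callable, Iterator


def iter_catalog_granules(
    post: Callable[[dict], dict],
    end_timestamp: int,
    **filters,
) -> Iterator[dict]:
    """Yield every granule from catalog/reconcile, one page at a time."""
    page_index = 0
    while True:
        page = post(
            {"pageIndex": page_index, "endTimestamp": end_timestamp, **filters}
        )
        yield from page.get("granules", [])
        if not page.get("anotherPage", False):
            return
        page_index += 1


# Stand-in for the real API: two pages of results keyed by pageIndex.
fake_pages = {
    0: {"anotherPage": True, "granules": [{"id": "granule-a"}, {"id": "granule-b"}]},
    1: {"anotherPage": False, "granules": [{"id": "granule-c"}]},
}
requested_bodies = []


def fake_post(body: dict) -> dict:
    requested_bodies.append(body)
    return fake_pages[body["pageIndex"]]


granule_ids = [
    granule["id"]
    for granule in iter_catalog_granules(
        fake_post, 628021900000, collectionId=["MOD14A1___061"]
    )
]
```

Using a generator keeps only one page in memory at a time, which may matter for large catalogs.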

Recovery granules API

The recovery/granules API call returns the detailed recovery status of a single granule within an ORCA recovery job.

Recovery granules API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/recovery/granules

Recovery granules API input

An example of the API input body is shown below:

{
"granuleId": "MOD14A1.061.H5V12.2020312.141531789",
"asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85"
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
granuleId | str | The unique ID of the granule to retrieve status for. | Yes
asyncOperationId | str | The unique ID of the asyncOperation. May apply to a request that covers multiple granules. | No

Recovery granules API output

An example of the API output is shown below:

{
"granuleId": "MOD14A1.061.H5V12.2020312.141531789",
"asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85",
"files": [
{
"fileName": "f1.doc",
"status": "pending"
},
{
"fileName": "f2.pdf",
"status": "error",
"errorMessage": "Access Denied"
},
{
"fileName": "f3.txt",
"status": "success"
}
],
"restoreDestination": "bucket_name",
"requestTime": 628021800000,
"completionTime": 628021900000
}

The following table lists the fields in the output:

Name | Data Type | Description
granuleId | str | The unique ID of the granule retrieved.
asyncOperationId | str | The unique ID of the asyncOperation.
files | Array[Object] | Description and status of the files within the given granule.
fileName | str | The name and extension of the file.
status | str | The status of the restoration of the file. May be 'pending', 'staged', 'success', or 'error'.
errorMessage | str | If the restoration of the file resulted in an error, the error is stored here.
restoreDestination | str | The name of the archive bucket the granule is being copied to.
requestTime | int | The time, in milliseconds since 1 January 1970 UTC, when the request to restore the granule was initiated.
completionTime | int | The time, in milliseconds since 1 January 1970 UTC, when all files in the granule were in an end state.

The API returns status code 200 on success, 400 if the input is incorrectly formatted, 404 if no matching record is found, and 500 if an error occurs when querying the database.
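A response like the one above could be summarized per status as sketched below. The helper names `group_files_by_status` and `failed_files` are hypothetical, and the code reads the error text from the errorMessage field named in the output table.

```python
from typing import Dict, List, Tuple


def group_files_by_status(granule_status: dict) -> Dict[str, List[str]]:
    """Map each restoration status to the file names currently in it."""
    grouped: Dict[str, List[str]] = {}
    for file_info in granule_status.get("files", []):
        grouped.setdefault(file_info["status"], []).append(file_info["fileName"])
    return grouped


def failed_files(granule_status: dict) -> List[Tuple[str, str]]:
    """Return (fileName, errorMessage) pairs for files in the 'error' status."""
    return [
        (file_info["fileName"], file_info.get("errorMessage", ""))
        for file_info in granule_status.get("files", [])
        if file_info["status"] == "error"
    ]


# Sample data modeled on the example output above.
response = {
    "granuleId": "MOD14A1.061.H5V12.2020312.141531789",
    "files": [
        {"fileName": "f1.doc", "status": "pending"},
        {"fileName": "f2.pdf", "status": "error", "errorMessage": "Access Denied"},
        {"fileName": "f3.txt", "status": "success"},
    ],
}
by_status = group_files_by_status(response)
errors = failed_files(response)
```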

Recovery jobs API

The recovery/jobs API call returns detailed status for a particular recovery job. Recovery jobs API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/recovery/jobs

Recovery jobs API input

An example of the API input body is shown below:

{
"asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85"
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
asyncOperationId | str | The unique ID of the asyncOperation of the recovery job. | Yes

Recovery jobs API output

An example of the API output is shown below:

{
"asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85",
"jobStatusTotals": {
"pending": 1,
"success": 1,
"error": 0,
"staged": 0
},
"granules": [
{
"granuleId": "6c8d0c8b-4f9a-4d87-ab7c-480b185a0250",
"status": "success"
},
{
"granuleId": "b5681dc1-48ba-4dc3-877d-1b5ad97e8276",
"status": "pending"
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
asyncOperationId | str | The unique ID of the asyncOperation.
jobStatusTotals | Object | Sum of how many granules are in each particular restoration status ('pending', 'staged', 'success', or 'error').
granules | Array[Object] | An array representing each granule being copied as part of the job.
granuleId | str | The unique ID of the granule retrieved.
status | str | The status of the restoration of the granule. May be 'pending', 'staged', 'success', or 'error'.

The API returns status code 200 on success, 400 if the input is incorrectly formatted, 404 if no matching record is found, and 500 if an error occurs when querying the database.
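A client polling a recovery job might use jobStatusTotals to decide when to stop. The sketch below treats 'pending' and 'staged' as in-progress statuses and 'success' and 'error' as end states. That split is an assumption made for illustration, and the helper names are hypothetical.

```python
from typing import List

# Assumption: these two statuses mean work is still outstanding.
IN_PROGRESS_STATUSES = ("pending", "staged")


def job_is_finished(job: dict) -> bool:
    """Return True when no granule is counted in an in-progress status."""
    totals = job["jobStatusTotals"]
    return all(totals.get(status, 0) == 0 for status in IN_PROGRESS_STATUSES)


def granules_with_status(job: dict, status: str) -> List[str]:
    """Return the IDs of granules whose restoration status matches."""
    return [
        granule["granuleId"]
        for granule in job.get("granules", [])
        if granule["status"] == status
    ]


# Sample data modeled on the example output above.
job = {
    "asyncOperationId": "43c9751b-9498-4733-90d8-56b1458e0f85",
    "jobStatusTotals": {"pending": 1, "success": 1, "error": 0, "staged": 0},
    "granules": [
        {"granuleId": "6c8d0c8b-4f9a-4d87-ab7c-480b185a0250", "status": "success"},
        {"granuleId": "b5681dc1-48ba-4dc3-877d-1b5ad97e8276", "status": "pending"},
    ],
}
finished = job_is_finished(job)
still_pending = granules_with_status(job, "pending")
```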

Internal Reconcile report jobs API

The orca/datamanagement/reconciliation/internal/jobs API call takes a page index and returns the available internal reconciliation jobs from the ORCA database. Internal reconcile report jobs API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/datamanagement/reconciliation/internal/jobs

Internal Reconcile report jobs API input

An example of the API input body is shown below:

{
"pageIndex": 0
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
pageIndex | int | The 0-based index of the results page to return. | Yes

Internal Reconcile report jobs API output

An example of the API output is shown below:

{
"anotherPage": false,
"jobs": [
{
"id": 826,
"orcaArchiveLocation": "PREFIX-orca-primary",
"status": "success",
"inventoryCreationTime": 1652227200000,
"lastUpdate": 1652299312334,
"errorMessage": null,
"reportTotals": {
"orphan": 0,
"phantom": 1,
"catalogMismatch": 1
}
},
{
"id": 793,
"orcaArchiveLocation": "doctest-orca-primary",
"status": "error",
"inventoryCreationTime": 1652140800000,
"lastUpdate": 1652198623479,
"errorMessage": "Error while posting mismatches to database.",
"reportTotals": {
"orphan": 2,
"phantom": 1,
"catalogMismatch": 0
}
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
anotherPage | Boolean | Indicates if more results can be retrieved on another page.
jobs | Array[Object] | The jobs on the page.
id | int | The unique ID of the reconciliation job.
orcaArchiveLocation | str | Archive bucket the reconciliation targets.
status | str | Current status of the job. May be 'getting S3 list', 'staged', 'generating reports', 'error', or 'success'.
inventoryCreationTime | int | The time, in milliseconds since 1 January 1970 UTC, of inventory report initiation from the S3 manifest.
lastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, when the status was last updated.
errorMessage | str or null | Critical error the job ran into that prevented it from finishing.
reportTotals | Object | The number of error reports of each type.
orphan | int | Number of files that have records in the S3 archive bucket but are missing from the ORCA catalog.
phantom | int | Number of files that have records in the ORCA catalog but are missing from the S3 bucket.
catalogMismatch | int | Number of files that are missing from the ORCA S3 bucket or have different metadata values than expected.

The API returns status code 200 on success, 400 if pageIndex is missing, and 500 if an error occurs.
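To decide which jobs deserve a closer look, a client could filter a page of jobs on status and reportTotals. The following is a sketch with a hypothetical helper name and made-up job entries. It flags any job that ended in 'error' or reported at least one orphan, phantom, or catalog mismatch.

```python
from typing import List


def jobs_needing_review(page: dict) -> List[int]:
    """Return IDs of jobs that failed or reported any discrepancy."""
    flagged = []
    for job in page.get("jobs", []):
        has_discrepancies = any(
            count > 0 for count in job.get("reportTotals", {}).values()
        )
        if job["status"] == "error" or has_discrepancies:
            flagged.append(job["id"])
    return flagged


# Hypothetical page of results with one clean job and two flagged jobs.
page = {
    "anotherPage": False,
    "jobs": [
        {
            "id": 826,
            "status": "success",
            "reportTotals": {"orphan": 0, "phantom": 1, "catalogMismatch": 1},
        },
        {
            "id": 801,
            "status": "success",
            "reportTotals": {"orphan": 0, "phantom": 0, "catalogMismatch": 0},
        },
        {
            "id": 793,
            "status": "error",
            "reportTotals": {"orphan": 0, "phantom": 0, "catalogMismatch": 0},
        },
    ],
}
flagged_job_ids = jobs_needing_review(page)
```

The flagged IDs can then be passed as jobId to the orphan, phantom, and mismatch report endpoints.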

Internal Reconcile report orphan API

The orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/orphans API call takes a job ID and page index and returns a report, from the internal reconciliation job, of the files that have records in the S3 archive bucket but are missing from the ORCA catalog. Note that {jobid} is optional. Internal reconcile report orphan API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/orphans

Internal Reconcile report orphan API input

An example of the API input body is shown below:

{
"jobId": 123,
"pageIndex": 0
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
jobId | int | The unique job ID of the reconciliation job. | Yes
pageIndex | int | The 0-based index of the results page to return. | Yes

Internal Reconcile report orphan API output

An example of the API output is shown below:

{
"jobId": 123,
"anotherPage": false,
"orphans": [
{
"keyPath": "MOD09GQ/006/MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
"s3Etag": "d41d8cd98f00b204e9800998ecf8427",
"s3FileLastUpdate": 1654878716000,
"s3SizeInBytes": 6543277389,
"s3StorageClass": "GLACIER"
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
jobId | int | The unique ID of the reconciliation job.
anotherPage | Boolean | Indicates if more results can be retrieved on another page.
orphans | Array[Object] | An array representing each orphan, if available.
keyPath | str | Key path and filename of the object in the S3 bucket.
s3Etag | str | ETag of the object in the S3 bucket.
s3FileLastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, of the last update of the object in the S3 bucket.
s3SizeInBytes | int | Size in bytes of the object in the S3 bucket.
s3StorageClass | str | AWS storage class of the object in the S3 bucket.

The API returns status code 200 on success, 400 if jobId or pageIndex is missing, and 500 if an error occurs.
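As one illustration of using the orphan report, the sketch below totals the size of the orphaned objects per storage class for a single page of results. The helper name and the second sample entry are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def orphan_bytes_by_storage_class(page: dict) -> Dict[str, int]:
    """Sum s3SizeInBytes of the orphans on one page, grouped by s3StorageClass."""
    totals: Dict[str, int] = defaultdict(int)
    for orphan in page.get("orphans", []):
        totals[orphan["s3StorageClass"]] += orphan["s3SizeInBytes"]
    return dict(totals)


# Sample page: the first entry follows the example output above,
# the second is made up to show grouping.
page = {
    "jobId": 123,
    "anotherPage": False,
    "orphans": [
        {
            "keyPath": "MOD09GQ/006/MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
            "s3SizeInBytes": 6543277389,
            "s3StorageClass": "GLACIER",
        },
        {
            "keyPath": "hypothetical/key/path.hdf",
            "s3SizeInBytes": 1000,
            "s3StorageClass": "GLACIER",
        },
    ],
}
orphan_totals = orphan_bytes_by_storage_class(page)
```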

Internal Reconcile report phantom API

The orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/phantoms API call takes a job ID and page index and returns a report of the files that have records in the ORCA catalog but are missing from the S3 bucket. Note that {jobid} is optional. Internal reconcile report phantom API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/phantoms

Internal Reconcile report phantom API input

An example of the API input body is shown below:

{
"jobId": 123,
"pageIndex": 0
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
jobId | int | The unique job ID of the reconciliation job. | Yes
pageIndex | int | The 0-based index of the results page to return. | Yes

Internal Reconcile report phantom API output

An example of the API output is shown below:

{
"jobId": 123,
"anotherPage": false,
"phantoms": [
{
"collectionId": "MOD09GQ___061",
"granuleId": "MOD09GQ.A2017025.h21v00.006.2017034065109",
"filename": "MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
"keyPath": "MOD09GQ/006/MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
"orcaEtag": "d41d8cd98f00b204e9800998ecf8427",
"orcaGranuleLastUpdate": 1654878715868,
"orcaSizeInBytes": 6543277389,
"orcaStorageClass": "GLACIER"
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
jobId | int | The unique ID of the reconciliation job.
anotherPage | Boolean | Indicates if more results can be retrieved on another page.
phantoms | Array[Object] | An array representing each phantom, if available.
collectionId | str | Cumulus collection ID value from the ORCA catalog.
granuleId | str | Cumulus granule ID value from the ORCA catalog.
filename | str | Filename of the object from the ORCA catalog.
keyPath | str | Key path and filename of the object in the ORCA catalog.
orcaEtag | str | ETag of the object as reported in the ORCA catalog.
orcaGranuleLastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, of the last update of the object as reported in the ORCA catalog.
orcaSizeInBytes | int | Size in bytes of the object as reported in the ORCA catalog.
orcaStorageClass | str | AWS storage class of the object as reported in the ORCA catalog.

The API returns status code 200 on success, 400 if jobId or pageIndex is missing, and 500 if an error occurs.

Internal Reconcile report mismatch API

The orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/mismatches API call takes a job ID and page index and returns a report of the files that are missing from the ORCA S3 bucket or have different metadata values than expected. Note that {jobid} is optional. Internal reconcile report mismatch API input invoke URL example: https://example.execute-api.us-west-2.amazonaws.com/orca/datamanagement/reconciliation/internal/jobs/job/{jobid}/mismatches

Internal Reconcile report mismatch API input

An example of the API input body is shown below:

{
"jobId": 123,
"pageIndex": 0
}

The following table lists the fields in the input:

Name | Data Type | Description | Required
jobId | int | The unique job ID of the reconciliation job. | Yes
pageIndex | int | The 0-based index of the results page to return. | Yes

Internal Reconcile report mismatch API output

An example of the API output is shown below:

{
"jobId": 123,
"anotherPage": false,
"mismatches": [
{
"collectionId": "MOD09GQ___061",
"granuleId": "MOD09GQ.A2017025.h21v00.006.2017034065109",
"filename": "MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
"keyPath": "MOD09GQ/006/MOD09GQ.A2017025.h21v00.006.2017034065109.hdf",
"cumulusArchiveLocation": "cumulus-public",
"orcaEtag": "d41d8cd98f00b204e9800998ecf8427",
"s3Etag": "1f78ve1d3f41vbhg4nbb4kjhong4x14",
"orcaGranuleLastUpdate": 1654878715868,
"s3FileLastUpdate": 1654878716000,
"orcaSizeInBytes": 6543277389,
"s3SizeInBytes": 1987618731,
"orcaStorageClass": "GLACIER",
"s3StorageClass": "GLACIER",
"discrepancyType": "etag, size_in_bytes",
"comment": null
}
]
}

The following table lists the fields in the output:

Name | Data Type | Description
jobId | int | The unique ID of the reconciliation job.
anotherPage | Boolean | Indicates if more results can be retrieved on another page.
mismatches | Array[Object] | An array representing each mismatch, if available.
collectionId | str | Cumulus collection ID value from the ORCA catalog.
granuleId | str | Cumulus granule ID value from the ORCA catalog.
filename | str | Filename of the object from the ORCA catalog.
keyPath | str | Key path and filename of the object in the ORCA catalog.
cumulusArchiveLocation | str | Expected Cumulus S3 bucket the object is located in. From the ORCA catalog.
orcaEtag | str | ETag of the object as reported in the ORCA catalog.
s3Etag | str | ETag of the object as reported in the S3 bucket.
orcaGranuleLastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, of the last update of the object as reported in the ORCA catalog.
s3FileLastUpdate | int | The time, in milliseconds since 1 January 1970 UTC, that information was updated in the S3 bucket.
orcaSizeInBytes | int | Size in bytes of the object as reported in the ORCA catalog.
s3SizeInBytes | int | Size in bytes of the object as reported in the S3 bucket.
orcaStorageClass | str | AWS storage class of the object as reported in the ORCA catalog.
s3StorageClass | str | AWS storage class of the object in the S3 bucket.
discrepancyType | str | Type of discrepancy found during reconciliation.
comment | str or null | Any additional context for the mismatch.

The API returns status code 200 on success, 400 if jobId or pageIndex is missing, and 500 if an error occurs.
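A mismatch entry can be unpacked into the individual discrepancies it reports. The sketch below assumes that discrepancyType is a comma-separated list, as the example value "etag, size_in_bytes" suggests. The helper names and the `COMPARED_FIELDS` pairing are hypothetical.

```python
from typing import Dict, List, Tuple

# Pairs of (label, ORCA catalog field, S3 field) taken from the output table.
COMPARED_FIELDS = (
    ("etag", "orcaEtag", "s3Etag"),
    ("size", "orcaSizeInBytes", "s3SizeInBytes"),
    ("storageClass", "orcaStorageClass", "s3StorageClass"),
)


def discrepancy_types(mismatch: dict) -> List[str]:
    """Split discrepancyType into a list, assuming a comma-separated string."""
    raw = mismatch.get("discrepancyType") or ""
    return [part.strip() for part in raw.split(",") if part.strip()]


def differing_values(mismatch: dict) -> Dict[str, Tuple[object, object]]:
    """Return {label: (orca_value, s3_value)} for each pair that differs."""
    differences = {}
    for label, orca_key, s3_key in COMPARED_FIELDS:
        orca_value = mismatch.get(orca_key)
        s3_value = mismatch.get(s3_key)
        if orca_value != s3_value:
            differences[label] = (orca_value, s3_value)
    return differences


# Sample data taken from the example output above.
mismatch = {
    "orcaEtag": "d41d8cd98f00b204e9800998ecf8427",
    "s3Etag": "1f78ve1d3f41vbhg4nbb4kjhong4x14",
    "orcaSizeInBytes": 6543277389,
    "s3SizeInBytes": 1987618731,
    "orcaStorageClass": "GLACIER",
    "s3StorageClass": "GLACIER",
    "discrepancyType": "etag, size_in_bytes",
}
types_found = discrepancy_types(mismatch)
differences = differing_values(mismatch)
```

The last-update timestamps are left out of the comparison because the ORCA and S3 values in the example differ slightly even though they are not listed in discrepancyType.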