Upgrading to Cumulus 1.6.0
Cumulus 1.6 adds substantial new functionality for configuring how granules are stored, and configuration updates are required to support it. The Cumulus example deployment has been updated to run and test all of these changes; use it as a reference when updating your workflow configurations. Specific examples are linked below.
- Using metadata to determine granule storage location in S3.
  - Instead of being limited to parameters defined in config.yml or hardcoded values, the S3 location can now be determined dynamically from the CMR metadata.
  - Paths can be configured for individual files or for the overall collection.
  - Parts of metadata dates can be extracted for the storage location, e.g. to organize folders in the S3 bucket by granule month.
  - Metadata values can be substringed to configure the storage location.
- Configuring multiple buckets of the same type, e.g. multiple protected buckets, with different files configured to go in different buckets.
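As a rough illustration of the metadata-driven paths described above, the sketch below resolves a storage path from a template and a metadata dictionary. The template syntax, the operation names (extract_year, extract_month, substring), and the metadata keys are assumptions made for this example, not the Cumulus API; refer to the linked examples for the real configuration.

```python
import re
from datetime import datetime

# Hypothetical operations a template may apply to a metadata value.
OPERATIONS = {
    "extract_year": lambda value: f"{datetime.fromisoformat(value).year:04d}",
    "extract_month": lambda value: f"{datetime.fromisoformat(value).month:02d}",
}


def lookup(metadata, dotted_key):
    """Walk a nested dict using a dotted key such as 'Temporal.BeginningDateTime'."""
    value = metadata
    for part in dotted_key.split("."):
        value = value[part]
    return value


def resolve_field(expression, metadata):
    """Resolve a single template expression against the metadata."""
    expression = expression.strip()
    # substring(key, start, end): slice a metadata value.
    match = re.fullmatch(r"substring\(([^,]+),\s*(\d+),\s*(\d+)\)", expression)
    if match:
        key, start, end = match.groups()
        return str(lookup(metadata, key.strip()))[int(start):int(end)]
    # operation(key): apply a named operation to a metadata value.
    match = re.fullmatch(r"(\w+)\(([^)]+)\)", expression)
    if match:
        name, key = match.groups()
        return OPERATIONS[name](lookup(metadata, key.strip()))
    # Plain key: insert the metadata value unchanged.
    return str(lookup(metadata, expression))


def resolve_path(template, metadata):
    """Replace every {expression} in the template with its resolved value."""
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: resolve_field(m.group(1), metadata),
        template,
    )


# Illustrative metadata; a real CMR record has a different, richer structure.
metadata = {
    "ShortName": "MOD09GQ",
    "GranuleUR": "MOD09GQ.A2016358.h13v04.006.2016360104606",
    "Temporal": {"BeginningDateTime": "2016-12-23T13:45:00"},
}
template = (
    "{ShortName}/{extract_year(Temporal.BeginningDateTime)}/"
    "{extract_month(Temporal.BeginningDateTime)}/{substring(GranuleUR, 0, 7)}"
)
print(resolve_path(template, metadata))  # MOD09GQ/2016/12/MOD09GQ
```

The same template could be attached to a whole collection or to a single file type, which is how the per-file and per-collection options above differ.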
Upgrade kes version to 2.2.2
An IAM template is now included in the deployment package. To use this template when deploying IAM roles, add
--template node_modules/@cumulus/deployment/iam to your deployment command (the same way app deployment is done; see the deployment documentation). Make sure to redeploy your IAM roles.
Configure system_bucket in both iam/config.yml and app/config.yml. This value is used by kes instead of buckets.internal, and the system_bucket in the workflows will be populated with it. (Example)
The structure of buckets has changed. Each bucket is no longer a simple key/value pair; it now has a key, a bucket name, and a level. Collection configurations can reference the key, so if the bucket name changes, only the bucket configuration needs to be updated, not the collection definitions. (Example)
The new bucket structure should be put in iam/config.yml and app/config.yml. (IAM config.yml example and app config.yml example)
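To show why referencing buckets by key is useful, here is a minimal sketch of the lookup. The field names ("name", "level"), the bucket names, and the file rules are hypothetical and chosen only for this example; the linked config.yml examples show the actual structure.

```python
# Hypothetical bucket configuration: each entry has a key, a bucket name,
# and a level. The field names "name" and "level" are assumptions.
buckets = {
    "internal": {"name": "my-stack-internal", "level": "internal"},
    "protected": {"name": "my-stack-protected", "level": "protected"},
    "protected-2": {"name": "my-stack-protected-2", "level": "protected"},
}

# Hypothetical collection file rules that reference buckets by key, not by name.
file_rules = [
    {"suffix": ".hdf", "bucket": "protected"},
    {"suffix": ".jpg", "bucket": "protected-2"},
]


def bucket_name_for(filename, buckets, file_rules):
    """Return the bucket name for a file by following the rule's bucket key."""
    for rule in file_rules:
        if filename.endswith(rule["suffix"]):
            return buckets[rule["bucket"]]["name"]
    raise LookupError(f"no bucket rule matches {filename!r}")


print(bucket_name_for("granule.hdf", buckets, file_rules))

# Renaming a bucket only touches the bucket configuration;
# the collection's file rules stay unchanged.
buckets["protected-2"]["name"] = "my-stack-browse"
print(bucket_name_for("granule.jpg", buckets, file_rules))
```

Two keys with the same level, as in this sketch, correspond to the "multiple buckets of the same type" case.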
After you deploy 1.6.0, your distribution API Gateway URL will change, so you will have to update your redirect URI in URS. Note the URLs output by your deployment.
Workflow Configuration Changes
To support using metadata to configure storage locations, the Sync Granule task now writes files to a staging location, and the MoveGranules task moves them to their final location once the metadata is available. Add MoveGranules to your workflow before Post to CMR, and update Lambdas.yml and Workflows.yml accordingly. (Example)
Post to CMR task:
- process required
- input_granules no longer needed
- granules required. The schema for the granules object has been updated to include the expected format
- Output (no updates required)
  - process is now output
Sync Granules task:
- stack parameter added, used for the file staging directory when none is specified
- fileStagingDir parameter added, which can be used to specify a custom staging directory
- Either stack or fileStagingDir should be part of the config
- Output (no updates required)
  - fileStagingDir output
  - url_path output
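The staging directory rule for Sync Granules described above (use fileStagingDir when it is given, otherwise fall back to stack) can be sketched as follows. The function name and the default path layout "file-staging/<stack>" are assumptions for illustration; the actual task may build the default differently.

```python
def resolve_staging_dir(config):
    """Pick the staging directory for synced granule files from a task config."""
    # A custom staging directory takes precedence when it is specified.
    if config.get("fileStagingDir"):
        return config["fileStagingDir"]
    # Otherwise derive a directory from the stack name.
    # Assumed layout; the real default path may differ.
    if config.get("stack"):
        return f"file-staging/{config['stack']}"
    raise ValueError("config must include either 'stack' or 'fileStagingDir'")


print(resolve_staging_dir({"fileStagingDir": "custom-staging"}))
print(resolve_staging_dir({"stack": "my-stack"}))
```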
Cumulus-process-py has been updated to accommodate granule files first being stored in a staging directory. You will need the latest version of cumulus-process-py.