How to Deploy Cumulus

Overview

This is a guide for deploying a new instance of Cumulus.

The deployment documentation is current for the following component versions:

The process involves:

  • Creating AWS S3 Buckets.
  • Using Kes to transform kes templates (cloudformation.template.yml) into AWS CloudFormation stack templates (cloudformation.yml) that are then deployed to AWS.
  • Deploying the CloudFormation stacks (deployer and iams) that create the necessary IAM roles, before the Cumulus software itself is deployed.
  • Configuring and deploying the Cumulus software via the app stack.

Requirements

Linux/MacOS software requirements:

Credentials:

  • CMR username and password. Can be excluded if you are not exporting metadata to CMR.

  • EarthData Client login username and password. User must have the ability to administer and/or create applications in URS. It's recommended to obtain an account in the test environment (UAT).

Needed Git Repositories:

Installation

Make local copy of Cumulus Repo and prepare it.

Clone repository

$ git clone https://github.com/cumulus-nasa/cumulus.git

Change directory to the repository root

$ cd cumulus

Optionally, if you are deploying a particular version (tag), ref, or branch of Cumulus core, check out that reference

$ git checkout <ref/branch/tag>

Install and configure the local build environment and dependencies using npm

$ nvm use
$ npm install
$ npm run ybootstrap

Build the Cumulus application

$ npm run build

Prepare DAAC deployment repository

If you are already working with an existing <daac>-deploy repository that is configured appropriately for the version of Cumulus you intend to deploy or update, skip to Prepare AWS configuration.

Clone template-deploy repo and name appropriately for your DAAC or organization

$ git clone https://github.com/cumulus-nasa/template-deploy <daac>-deploy

Enter repository root directory

$ cd <daac>-deploy

Packages are installed with npm. A list of Cumulus packages with descriptions and version information can be found here.

If you're trying to work with a certain version of a cumulus package or task, the version can be specified in package.json under dependencies. We use semantic versioning (major/minor/patch). You can also configure for automatic updates. Use ^ to update minor/patch versions automatically and ~ to automatically update patch versions. For example:

"@cumulus/sync-granule": "^1.0.0"

Then run:

$ npm install

To add a new package, install via npm. Without a version specified, it will automatically install the latest version. For example:

$ npm install @cumulus/deployment

Note: The npm install command adds the kes utility to the <daac>-deploy's node_modules directory; it is used later for most of the AWS deployment commands.

To use the specific version of the package installed during deployment, point the source key in the lambda config to node_modules/@cumulus/<package-name>/dist. This location may vary between packages, so consult the README in each. For example:

SyncGranule:
  source: 'node_modules/@cumulus/sync-granule/dist/'

The Cumulus project contains default configuration values in cumulus/packages/deployment/app.example; however, these need to be customized for your Cumulus app.

Copy the sample template into your repository

Begin by copying the template directory to your project. You will modify it for your DAAC's specific needs later.

$ cp -r ../cumulus/packages/deployment/app.example ./app

Optional: Create a new repository <daac>-deploy so that you can track your DAAC's configuration changes:

$ git remote set-url origin https://github.com/cumulus-nasa/<daac>-deploy
$ git push origin master

You can then add/commit changes as needed.

Prepare AWS configuration

Set Access Keys:

You need to make some AWS information available to your environment. If you don't already have the access key and secret access key of an AWS user with IAM Create-User permissions, create access keys for such a user, then export them:

$ export AWS_ACCESS_KEY_ID=<AWS access key>
$ export AWS_SECRET_ACCESS_KEY=<AWS secret key>
$ export AWS_REGION=<region>

If you don't want to set environment variables, access keys can be stored locally via the AWS CLI.
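Before running any deployment command, it can help to confirm that the three variables are actually set in the current shell. The following is a minimal sketch; the function name check_aws_env is hypothetical and is not part of Cumulus or the AWS CLI. If you prefer to store keys locally instead, the AWS CLI command aws configure prompts for the same values.

```shell
# check_aws_env: hypothetical helper, not part of Cumulus or the AWS CLI.
# Prints the name of each required variable that is unset or empty and
# returns non-zero if any are missing.
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical use would be check_aws_env && echo "AWS environment looks complete" before calling kes.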

Create S3 Buckets:

See creating S3 buckets for more information on how to create a bucket.

The following S3 bucket should be created (replace <prefix> with whatever you'd like, generally your organization's or DAAC's name):

  • <prefix>-internal

You can create additional S3 buckets based on the needs of your workflows.

These buckets do not need any non-default permissions to function with Cumulus; however, your local security requirements may vary.

Note: S3 bucket names are global and must be unique across all accounts/locations/etc.
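As a rough sketch of the bucket step from the command line: the helper below builds the <prefix>-internal name and checks it against the general S3 naming rules before anything is created. The function name internal_bucket_name and the prefix mydaac are hypothetical; aws s3 mb is the AWS CLI command for creating a bucket.

```shell
# Hypothetical helper: builds "<prefix>-internal" and checks it against the
# general S3 bucket naming rules (at most 63 characters; lowercase letters,
# digits, dots and hyphens; must start and end with a letter or digit).
internal_bucket_name() {
  bucket="$1-internal"
  if [ "${#bucket}" -gt 63 ]; then
    echo "bucket name too long: $bucket" >&2
    return 1
  fi
  if ! printf '%s\n' "$bucket" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]*[a-z0-9]$'; then
    echo "invalid bucket name: $bucket" >&2
    return 1
  fi
  printf '%s\n' "$bucket"
}

# Creating the bucket needs the AWS CLI and valid credentials, so the call is
# left commented out here:
# aws s3 mb "s3://$(internal_bucket_name mydaac)" --region "$AWS_REGION"
```

Because bucket names are global, the create call can still fail if another account already owns the name, even when the name itself is valid.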


Configure and Deploy the IAM stack

Configure deployment with <daac>-deploy/iam/config.yml

The iam configuration creates 4 roles and an instance profile used internally by the Cumulus stack.

The various config fields are described below with a sample config.yml at the end.


iam-deployment-name:

The name (e.g. dev) of the 'deployment'. This key tells kes which configuration set (in addition to the default values) to use when creating the CloudFormation template (see footnote 4).

prefix:

This value will prefix CloudFormation-created IAM resources and permissions.

The Cumulus stack name must start with <prefix> (see footnote 5).

stackName:

The name of this iam stack in CloudFormation (e.g. <prefix>-iam).

buckets:

The buckets created in the Create S3 Buckets step.


Sample new deployment added to config.yml:

<iam-deployment-name>:          # e.g. dev (Note: Omit brackets, i.e. NOT <dev>)
  prefix: <stack-prefix>  # prefixes CloudFormation-created iam resources and permissions
  stackName: <stack-name> # name of this iam stack in CloudFormation (e.g. <prefix>-iams)
  buckets:
    internal: <prefix>-internal  # Note: these are the bucket names, not the prefix from above

Deploy iam stack (see footnote 1)

$ kes cf deploy --kes-folder iam --deployment <iam-deployment-name> --template node_modules/@cumulus/deployment/iam --region <region>

Note: If this deployment fails, check the deployment details in the AWS CloudFormation Console for more information. Permissions may need to be updated by your AWS administrator.

If the iam deployment command succeeds, you should see 4 new roles in the IAM Console:

  • <stack-name>-ecs
  • <stack-name>-lambda-api-gateway
  • <stack-name>-lambda-processing
  • <stack-name>-steprole

The same information can be obtained from the AWS CLI command: aws iam list-roles.

The iam deployment also creates an instance profile named <stack-name>-ecs that can be viewed from the AWS CLI command: aws iam list-instance-profiles.
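To check for the four roles without reading through the full list-roles output, something like the following could be used. It is only a sketch: expected_iam_roles is a hypothetical helper and mydaac-iams is a made-up stack name; aws iam get-role is a real AWS CLI command that exits non-zero when the named role does not exist.

```shell
# Hypothetical helper: prints the four role names that the iam deployment is
# described as creating for a given stack name.
expected_iam_roles() {
  for suffix in ecs lambda-api-gateway lambda-processing steprole; do
    printf '%s-%s\n' "$1" "$suffix"
  done
}

# Example check (needs the AWS CLI and credentials, so it is commented out):
# for role in $(expected_iam_roles mydaac-iams); do
#   aws iam get-role --role-name "$role" >/dev/null || echo "missing role: $role"
# done
```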

Update AWS Access Keys

Create or obtain Access Keys for the user who will assume the DeployerRole in IAM, then export the access keys, replacing the previous values in your environment:

$ export AWS_ACCESS_KEY_ID=<AWS access key>
$ export AWS_SECRET_ACCESS_KEY=<AWS secret key>
$ export AWS_REGION=<region>

If you don't want to set environment variables, access keys can be stored locally via the AWS CLI.

Make sure you've updated your actual environment variables before proceeding (e.g., if sourcing from a file, re-source the file).


Configure and Deploy the Cumulus stack

These updates configure the copied template from the cumulus repository for your DAAC.

You should either add a new root-level key for your configuration or modify the existing default configuration key to whatever you'd like your new deployment to be.

If you're re-deploying based on an existing configuration you can skip this configuration step unless values have been updated or you'd like to add a new deployment to your deployment configuration file.

Edit the <daac>-deploy/app/config.yml file

The various configuration sections are described below with a sample config.yml at the end:


cumulus-deployment-name:

The name (e.g. dev) of the 'deployment'. This key tells kes which configuration set (in addition to the default values) to use when creating the CloudFormation template (see footnote 4).

stackName:

The name of this stack in CloudFormation (e.g. <prefix>-cumulus). This stack name must start with the prefix listed in the IAM role configuration, or the deployment will fail.

stackNameNoDash:

A representation of the stack name that has dashes removed. This will be used for components that should be associated with the stack but do not allow dashes in the identifier.
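One way to derive stackNameNoDash from stackName is sketched below. The function name stack_name_no_dash is hypothetical. Capitalizing each segment mirrors the <Prefix>Cumulus form in the sample config.yml and is an assumption; the text above only requires that dashes be removed.

```shell
# Hypothetical helper: removes dashes from a stack name and capitalizes the
# first letter of each dash-separated segment.
stack_name_no_dash() {
  printf '%s\n' "$1" | awk -F- '{
    out = ""
    for (i = 1; i <= NF; i++) {
      out = out toupper(substr($i, 1, 1)) substr($i, 2)
    }
    print out
  }'
}

stack_name_no_dash mydaac-cumulus   # prints MydaacCumulus
```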

vpc

Configure your virtual private cloud. You can find the <vpc-id> and <subnet-id> values on the VPC Dashboard: vpcId under Your VPCs, and subnets under Subnets. When you choose a subnet, be sure to also note its availability zone, which is needed to configure ecs.

ecs

Configuration for the Amazon EC2 Container Service (ECS) instance. Update availabilityZone (or availabilityZones if using multiple AZs) with information from the VPC Dashboard. Note that instanceType and desiredInstances have been selected for a sample install; you will have to specify appropriate values to deploy and use ECS machines. See EC2 Instance Types for more information.

Also note that if you don't specify the amiid, it will try to use a default, which may or may not exist.

buckets

The config buckets should map to the same names you used when creating buckets in the Prepare AWS step.

iams

Add the ARNs for each of the four roles and one instanceProfile created in the Create IAM Roles step. You can retrieve the ARNs from:

$ aws iam list-roles | grep Arn
$ aws iam list-instance-profiles | grep Arn

For information on how to locate them in the Console see Locating Cumulus IAM Roles.
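The grep commands above list every ARN in the account. A narrower lookup is sketched below, using hypothetical names: role_arn is an illustrative helper that formats an ARN the way the sample config.yml shows it, and mydaac-iams-ecs is a made-up role name. The commented aws iam get-role and aws iam get-instance-profile calls are real AWS CLI commands that return a single ARN.

```shell
# Hypothetical helper: formats a role ARN in the shape used by the iams
# section of the sample config.yml.
role_arn() {
  # $1 = AWS account id, $2 = role name
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# Looking up the real values needs the AWS CLI and credentials, so these
# calls are left commented out:
# aws iam get-role --role-name mydaac-iams-ecs --query 'Role.Arn' --output text
# aws iam get-instance-profile --instance-profile-name mydaac-iams-ecs \
#   --query 'InstanceProfile.Arn' --output text
```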

users

List of EarthData users you wish to have access to your dashboard application. These users will be populated in your <stackname>-UsersTable DynamoDB table (in addition to the default_users defined in the Cumulus default template).


Sample config.yml:

<cumulus-deployment-name>:
  stackName: <prefix>-cumulus
  stackNameNoDash: <Prefix>Cumulus

  apiStage: dev

  vpc:
    vpcId: <vpc-id>
    subnets:
      - <subnet-id>

  ecs:
    instanceType: t2.micro
    desiredInstances: 0
    availabilityZone: <subnet-id-zone>
    amiid: <some-ami-id>

  buckets:
    internal: <prefix>-internal

  iams:
    ecsRoleArn: arn:aws:iam::<aws-account-id>:role/<iams-prefix>-ecs
    lambdaApiGatewayRoleArn: arn:aws:iam::<aws-account-id>:role/<iams-prefix>-lambda-api-gateway
    lambdaProcessingRoleArn: arn:aws:iam::<aws-account-id>:role/<iams-prefix>-lambda-processing
    stepRoleArn: arn:aws:iam::<aws-account-id>:role/<iams-prefix>-steprole
    instanceProfile: arn:aws:iam::<aws-account-id>:instance-profile/<iams-prefix>-ecs

  urs_url: https://uat.urs.earthdata.nasa.gov/ #make sure to include the trailing slash

  # if not specified the value of the apigateway backend endpoint is used
  # api_backend_url: https://apigateway-url-to-api-backend/ #make sure to include the trailing slash

  # if not specified the value of the apigateway dist url is used
  # api_distribution_url: https://apigateway-url-to-distribution-app/ #make sure to include the trailing slash

  # URS users who should have access to the dashboard application.
  users:
    - username: <user>
    - username: <user2>

Configure EarthData application

The Cumulus stack is expected to authenticate with Earthdata Login. You must create and register a new application. Use the User Acceptance Tools (UAT) site unless you changed urs_url above. Follow the directions on how to register an application. Use any URL for the Redirect URL; it will be deleted in a later step. Also note the password in step 3 and the client ID in step 4, and use these to replace clientid and clientpassword in the .env file in the next step.

Set up an environment file:

If you're adding a new deployment to an existing configuration repository or re-deploying an existing Cumulus configuration you should skip to Deploy the Cumulus Stack, as these values should already be configured.

Copy app/.env.sample to app/.env and add CMR/earthdata client credentials:

CMR_PASSWORD=cmrpassword
EARTHDATA_CLIENT_ID=clientid
EARTHDATA_CLIENT_PASSWORD=clientpassword

For security, it is highly recommended that you prevent app/.env from being accidentally committed to the repository by keeping it in the .gitignore file at the root of this repository.
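A small sketch of one way to make sure the entry is present without adding it twice. The function name ignore_env_file is hypothetical, and it assumes you run it from the root of the <daac>-deploy repository.

```shell
# Hypothetical helper: appends app/.env to .gitignore unless an identical
# line is already there. Run from the root of the <daac>-deploy repository.
ignore_env_file() {
  touch .gitignore
  # -x matches whole lines only, -F treats the pattern as a fixed string.
  grep -qxF 'app/.env' .gitignore || echo 'app/.env' >> .gitignore
}
```

After calling ignore_env_file, git check-ignore -v app/.env should print the .gitignore rule that matches the file.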

Deploy

Once the preceding configuration steps have completed, run the following to deploy Cumulus from your <daac>-deploy root directory:

$ ./node_modules/.bin/kes cf deploy --kes-folder app --region <region> \
  --template node_modules/@cumulus/deployment/app \
  --deployment <cumulus-deployment-name>

You can monitor the progress of the stack deployment from the AWS CloudFormation Console; this step takes a few minutes.

A successful completion will result in output similar to:

$ ./node_modules/.bin/kes cf deploy --kes-folder app --region <region> \
  --template node_modules/@cumulus/deployment/app --deployment daac
Generating keys. It might take a few seconds!
Keys Generated
keys uploaded to S3

  adding: sf-starter/ (stored 0%)
  adding: sf-starter/index.js (deflated 85%)


  adding: daac-ops-api/ (stored 0%)
  adding: daac-ops-api/index.js (deflated 85%)


  adding: sf-sns-broadcast/ (stored 0%)
  adding: sf-sns-broadcast/index.js (deflated 85%)


  adding: hello-world/ (stored 0%)
  adding: hello-world/index.js (deflated 85%)

Uploaded: s3://daac-internal/daac-cumulus/lambdas/<HASHNUMBERS>/hello-world.zip
Uploaded: s3://daac-internal/daac-cumulus/lambdas/<HASHNUMBERS>/sf-starter.zip
Uploaded: s3://daac-internal/daac-cumulus/lambdas/<HASHNUMBERS>/sf-sns-broadcast.zip
Uploaded: s3://daac-internal/daac-cumulus/lambdas/<HASHNUMBERS>/daac-ops-api.zip
Template saved to app/cloudformation.yml
Uploaded: s3://<prefix>-internal/<prefix>-cumulus/cloudformation.yml
Waiting for the CF operation to complete
CF operation is in state of CREATE_COMPLETE

Here are the important URLs for this deployment:

Distribution:  https://<kido2r7kji>.execute-api.us-east-1.amazonaws.com/dev/
Add this url to URS:  https://<kido2r7kji>.execute-api.us-east-1.amazonaws.com/dev/redirect

Api:  https://<czbbkscuy6>.execute-api.us-east-1.amazonaws.com/dev/
Add this url to URS:  https://<czbbkscuy6>.execute-api.us-east-1.amazonaws.com/dev/token

Uploading Workflow Input Templates
Uploaded: s3://<prefix>-internal/<prefix>-cumulus/workflows/HelloWorldWorkflow.json
Uploaded: s3://<prefix>-internal/<prefix>-cumulus/workflows/list.json

Note: Be sure to copy the urls, as you will use them to update your EarthData application.

Update Earthdata application

You will need to add two redirect URLs to your EarthData login application. Log in to URS (UAT), and under My Applications -> Application Administration, use the edit icon of your application. Then under Manage -> Redirect URIs, add the Backend API URL returned from the stack deployment, e.g. https://<czbbkscuy6>.execute-api.us-east-1.amazonaws.com/dev/token. Also add the Distribution URL, e.g. https://<kido2r7kji>.execute-api.us-east-1.amazonaws.com/dev/redirect (see footnote 3). You may also delete the placeholder URL you used to create the application.

If you've lost track of the needed redirect URIs, they can be located on the API Gateway. Once there, select <prefix>-backend and/or <prefix>-distribution, then Dashboard, and use the base URL at the top of the page that is accompanied by the text Invoke this API at:. Make sure to append /token to the backend URL and /redirect to the distribution URL.
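The suffix rule can be sketched as a small helper that takes the two base URLs and prints the two redirect URIs to register. The function name redirect_uris and the example URLs are hypothetical.

```shell
# Hypothetical helper: prints the two redirect URIs to register in URS.
# $1 = backend API base URL, $2 = distribution base URL.
# A single trailing slash on either argument is tolerated.
redirect_uris() {
  printf '%s/token\n' "${1%/}"
  printf '%s/redirect\n' "${2%/}"
}

redirect_uris "https://backend.example/dev/" "https://distribution.example/dev"
# prints:
#   https://backend.example/dev/token
#   https://distribution.example/dev/redirect
```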


Deploy Cumulus dashboard

Prepare AWS

Create S3 bucket for dashboard:

  • Create it, e.g. <prefix>-dashboard. Use the command line or console as you did when preparing AWS configuration.
  • Configure the bucket to host a website:
    • AWS S3 console: Select <prefix>-dashboard bucket then, "Properties" -> "Static Website Hosting", point to index.html
    • CLI: aws s3 website s3://<prefix>-dashboard --index-document index.html
  • The bucket's url will be http://<prefix>-dashboard.s3-website-<region>.amazonaws.com or you can find it on the AWS console via "Properties" -> "Static website hosting" -> "Endpoint"
  • Ensure the bucket's access permissions allow your deployment user to write to the bucket.

Install dashboard

To install the dashboard, clone the cumulus-dashboard repository into the root deploy directory and install dependencies with npm install:

$ git clone https://github.com/cumulus-nasa/cumulus-dashboard
$ cd cumulus-dashboard
$ npm install

Dashboard configuration

Configure dashboard:

Update config in cumulus-dashboard/app/scripts/config/config.js:

Replace the default apiRoot https://wjdkfyb6t6.execute-api.us-east-1.amazonaws.com/dev/ with your app's apiRoot (see footnote 2):

apiRoot: process.env.APIROOT || 'https://<czbbkscuy6>.execute-api.us-east-1.amazonaws.com/dev/'

Note: The following environment variables are available during the build: APIROOT, DAAC_NAME, STAGE, HIDE_PDR. Any of these can be set on the command line to override the values contained in config.js when running the build below.
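As an illustration of overriding APIROOT from the shell rather than editing config.js, the sketch below exports the variable before the build. The URL is a placeholder, and ensure_trailing_slash is a hypothetical helper. The default apiRoot in config.js ends with a slash, so the sketch normalizes to that form; whether the dashboard strictly requires the slash is an assumption.

```shell
# Hypothetical helper: returns the URL with exactly one trailing slash added
# if it does not already end with one.
ensure_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

# Placeholder value; substitute the Api URL printed by your own app deployment.
APIROOT=$(ensure_trailing_slash "https://backend.example/dev")
export APIROOT

# With APIROOT exported, the build is then run from the cumulus-dashboard
# directory (left commented out here):
# npm run build
```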

Build the dashboard from the dashboard repository root directory, cumulus-dashboard:

  $ npm run build

Dashboard deployment:

Deploy dashboard to s3 bucket from the cumulus-dashboard directory:

Using AWS CLI:

  $ aws s3 sync dist s3://<prefix>-dashboard --acl public-read

From the S3 Console:

  • Open the <prefix>-dashboard bucket, click 'upload'. Add the contents of the 'dist' subdirectory to the upload. Then select 'Next'. On the permissions window allow the public to view. Select 'Upload'.

You should be able to visit the dashboard website at http://<prefix>-dashboard.s3-website-<region>.amazonaws.com, or find the URL via <prefix>-dashboard -> "Properties" -> "Static website hosting" -> "Endpoint", and log in with a user that you configured for access in the Configure Cumulus Stack step.


Updating Cumulus deployment

Once deployed for the first time, any future updates to the role/stack configuration files/version of Cumulus can be deployed and will update the appropriate portions of the stack as needed.

Update roles

$ kes cf deploy --kes-folder iam --deployment <deployment-name> \
  --template node_modules/@cumulus/deployment/iam \
  --region <region> # e.g. us-east-1

Cumulus Versioning

Cumulus uses a global versioning approach, meaning version numbers are consistent across all packages and tasks, and semantic versioning to track major, minor, and patch versions (e.g. 1.0.0). We use Lerna to manage versioning. Any change will force Lerna to increment the version of all packages.

Publishing to NPM

$ lerna publish

To specify the level of change for the new version

$ lerna publish --cd-version (major | minor | patch | prerelease)

Update Cumulus

$ kes cf deploy --kes-folder app --region <region> \
  --template node_modules/@cumulus/deployment/app \
  --deployment <deployment-name>

Footnotes:

1. The iam actions require more permissions than a typical AWS user will have and should be run by an administrator.
2. The API root can be found in a number of ways. The easiest is to note it in the output of the app deployment step. You can also find it from the AWS console -> Amazon API Gateway -> APIs -> <prefix>-cumulus-backend -> Dashboard, by reading the URL at the top after "Invoke this API at".
3. To add another redirect URI to your application: on the EarthData home page, select "My Applications", scroll down to "Application Administration", and use the edit icon for your application. Then select Manage -> Redirect URIs.
4. This value is used by kes only to identify the configuration set to use and should not appear in any AWS object
5. For more on the AWS objects this impacts, you can look through iam/cloudformation.template.yml
