A DevOps constraint that many teams struggle with is providing self-service application environments while keeping them consistent. Much like Development and Operations themselves, the two goals pull against each other. Self-service means anyone authorized can provision and use an environment; consistent means environments must be the same (to ensure apps tested in lower environments will function in prod).
I often see the two principles manifest with an inverse relationship: either environments are highly self-service but have no consistency between them, or they are incredibly consistent but lack self-service efficiency or capability. Environments should be both highly self-serviceable and consistent. So how do we do this? Get rid of static staging environments and deploy every merge to prod. Here are some reasons why:
- Maintain parity with prod: Once a PR is complete, it goes out the door. There’s no room for discrepancies between many lower, static environments and prod.
- No waiting for envs: A new environment is created for you and your app is deployed as soon as you create the PR.
- Everyone owns their merges: Once you merge, it’s very clear your work is done. It’s also clear who did what and where!
- Prove deployment end to end: The whole thing from start to finish is deployed with your new changes, proving your changes work with what’s already out there.
Here’s an example.
I’m deploying a single-page app to a storage account in Azure using Terraform. Here’s a PR comment with some cool stuff:
The workflow for the PR created an environment based on my branch name and deployed the theoretical app within. The pipeline ran in just over a minute, and now I know exactly how my app is going to function when I merge the pull request and prod is deployed. Here’s what the Terraform looks like:
TF Workspace Archetype
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }

  backend "azurerm" {
    resource_group_name  = "rg-spa-init-eus"
    storage_account_name = "saspastate"
    container_name       = "tfstate"
    use_oidc             = true
    subscription_id      = "***"
    tenant_id            = "***"
  }
}

# Configure the Microsoft Azure Provider
provider "azurerm" {
  use_oidc = true
  features {}
}

module "infrastructure" {
  source      = "../infrastructure"
  service     = var.service
  environment = var.environment
  region      = var.region
  domain      = var.domain
}
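Notice the backend block doesn’t set a key: each environment supplies its own state file name at init time. Outside the workflow shown later, pointing this workspace at a particular environment would look roughly like this (the branch name here is made up for illustration):

# Initialize against a per-environment state file, then deploy that environment.
terraform init -backend-config='key=my-feature.tfstate' -reconfigure
terraform apply -var 'environment=my-feature'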
Note: You can find the code for this experiment here.
Instead of using static environments within GitHub Actions, this same Terraform workspace is called for each environment. The environment name is set by the GITHUB_HEAD_REF variable (the name of the branch to be merged). This value is used to point Terraform at the proper state file and to set the environment variable for the module. The module contains all of our resources, so each environment is identical:
resource "azurerm_resource_group" "main" {
name = "rg-${var.service}-${var.environment}-${var.region.suffix}"
location = var.region.name
tags = {
app = "${var.service}"
environment = "${var.environment}"
created-by = "terraform"
}
}
resource "azurerm_storage_account" "main" {
name = format("%s", lower(replace("sa${var.service}${var.environment}", "/[[:^alnum:]]/", "")))
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
account_tier = "Standard"
account_replication_type = "LRS"
network_rules {
default_action = "Allow"
}
static_website {
index_document = "index.html"
}
tags = {
app = "${var.service}"
environment = "${var.environment}"
created-by = "terraform"
}
}
resource "azurerm_storage_container" "main" {
name = "${var.service}${var.environment}"
storage_account_name = azurerm_storage_account.main.name
container_access_type = "container"
}
data "azurerm_resource_group" "init" {
name = "rg-${var.service}-init-${var.region.suffix}"
}
resource "azurerm_dns_cname_record" "main" {
name = var.environment
zone_name = "${var.service}.${var.domain}"
resource_group_name = data.azurerm_resource_group.init.name
ttl = 300
record = azurerm_storage_account.main.primary_web_host
tags = {
app = "${var.service}"
environment = "${var.environment}"
created-by = "terraform"
}
}
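Both the archetype workspace and the module declare matching inputs in variables.tf. The post doesn’t show that file, but based on the references above (var.service, var.environment, var.region.name, var.region.suffix, var.domain), a minimal sketch would look something like this; the default values are illustrative assumptions, not the post’s actual settings:

# variables.tf (sketch): the defaults below are assumptions for illustration.
variable "service" {
  type    = string
  default = "spa"
}

variable "environment" {
  type = string # supplied per run: the branch name for PRs, or "prod"
}

variable "region" {
  type = object({
    name   = string # e.g. "East US"
    suffix = string # e.g. "eus", used in resource names
  })
}

variable "domain" {
  type = string # apex domain that holds the service's DNS zone
}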
When the PR is merged, the same actions are run, except against the “prod” environment. This apply will commit the changes described in the “Prod Plan” you can see in the PR comment. So this plan:
Becomes this:
Consistent. Self-Service. I like it.
Why do I need this?
Using Terraform Modules in this way is most useful for environments containing many small, atomic pieces. For example, a single-page app that lives in a storage account, with an API running on App Service in Azure. Since there are many little pieces to manage here (app service plan, app service, storage account, container registry/repository, web app load balancer/apim, and dns), it’s nice to modularize them for reuse.
Where doesn’t this make sense?
Terraform goes out the window for this as soon as you’re using an orchestrator or native platform features for environments: microservices on Kubernetes or Nomad, for example. There you are likely to be using Helm charts, Argo CD, Nomad job specs, or Waypoint to create dynamic environments for merge requests and prod deploys. You’d likely still use Terraform or another infrastructure-as-code tool to bootstrap and manage the underlying infrastructure supporting the orchestrator, following GitOps principles. Effectively, you can build your own Internal Developer Platform (IDP) this way.
Platform tools like Vercel, Netlify, and Heroku have built-in functionality for merge/pull request deploys with your favorite Git platform. Things are really easy there, since there is no backend infrastructure for you to manage. So if you’re using those tools, you’re all set. I’ll assume you’re just here for fun.
How to do it.
Take inventory of all the pieces your application environment needs. We’ll use our single-page app from earlier as an example. To simplify, we’ll deploy a storage account and DNS records:
- azurerm_resource_group
- azurerm_storage_account
- azurerm_storage_container
- azurerm_dns_zone (shared across environments; see the sketch after this list)
- azurerm_dns_cname_record
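Of these, the DNS zone is the one piece that isn’t created per environment: the module only reads the shared “init” resource group through a data source and adds a CNAME record to the zone inside it. The post doesn’t show that shared workspace, but a minimal sketch, reusing the same variables, might look like:

# Shared "init" workspace (sketch): created once, referenced by every environment.
resource "azurerm_resource_group" "init" {
  name     = "rg-${var.service}-init-${var.region.suffix}"
  location = var.region.name
}

resource "azurerm_dns_zone" "main" {
  name                = "${var.service}.${var.domain}"
  resource_group_name = azurerm_resource_group.init.name
}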
To deploy what we want, we use conditionals to run certain Terraform tasks when the pipeline is triggered for a pull request, and others when there is a push to main (a PR merge). When these Terraform steps run, they point at the archetypal environment workspace. The workspace archetype includes default variable values in variables.tf that apply to all environments; the rest we fill in during runs of the GitHub Actions workflow, when calling the archetype. Speaking of, here’s mine:
name: "Single-page App Deployment with Dynamic Environments"
on:
push:
branches:
- main
pull_request:
jobs:
tf-ci:
name: "Terraform PR Validation"
runs-on: ubuntu-latest
environment: azure
env:
ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
TF_LOG: INFO
permissions:
pull-requests: write
id-token: write
contents: read
defaults:
run:
working-directory: ./tf/envdna
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Setup Terraform
uses: hashicorp/setup-terraform@v1
with:
terraform_version: 1.3.7
- name: Terraform Format
id: fmt
run: terraform fmt -check
- name: Terraform Init
id: init
run: terraform init -backend-config='key=prod.tfstate'
- name: Terraform Validate
id: validate
run: terraform validate -no-color
- name: Terraform Plan
id: plan
if: github.event_name == 'pull_request'
run: terraform plan -no-color -input=false -var 'environment=prod'
continue-on-error: true
- name: Terraform Init
id: initpr
if: github.event_name == 'pull_request'
run: terraform init -backend-config='key=${{ github.head_ref }}.tfstate' -reconfigure
- name: Terraform Validate
id: validatepr
if: github.event_name == 'pull_request'
run: terraform validate -no-color
- name: Terraform Review Apply
id: apply
if: github.event_name == 'pull_request'
run: terraform apply -auto-approve -no-color -input=false -var 'environment=${{ github.head_ref }}'
continue-on-error: true
- name: Terraform Output
id: tfout
if: github.event_name == 'pull_request'
run: terraform output -raw preview | sed 's/.$//'
- name: Update Pull Request
uses: actions/github-script@v6
if: github.event_name == 'pull_request'
env:
PLAN: ${{ steps.plan.outputs.stdout }}
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const output = `#### Terraform Format and Style 🖌\`${{ steps.fmt.outcome }}\`
#### Terraform Initialization ⚙️\`${{ steps.init.outcome }}\`
#### Terraform Validation 🤖\`${{ steps.validate.outcome }}\`
#### Terraform Prod Plan 📖\`${{ steps.plan.outcome }}\`
#### Preview URL 🌎 [https://${{ steps.tfout.outputs.stdout }}](https://${{ steps.tfout.outputs.stdout }})
<details><summary>Show Prod Plan</summary>\n
\`\`\`\n
${process.env.PLAN}
\`\`\`
</details>
Pushed by: @${{ github.actor }}, Action: ${{ github.event_name }}`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: output
})
- name: Terraform Plan Status
if: steps.plan.outcome == 'failure'
run: exit 1
- name: Terraform Apply
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
run: terraform apply -auto-approve -input=false -var 'environment=prod'
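The Preview URL in the PR comment comes from a Terraform output named preview, read by the Terraform Output step above. The post doesn’t show that output, but since the sed command trims a trailing character, it’s most likely the CNAME record’s FQDN (Azure DNS returns FQDNs with a trailing dot), something like:

# Sketch of the "preview" output the workflow reads; the workflow's sed strips
# the trailing dot from the FQDN before building the https:// link.
output "preview" {
  value = azurerm_dns_cname_record.main.fqdn
}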
The flow.
Put it all together and you have a nice workflow:
- Open a PR on your repo with your latest app changes. You get the plan output for prod, and Terraform creates an environment with your proposed changes.
- Hit the preview URL and drive around to validate your new changes. Send this preview link to others if you like!
- Merge the PR and prod deploys with your changes.
Some other things.
Of course, to set this up, there are a couple of other things you’d need to do:
- Create a service principal in your Azure AD directory and assign it the contributor/owner role on your Azure subscription. You can set up federated credentials for passwordless auth using the guide here.
- Run a Terraform apply locally for the “init” environment, which creates any shared resources you may have (like the DNS zone, in our example). After running the initial apply, migrate your local state file into the Azure storage account this creates by running terraform init against the new backend (a rough sketch of these steps follows this list).
- Add any steps for building, publishing, and deploying your app code to the GitHub Actions workflow. It’s likely you’ll need some other Terraform resources as well, depending on your setup.
- Here is the source code for this experiment.
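Those bootstrap steps aren’t spelled out in the post; a rough sketch of what they might look like from a local shell, assuming a separate directory for the shared “init” workspace and the backend settings shown earlier, is:

# Hypothetical bootstrap of the shared "init" environment (paths and values are assumptions).
cd tf/init
terraform init                            # starts with local state
terraform apply -var 'environment=init'   # creates the state storage account, DNS zone, etc.
# Add the azurerm backend block pointing at the new storage account, then migrate:
terraform init -migrate-state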
If you need help with DevOps and software delivery, reach out.