Upgrading Your Secrets Management

With recent security breaches like Heroku’s GitHub keys being compromised or CircleCI’s environment variable leaks, we’re all asking ourselves how to better safeguard infrastructure secrets.

At incident.io, these events made us want a better story for our secrets. But while we use Google Cloud Platform (GCP) for most of our application services (and intend to move there fully, eventually), our app runs in Heroku with their container runtime, so it wasn’t immediately obvious what ‘upgrading’ would mean for us.

We’ve found a solution using Google Secret Manager that’s working well for us, with a small bit of application code to handle loading and managing of secrets. Here, we share a case study of our approach for anyone who’s considering a similar setup.

The Plan

Before this project, our application would load secrets from environment variables. As the app ran in Heroku, this meant we’d manage secrets via Heroku’s configuration tools, either in the dashboard UI or via heroku config:edit.

Application code that loaded these secrets looked like this:

    var ZoomOAuthCredentials = struct {
        ClientID     string
        ClientSecret string
    }{
        ClientID:     os.Getenv("ZOOM_OAUTH_CLIENT_ID"),
        ClientSecret: os.Getenv("ZOOM_OAUTH_CLIENT_SECRET"),
    }

Super simple, but this meant:

  • If Heroku was compromised, so were our secrets.
  • Any access to the process environment would capture secrets, even if accidental and by trusted libraries like exception trackers.
  • Container vulnerabilities that allow reading of environment variables could extract secrets.

While you have to trust your infrastructure provider (in this case Heroku), we wanted to put secrets more than arm’s reach away from any of these situations. That meant moving secrets to a more secure location than Heroku config, and changing how we loaded and used the secrets in our app.
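To make the second risk in the list above concrete, here is a minimal Go sketch of how a generic diagnostics helper can sweep up secrets just by snapshotting the process environment. The names parseEnviron and captureEnvironment are hypothetical and stand in for whatever an exception tracker or similar library might do internally; the secret value is illustrative only.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseEnviron turns KEY=value entries, as returned by os.Environ, into a map.
func parseEnviron(entries []string) map[string]string {
	snapshot := make(map[string]string, len(entries))
	for _, entry := range entries {
		// Split on the first "=" only, since values may contain "=" themselves.
		if key, value, found := strings.Cut(entry, "="); found {
			snapshot[key] = value
		}
	}
	return snapshot
}

// captureEnvironment is a hypothetical stand-in for what a diagnostics or
// error-reporting library might do when attaching context to a report.
func captureEnvironment() map[string]string {
	return parseEnviron(os.Environ())
}

func main() {
	// Illustrative value only, set here so the example is self-contained.
	os.Setenv("ZOOM_OAUTH_CLIENT_SECRET", "super-duper-secret")

	// The helper never asked for the secret, yet it ends up in the snapshot.
	snapshot := captureEnvironment()
	fmt.Println(snapshot["ZOOM_OAUTH_CLIENT_SECRET"] == "super-duper-secret")
}
```

Nothing here is malicious: the helper simply has the same view of the environment as the rest of the process, which is why keeping secrets out of that environment narrows the exposure.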

Storing Secrets

We try to run as little infrastructure as possible at incident.io, and while I’ve built and run HA Vault clusters before, operating one ourselves wouldn’t be an ideal outcome for us.

Thankfully, GCP has a fully managed service called Google Secret Manager (GSM) which is a great fit for our requirements. For those unfamiliar: You’d use GSM to store secret values against a plaintext key (e.g. zoom-oauth-client-secret) and control access using GCP identity and access management (IAM).

Here’s a minimal example using the gcloud command line interface (CLI) to create, then read a secret:

    $ gcloud secrets create example-key
    Created secret [example-key].

    $ gcloud secrets versions add example-key --data-file=- \
        <<< "super-duper-secret"
    Created version [1] of the secret [example-key].

    $ gcloud secrets versions access --secret example-key 1
    super-duper-secret

If we can lift all our secrets (other than the Google service account key) out of Heroku config, we’ll benefit from Secret Manager’s much-improved security features and bring ourselves a step closer to a GCP-only future.

Loading Secrets

So, assuming secrets are now all in Secret Manager, how do we get our app to use them?

To be as simple and explicit as possible, we started by creating a config/config.go package in which we modeled all application config.

    package config

    // Secret represents a sensitive config value and overrides String()
    // and MarshalJSON() to prevent accidental logging.
    type Secret string

    // The loaded config that should be used by the app.
    var CONFIG Config

    // Config contains all application configuration parameters.
    type Config struct {
        // …
        ZOOM_OAUTH_CLIENT_ID     string `config:"required"`
        ZOOM_OAUTH_CLIENT_SECRET Secret `config:"required"`
        // …
    }
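The snippet above only declares the Secret type, so here is one way the String() and MarshalJSON() overrides it mentions could look. This is a sketch rather than our exact implementation: the "[REDACTED]" placeholder and the extra GoString() method are assumptions added for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Secret represents a sensitive config value.
type Secret string

// redacted is the placeholder shown instead of the real value. The exact
// text is an assumption made for this sketch.
const redacted = "[REDACTED]"

// String implements fmt.Stringer, so %s and %v never print the real value.
func (s Secret) String() string { return redacted }

// GoString implements fmt.GoStringer, so %#v is covered as well.
func (s Secret) GoString() string { return redacted }

// MarshalJSON implements json.Marshaler, so encoded structs carry the
// placeholder rather than the secret.
func (s Secret) MarshalJSON() ([]byte, error) { return json.Marshal(redacted) }

func main() {
	secret := Secret("super-duper-secret")

	// Accidental logging and formatting only ever see the placeholder.
	fmt.Println(secret)
	fmt.Printf("%v %#v\n", secret, secret)

	// The same applies when a struct holding the secret is serialized.
	encoded, err := json.Marshal(struct{ ClientSecret Secret }{secret})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(encoded))

	// An explicit conversion is the only way to reach the real value.
	fmt.Println(string(secret) == "super-duper-secret")
}
```

The design choice worth noting is that reading the real value requires a deliberate string(...) conversion, which is easy to spot in code review.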

Consolidating app config was useful (aside from how it benefited this project), as it meant we could better document each key and have a single place to look for config changes. It’s something I’d advise for most big applications once you have a stable code structure in place.

But the key motivation was to allow writing a Load(configFile string) function that could load a configuration file (e.g. config/environments/staging.yml) that looks like this:

    ---
    ZOOM_OAUTH_CLIENT_ID: some-client-id
    ZOOM_OAUTH_CLIENT_SECRET: secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/1

Each application environment would have its own config file, and Load() would build it into a config.Config structure by:

  1. Parsing the YAML config file into a Config struct
  2. For each Config field that is a Secret and has a value with the secret-manager: prefix, requesting the value from Google Secret Manager and loading it into the struct field
  3. Ensuring any fields with a config:"required" struct tag have been loaded to non-empty values

This becomes the first thing we do when booting the app, ensuring we initialize a config that is usable:

    // Inside main.go and any other entrypoint.
    if err = config.Init(ctx, configFile); err != nil {
        panic(err)
    }
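For anyone curious what a loader like this does internally, here is a rough sketch of steps 2 and 3 from the list earlier in this section. It makes several assumptions: the YAML parsing of step 1 is replaced by a plain map, every config field is string-like, and the SecretFetcher type is a hypothetical seam standing in for the real call to Google Secret Manager so the example runs without credentials.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type Secret string

type Config struct {
	ZOOM_OAUTH_CLIENT_ID     string `config:"required"`
	ZOOM_OAUTH_CLIENT_SECRET Secret `config:"required"`
}

const secretPrefix = "secret-manager:"

// SecretFetcher is a hypothetical seam over the secret store: given a
// reference such as projects/<id>/secrets/<name>/versions/<n>, it returns
// the secret value.
type SecretFetcher func(reference string) (string, error)

// load builds a Config from raw key/value pairs, which stand in for the
// already-parsed YAML file.
func load(raw map[string]string, fetch SecretFetcher) (Config, error) {
	var cfg Config
	secretType := reflect.TypeOf(Secret(""))
	value := reflect.ValueOf(&cfg).Elem()

	for i := 0; i < value.NumField(); i++ {
		field := value.Type().Field(i)
		fieldValue := raw[field.Name]

		// Step 2: resolve Secret fields that point at the secret store.
		if field.Type == secretType && strings.HasPrefix(fieldValue, secretPrefix) {
			resolved, err := fetch(strings.TrimPrefix(fieldValue, secretPrefix))
			if err != nil {
				return Config{}, fmt.Errorf("loading %s: %w", field.Name, err)
			}
			fieldValue = resolved
		}

		// Step 3: required fields must end up with a non-empty value.
		if field.Tag.Get("config") == "required" && fieldValue == "" {
			return Config{}, fmt.Errorf("missing required config field %s", field.Name)
		}

		// SetString works for any string-kinded field, including Secret.
		value.Field(i).SetString(fieldValue)
	}
	return cfg, nil
}

func main() {
	raw := map[string]string{
		"ZOOM_OAUTH_CLIENT_ID":     "some-client-id",
		"ZOOM_OAUTH_CLIENT_SECRET": "secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/1",
	}

	// A fake fetcher; a real one would call Google Secret Manager.
	fakeFetch := func(reference string) (string, error) {
		return "value-of(" + reference + ")", nil
	}

	cfg, err := load(raw, fakeFetch)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.ZOOM_OAUTH_CLIENT_ID)
	fmt.Println(len(cfg.ZOOM_OAUTH_CLIENT_SECRET) > 0)
}
```

Passing the fetcher in as a function also makes the loader easy to test, since no network access is needed to exercise the required-field checks.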

Once loaded, you’d use the config in application code, like so:

    var ZoomOAuthCredentials = struct {
        ClientID     string
        ClientSecret string
    }{
        ClientID:     config.CONFIG.ZOOM_OAUTH_CLIENT_ID,
        ClientSecret: string(config.CONFIG.ZOOM_OAUTH_CLIENT_SECRET),
    }

Not only is this easy to follow, but it comes with developer experience benefits over plain os.Getenv calls, such as type-safe config access and safer marshaling of secret values than we had before.

Managing Secret Values

With secrets in Secret Manager and our config.Load helper available to load them, the app is ready to consume secrets from GCP.

But that assumes the secrets are already there: In this setup, how do developers set the secret values?

Before we cover that process, it’s worth highlighting that our usage of Secret Manager in application code is extremely basic. Despite GSM supporting secret versions, our application config doesn’t need any concept of versioning because we pin each secret that we use to a specific version via the config file.

Note the /versions/1 suffix in the reference below:

    # Example secret reference, including version:
    ZOOM_OAUTH_CLIENT_SECRET: secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/1

Besides being simpler—which is attractive in its own right—this means:

  • Each commit SHA of our app bundles together code and config versions.
  • We’re very loosely coupled to GSM as a store: We could move to Vault or any alternative without worrying about how each tool differs, as we make zero assumptions about versioning.
  • We encourage immutability of secrets by banning version deletion, which allows config changes to go via standard CI/CD processes and prevents the config changing underneath the app between code versions.
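Because every reference pins a version, it is straightforward to validate references mechanically. The sketch below parses a reference and rejects anything that isn't pinned to a numeric version, including Secret Manager's floating latest alias. The SecretReference type and parseReference function are hypothetical names for illustration, and rejecting latest is an assumption about how strictly you'd want to enforce pinning.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const secretPrefix = "secret-manager:"

// SecretReference is a hypothetical parsed form of a config reference.
type SecretReference struct {
	Project string
	Name    string
	Version int
}

// parseReference accepts values of the form
// secret-manager:projects/<project>/secrets/<name>/versions/<n>
// and insists that <n> is a positive number.
func parseReference(value string) (SecretReference, error) {
	if !strings.HasPrefix(value, secretPrefix) {
		return SecretReference{}, fmt.Errorf("%q is missing the %q prefix", value, secretPrefix)
	}

	parts := strings.Split(strings.TrimPrefix(value, secretPrefix), "/")
	if len(parts) != 6 || parts[0] != "projects" || parts[2] != "secrets" ||
		parts[4] != "versions" || parts[1] == "" || parts[3] == "" {
		return SecretReference{}, fmt.Errorf("%q is not a well-formed secret reference", value)
	}

	// Floating aliases such as "latest" fail here because they are not numeric.
	version, err := strconv.Atoi(parts[5])
	if err != nil || version < 1 {
		return SecretReference{}, fmt.Errorf("%q must pin a numeric version, got %q", value, parts[5])
	}

	return SecretReference{Project: parts[1], Name: parts[3], Version: version}, nil
}

func main() {
	for _, value := range []string{
		"secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/1",
		"secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/latest",
	} {
		ref, err := parseReference(value)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: secret %s pinned to version %d\n", ref.Name, ref.Version)
	}
}
```

A check like this keeps the guarantee that a given commit always resolves to the same secret values.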

With this in mind, our ideal flow to update a secret is:

  1. Create a new version of the secret in GSM with the new value.
  2. Take a reference for that secret version and update config/environments/<env-name>.yml with it.
  3. Open a pull request (PR) with that change and merge to master for deployment.

To help developers with this flow, we created a small config CLI that can update the secret value following these rules and print the appropriate reference for updating the config.

It looks like this:

    $ go run cmd/config/main.go create-secret --help
    usage: config create-secret --project=PROJECT --field-name=FIELD-NAME

    Create a new version of a secret in Google Secret Manager. Will
    create secret if it does not already exist.

    Flags:
      --project=PROJECT          Target Google project to create secret in
      --field-name=FIELD-NAME    Name of secret, should be the existing field name from the config struct.

    $ go run cmd/config/main.go create-secret --field-name ZOOM_OAUTH_CLIENT_SECRET
    opening vim…
    2023/02/26 12:46:04 Secret app-zoom-oauth-client-secret does not exist, creating…
    2023/02/26 12:46:04 Creating new version for secret 'projects/940789456123/secrets/app-zoom-oauth-client-secret'…
    2023/02/26 12:46:06 Succesfully created secret version
    2023/02/26 12:46:06 Add the following entry to your environments config file (e.g. config/environments/<env>.yml):
    ZOOM_OAUTH_CLIENT_SECRET: secret-manager:projects/940789456123/secrets/app-zoom-oauth-client-secret/versions/1

This makes it easy to update the secret, and we use the same CLI in our continuous integration (CI) pipeline to check the reference is valid in Google Secret Manager, helping catch any typos along the way.
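The naming convention visible in that output (field ZOOM_OAUTH_CLIENT_SECRET becomes secret app-zoom-oauth-client-secret) is simple to reproduce. Here is a small sketch of how a CLI might derive the secret name and print the config entry; the function names are hypothetical, and the rule itself (an app- prefix, lowercase, underscores to hyphens) is inferred from the example output rather than taken from the real tool.

```go
package main

import (
	"fmt"
	"strings"
)

// secretNameForField derives a secret name from a config field name.
// The convention is inferred from the CLI output shown above.
func secretNameForField(fieldName string) string {
	return "app-" + strings.ReplaceAll(strings.ToLower(fieldName), "_", "-")
}

// configEntry renders the line a developer would paste into the
// environment config file for a given project, field and version.
func configEntry(project, fieldName string, version int) string {
	return fmt.Sprintf("%s: secret-manager:projects/%s/secrets/%s/versions/%d",
		fieldName, project, secretNameForField(fieldName), version)
}

func main() {
	fmt.Println(configEntry("940789456123", "ZOOM_OAUTH_CLIENT_SECRET", 1))
}
```

Deriving the name from the struct field means developers never have to invent or remember secret names, which removes one more source of typos.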

It’s worth saying this flow might be painful if your CI pipeline is slow. We regularly invest in our pipelines to keep them under five minutes, which makes this more than acceptable, but we can always opt out of non-secret CI checks for PRs with only config/environment changes if the pipeline ever becomes unwieldy.

A Good Result

We’re really happy with this setup. For minimal effort invested in the config CLI and associated CI checks, we’ve got a setup that feels as smooth as our previous use of Heroku, but a lot safer. It also brings us closer to an immutable infrastructure model where the build SHA describes code and config together, which we’re big fans of.

There’s obviously more to our setup than what we’ve covered. As an example, we use GCP federated workload identity and OIDC tokens for Heroku GCP credentials, protecting ourselves even if someone lifts them from the Heroku environment. We’ve also built GCP Security Perimeters around the Secret Manager API that can detect unauthorized access and alert us immediately, further strengthening our setup against common intrusions.

But what we’ve shared here is an approach that can upgrade basic secret management even for those running infrastructure across multiple environments. If you’re looking for a similar upgrade, you can give this a try knowing additional layers are easy to add on top of these foundations.

Lawrence Jones

Lawrence spent his early career at GoCardless, watching the company grow from start-up to unicorn. Motivated by the impact you can have when solving hard technical challenges, he joined SRE to help migrate GoCardless to the cloud, eventually leading that team to build a PaaS that helped scale with their engineering headcount. Having joined incident.io as employee #1, his focus is now on establishing foundations to help the start-up build a world-class engineering team.
