Versioned S3 Artifacts

This document will guide you through a pipeline modeled on a fairly common real-world use case of pushing tested, built artifacts into S3 buckets.

The end-to-end scenario is to monitor a Git repository for commits, and when new commits are detected, run its unit tests.

If the unit tests pass, the pipeline will then create a new release candidate artifact with automated versioning, which will then be placed in an S3 bucket.

From there, the pipeline will run integration tests against the release candidate, and if those pass, it will create a final artifact and "ship it" by putting it in a different S3 bucket.

The resulting pipeline will look like this:

[Figure: diagram of the resulting pipeline]

First, we'll define our resources. These are the objects used in our pipeline. The resources configuration simply enumerates each of their locations.

resources:

Our first resource will be the location of our product's source code. Let's pretend it lives in a Git repo, and so we'll use the git resource type.

We'll configure the git resource with two source parameters: uri and branch. Since we're using an SSH URI, we'll also need to specify private_key.

To avoid embedding credentials in the pipeline config, we'll use a parameter.

- name: my-product
  type: git
  source:
    uri: git@github.com:my-user/my-product.git
    branch: master
    private_key: ((my-product-github-private-key))

We'll need a resource to represent the semantic version of our product, which we'll use to generate release candidates, and bump every time we ship. For this we'll use the semver resource type.

Currently, semver resources keep track of the version as a file in an S3 bucket, so we'll need to specify the credentials for the bucket, and a name for the file.

If your product already has a version number, you can specify it as initial_version. If not specified, the version will start as 0.0.0.

- name: version
  type: semver
  source:
    bucket: my-product-pipeline-artifacts
    key: current-version
    access_key_id: ((s3-access-key-id))
    secret_access_key: ((s3-secret-access-key))
    initial_version: 1.0.0

Let's define the resource for storing our product's release candidate artifacts generated by the pipeline. This is done with the s3 resource type.

The s3 resource type is minimally configured with a bucket name and a regexp, which will be used to match files in the bucket and order them by the version number extracted by the first capture group.

Since we'll be writing objects into this bucket, we'll need to configure it with AWS credentials.

- name: my-product-rc
  type: s3
  source:
    bucket: my-product-pipeline-artifacts
    regexp: my-product-(.*)\.tgz
    access_key_id: ((s3-access-key-id))
    secret_access_key: ((s3-secret-access-key))

We'll need one more s3 resource to represent shipped artifacts.

- name: my-product-final
  type: s3
  source:
    bucket: my-product
    regexp: my-product-(.*)\.tgz
    access_key_id: ((s3-access-key-id))
    secret_access_key: ((s3-secret-access-key))
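The ((...)) parameters used so far need values when the pipeline is set. One way to supply them, sketched here under assumptions, is a separate YAML vars file passed to fly set-pipeline with --load-vars-from. The file name credentials.yml and every value below are placeholders, not real credentials.

```yaml
# credentials.yml (hypothetical file name; keep it out of version control)
# Every value here is a placeholder to be replaced with your own.
my-product-github-private-key: |
  -----BEGIN OPENSSH PRIVATE KEY-----
  REPLACE-WITH-YOUR-DEPLOY-KEY
  -----END OPENSSH PRIVATE KEY-----
s3-access-key-id: REPLACE-WITH-ACCESS-KEY-ID
s3-secret-access-key: REPLACE-WITH-SECRET-ACCESS-KEY
```

Each top-level key matches the name inside a ((...)) placeholder in the resource definitions above.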

Now that we've got all our resources defined, let's move on to define the functions to apply to them, as represented by jobs.

jobs:

Our first job will run the unit tests for our project. This job will fetch the source code, using the get step with the my-product resource, and run the task configuration file that lives in the repo at ci/unit.yml, using a task step.

We set trigger: true on the get step so that it automatically triggers a new unit build whenever new commits are pushed to the my-product repository.

- name: unit
  plan:
  - get: my-product
    trigger: true
  - task: unit
    file: my-product/ci/unit.yml

Our pipeline now does something! But we're not quite delivering artifacts yet.

Let's consider anything making it past the unit tests to be a candidate for a new version to ship. We'll call the job that builds candidate artifacts build-rc.

Because this job makes modifications to our product version, we'll want to make sure it doesn't run concurrently with anything else doing the same thing. Otherwise we may generate versions or release candidates out of order.

This is done by specifying serial_groups, which is a list of arbitrary tags. We'll make sure to list the same tags in the other jobs which modify the version.

- name: build-rc
  serial_groups: [version]
  plan:

First, let's be sure to only grab versions of my-product that have passed unit tests. Let's have new occurrences of these versions also trigger new builds, while we're at it.

  - get: my-product
    passed: [unit]
    trigger: true

We'll also need a new release candidate version number. For this, the semver resource type can be used to generate versions by specifying params in the get step.

Specifying pre: rc makes it so that if the current version is e.g. 1.2.3-rc.3, we'll get 1.2.3-rc.4.

  - get: version
    params: {pre: rc}

Now, we'll execute our build-artifact task configuration, which we'll assume has two inputs (my-product and version) and produces a file named my-product-VERSION.tgz in an output called built-artifact when executed.

  - task: build-artifact
    file: my-product/ci/build-artifact.yml
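For illustration, a build-artifact task configuration matching those assumptions might look roughly like the following. The image and the tar command are stand-ins for a real build; only the input and output names and the artifact file name come from the description above.

```yaml
# ci/build-artifact.yml (illustrative only; the build itself is a stand-in)
platform: linux

image_resource:
  type: registry-image
  source:
    repository: busybox   # placeholder image

inputs:
- name: my-product        # source code from the get step
- name: version           # semver resource; the version is in version/number

outputs:
- name: built-artifact    # directory made available to later steps

run:
  path: sh
  args:
  - "-ec"
  - |
    # Read the release candidate version fetched by the version step.
    version="$(cat version/number)"
    # Stand-in for a real build: archive the source tree.
    tar -czf "built-artifact/my-product-${version}.tgz" -C my-product .
```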

Now that we have a tarball built, let's put it up to the pipeline artifacts S3 bucket via the my-product-rc resource defined above.

Note that the path given in the file param begins with built-artifact, the name of the output of the task that generated the .tgz.

  - put: my-product-rc
    params: {file: built-artifact/my-product-*.tgz}

We'll also need to push up the newly bumped version number, so that next time we bump it'll be based on this new one.

Note that the file param points at version/number, the file produced by the get step for the version resource above.

  - put: version
    params: {file: version/number}

Now we're cooking with gas. But still, we haven't shipped any actual versions of the project yet: only candidates! Let's move on to the later stages in the pipeline.

Let's assume there's some more resource-intensive integration suite that uses our product as a black box. This will be the final set of checks before shipping actual versions.

Let's assume this suite has to talk to some external environment, and so we'll configure the job with serial: true here to prevent concurrent builds from polluting each other.

- name: integration
  serial: true
  plan:

For the integration job, we'll need two things: the candidate artifact, and the repo that it came from, which contains all our CI scripts.

Note that this usage of passed guarantees that the versions of my-product and my-product-rc both came out of the same build of build-rc. See the get step documentation for more information.

  - get: my-product-rc
    trigger: true
    passed: [build-rc]
  - get: my-product
    passed: [build-rc]

We'll now run the actual integration task. Since it has to talk to some external environment, we'll use params to forward its credentials along to the task. See the task step documentation for more information.

Again, we'll use parameters in the config file to avoid hardcoding the credentials.

  - task: integration
    file: my-product/ci/integration.yml
    params:
      API_ENDPOINT: ((integration-api-endpoint))
      ACCESS_KEY: ((integration-access-key))

At this point in the pipeline we have artifacts that we're ready to ship. So let's define a job that, when manually triggered, takes the latest candidate release artifact and publishes it to the S3 bucket containing our shipped product versions.

We'll call the job shipit. Since it'll also be modifying the version, we'll place it in the same serial group we specified for build-rc.

- name: shipit
  serial_groups: [version]
  plan:

Similar to the integration job, we'll once again need both our source code and the latest release candidate, this time having passed integration together.

Note that we have not specified trigger: true this time. With a typical release-candidate pipeline, the shipping stage is only ever kicked off manually.

  - get: my-product-rc
    passed: [integration]
  - get: my-product
    passed: [integration]

Now we'll need to determine the final version number that we're about to ship. This is once again done by specifying params when fetching the version.

This time, we'll specify only bump: final. This means "take the version number and chop off the release candidate bit."

  - get: version
    params: {bump: final}

Next, we'll need to convert the release candidate artifact to a final version.

This step depends on the type of product you have; in the simplest case it's just a matter of renaming the file, but you may also have to rebuild it with the new version number, or push dependent files, etc.

For the purposes of this example, let's assume we have a magical task that does it all for us, and leaves us with a file called my-product-VERSION.tgz in a built-product output, just as with the build-rc job before.

  - task: promote-to-final
    file: my-product/ci/promote-to-final.yml
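For the simplest case mentioned above, where promotion is just a rename, the task might look something like this sketch. The image is a placeholder, and it assumes the candidate fetched from S3 keeps its my-product-VERSION.tgz name inside the my-product-rc directory.

```yaml
# ci/promote-to-final.yml (illustrative only; rename-only promotion)
platform: linux

image_resource:
  type: registry-image
  source:
    repository: busybox   # placeholder image

inputs:
- name: my-product-rc     # candidate artifact fetched from S3
- name: version           # final version number in version/number

outputs:
- name: built-product

run:
  path: sh
  args:
  - "-ec"
  - |
    # Read the final version produced by the get step with bump: final.
    version="$(cat version/number)"
    # Copy the candidate under its final name.
    cp my-product-rc/my-product-*.tgz "built-product/my-product-${version}.tgz"
```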

And now for the actual shipping!

  - put: my-product-final
    params: {file: built-product/my-product-*.tgz}
  - put: version
    params: {file: version/number}

This is all well and good, but you may have noticed it's only good for shipping one version. We only ever go from release candidates to final versions, and never do any actual version bumps!

Bumping your product's version is very much a human decision, so for this example we'll just assume the product manager will come in and decide what the next version should be at some point.

The simplest way to implement this is to have three jobs: one for doing a major bump, one for doing a minor bump, and one for doing a patch bump. Each of these jobs will perform the bump itself and immediately move to -rc.1 of the new version. This is done by specifying both bump and pre params.

The major job simply has a put for bumping the version in-place. Because it's modifying the version, we'll use serial_groups to ensure it doesn't run concurrently with build-rc, shipit, and the other bump jobs.

- name: major
  serial_groups: [version]
  plan:
  - put: version
    params: {bump: major, pre: rc}

The minor job is basically the same, but with bump: minor instead, unsurprisingly.

- name: minor
  serial_groups: [version]
  plan:
  - put: version
    params: {bump: minor, pre: rc}

The patch job will follow the same approach, but with one twist: we want to immediately bump to the next patch-level release candidate after shipping.

This is so the pipeline can start generating candidates for a new version without requiring the product manager to decide the version to target next. We do a patch bump just because it's the most conservative bump we can make for the next release before knowing what'll be in it.

We'll have the patch job auto-trigger by having a dummy get action that depends on something having made it through the shipit job, with trigger: true. We'll use the version resource for this since it's the smallest thing coming out of the shipit job.

- name: patch
  serial_groups: [version]
  plan:
  - get: version
    passed: [shipit]
    trigger: true
  - put: version
    params: {bump: patch, pre: rc}
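To see how the version params used across these jobs compose, here is a hand-worked trace. It is written as YAML purely for readability and is not Concourse configuration. It starts from the 1.2.3-rc.3 example used earlier and applies only the behavior described in this document.

```yaml
# Illustrative trace, not pipeline config: each entry is one build.
- job: build-rc
  params: {pre: rc}
  before: 1.2.3-rc.3
  after: 1.2.3-rc.4      # same version, next candidate number
- job: shipit
  params: {bump: final}
  before: 1.2.3-rc.4
  after: 1.2.3           # release candidate suffix removed
- job: patch             # auto-triggered once shipit succeeds
  params: {bump: patch, pre: rc}
  before: 1.2.3
  after: 1.2.4-rc.1      # patch bump, then first candidate of the new version
- job: build-rc
  params: {pre: rc}
  before: 1.2.4-rc.1
  after: 1.2.4-rc.2
```

Running the major or minor job at any point would move the version to the first release candidate of the next major or minor version instead.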