Speeding up CircleCI Builds with Caching

Speeding up CircleCI Builds with Caching

Redownloading dependencies for every step in your CI/CD pipeline can be time consuming. You can dramatically speed up the build time of your application with caching, making your team more responsive to breaking changes and ultimately more productive. Here's how to do it.

An image of a child staring at steps.

Why do we need faster deployments?

When building out a deployment pipeline for an application, it’s important to get feedback about the build quickly. Ultimately, a quicker pipeline leads to code reaching a production environment more frequently and in smaller chunks. This ensures rapid iteration for developers, who won’t be waiting around for a massive build before continuing on to the next feature.

The faster deployment cycle also facilitates better debugging and rollback. It’s much easier to debug code you’ve written three minutes ago, versus a half hour ago. And rolling back each small feature is effortless, compared to rolling back a massive, jumbled commit.

Our basic application and pipeline

Let’s say our simple NodeJS application relies on a few packages to function. It’s a simple Express application that reaches out to another API and provides JSON data back at a specific endpoint. This is the entire application:

index.js
const express = require("express");
const axios = require("axios");
 
const app = express();
const port = 3000;
 
app.get("/", async (req, res) => {
  try {
    const response = await axios.get(
      "https://jsonplaceholder.typicode.com/todos/1"
    );
    res.send(response.data);
  } catch (err) {
    res.status(500);
    res.send(err);
  }
});
 
app.listen(port, () => {
  console.log("Application running on port " + port);
});
 

Let’s say we’re trying to build a basic continuous integration setup for this project (we’ll leave deployment aside for the moment) in order to ensure that our code passes checks before merging it into master. We might want to do a two things:

  1. Lint the code.
  2. Run some tests.

Let’s build a pipeline that does each of those jobs by running scripts inside our package.json file:

.circleci/config.yml
version: 2.1
executors:
  app-executor:
    docker:
      - image: cimg/node:15.2.0
jobs:
  lint:
    executor: app-executor
    steps:
      - checkout
      - run:
          name: Install and lint project
          command: |
            npm install
            npm run lint
  test:
    executor: app-executor
    steps:
      - checkout
      - run:
          name: Run tests
          command: |
            npm install
            npm run test
workflows:
  lint_and_test_before_merge:
    jobs:
      - lint
      - test:
          requires:
            - lint

Both jobs run npm install in order to download our dependencies. This isn’t ideal, because the dependencies aren’t actually changing and our pipeline is doing extra work. These installs will also take more time as the project grows in size, both because our dependency list will grow, and because we will have more jobs re-fetching the dependencies.

The simplest way to boost the speed of our pipeline is to implement CircleCI’s caching mechanism.

Creating the cache

Let’s set up the cache. We specify the creation of a cache as a separate save_cache step within the job that installs our dependencies. The step requires two fields: the paths and key fields.

.circleci/config.yml
version: 2.1
executors:
  app-executor:
    docker:
      - image: cimg/node:15.2.0
jobs:
  lint:
    executor: app-executor
    steps:
      - checkout
      - run:
          name: Install and lint project
          command: |
            npm install
            npm run lint
      - save_cache:
          paths:
            - node_modules
          key: app-{{ checksum "package.json" }}
workflows:
  lint_and_save_cache:
    jobs:
      - lint

The paths key tells the cache what folders and files to save. In our case, we’re only caching our dependencies.

The key field is a user-defined string that points to the cache. We create a unique hash of our package.json file with the checksum command, and prepend it with the characters “-app” string. This hash will be unique to our current package.json file.

Using the cache

Now, let’s use that cache in another step.

We can use the restore_cache step in our later jobs to fetch those dependencies, since our package.json doesn’t change and the checksum will be the same.

.circleci/config.yml
version: 2.1
executors:
  app-executor:
    docker:
      - image: cimg/node:15.2.0
jobs:
  lint:
    executor: app-executor
    steps:
      - checkout
      - run:
          name: Install and lint project
          command: |
            npm install
            npm run lint
      - save_cache:
          paths:
            - node_modules
          key: app-{{ checksum "package.json" }}
  test:
    executor: app-executor
    steps:
      - checkout
      - restore_cache:
          keys:
            - app-{{ checksum "package.json" }}
            - app-
      - run:
          name: Run tests
          command: |
            npm run test
workflows:
  lint_and_test_before_merge:
    jobs:
      - lint
      - test:
          requires:
            - lint

We pass an array of keys to the step, and it looks sequentially for hashes that match. When it finds one, it’ll load those depedencies.

The second value in the array is known as a “fallback” and will allow CircleCI to load part of our cache. For instance, if our package.json has changed slightly, we won’t have to redownload everything.

You’ll see a message like this inside your workflow when the cache is created:

Creating cache archive...
Uploading cache archive...
Stored Cache to app-yircCA87nBHtoilUaSxwuZsPVx4OSrUcmpAoE45tPlo=
  * /home/circleci/project/node_modules

And a message like this when the cache is restored:

Found a cache from build 7 at app-yircCA87nBHtoilUaSxwuZsPVx4OSrUcmpAoE45tPlo=
Size: 5.0 MiB
Cached paths:
  * /home/circleci/project/node_modules
 
Downloading cache archive...
Validating cache...
 
Unarchiving cache...

Notice how the hashes of the cache matches. Now, every time you want to add another step to your workflow, you can just reference the cache created earlier and you’ll only have to download the dependencies once.

Using orbs

This seems really tedious—and it is! There’s a way of replacing all of this functionality with a feature that CircleCI has called “orbs.”

Orbs are basically blocks of configuration that give us a shortcut to writing out commonly used commands. Nearly every NodeJS developer wants to be able to install their project and cache the dependencies. Instead of manually writing all of this out every time, we can just reference the NodeJS orb in the top of our configuration file. That gives us access to special steps inside of our jobs.

Here’s the same functionality re-written with Orbs:

.circleci/config.yml
version: 2.1
orbs:
  node: circleci/node@4.5.1
jobs:
  start:
    executor: node/default
    steps:
      - checkout
      - node/install-packages
      - run: |
          npm run lint
workflows:
  start_app:
    jobs:
      - start

Notice that we specify we want to use the orb on lines 2 and 3, and we pass in the node/default as our executor. Then, inside of the steps, we have access to a special step provided by that orb: the node/install-packages step. This step will take care of all of our caching for us.

Why did we go through the pain of writing out this whole thing manually? Because it’s important to understand what’s happening behind the scenes when we use the Orb! And now you know.