Case study: using Zig build system to build a complex R and JavaScript project


14 October 2024

Overview

This is a case study in the use of the Zig build system to build a project consisting of 15 R packages and substantial JavaScript into an isolated runtime environment suitable for development and testing under Linux.

The project is RCloud, which currently uses a collection of Bash, R and JavaScript scripts to build itself from source. It relies on public binary package repositories to include its R and JS dependencies.

The Zig solution developed has these parts:

  • a code generator that generates a build.zig file
  • an asset fetcher that downloads binary assets and confirms hashes
  • an R-specific library which imports package repositories to generate and validate a repeatable build plan for a local collection of packages and their dependencies

Implementation

The Zig build of RCloud uses two stages:

  • zig build update to generate a build.zig file which describes the steps necessary to build all 15 packages and their current 64 transitive dependencies
  • zig build (install) to download dependencies and build all packages and the JavaScript bundles necessary, resulting in all necessary artifacts being placed in zig-out. This stage imports the build.zig file generated in the previous step.

A JSON configuration file is used by both steps. The file declares the public R repositories which should be searched for dependencies. For RCloud, those are CRAN and RForge. The file also lists every dependent source tarball by package name, along with its hash. The first time zig build update is run, this asset list can be empty, and it is filled by the code generator.

Subsequent runs of zig build update will query the latest versions from configured repositories, and update the dependent packages to point to their latest versions.

Initially, zig build update sets the hashes to empty strings. The main build step is responsible for fetching tarballs and verifying hashes. When it encounters an empty string, it issues a warning and updates the hash. If the hash field is non-empty, it must match the computed hash of the downloaded asset, or else the build will fail.
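
For illustration, a config.json along these lines would match that description (the field names, versions and repository URLs here are my own guesses, not the actual schema defined by r-build-zig):

{
  "repositories": [
    "https://cran.r-project.org",
    "https://rforge.net"
  ],
  "assets": {
    "stringr": { "version": "1.5.1", "hash": "sha256-..." },
    "glue":    { "version": "1.7.0", "hash": "" }
  }
}

The empty hash in the glue entry above would be filled in, with a warning, the first time the asset is fetched.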

Care is taken to enable correct incremental builds by leveraging Zig build's cache. For example, downloaded assets are not re-downloaded. Local R packages are only rebuilt if any of their source files change. Same for the JavaScript bundles.

Outer build script

(Some of the code samples have been edited for this article. Links to the code are at the end of this article.)

The top level build file imports build tools from an external package, r-build-zig, and imports a generated build.zig file. (For the initial bootstrap, a minimal generated/build.zig file must be provided.)

const std = @import("std");
const r_build_zig = @import("r-build-zig");
const generated_build = @import("generated/build.zig");

The main build entry point declares the overall organisation of the build. There are four major pieces: fetching the assets and building, building the htdocs (JavaScript), generating the R build script given a set of local package directories and the config.json file, and making a distribution tarball.

pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const config_path = "config.json";

    const update_step = b.step("update", "Generate R package build files");
    const tarball_step = b.step("dist", "Make a source archive");

    // declare build install rules
    try fetch_assets_and_build(b, config_path, target, optimize);

    // declare rules for htdocs
    build_htdocs(b);

    // declare step: update
    try generate_build_script(
        b,
        config_path,
        &.{
            "packages",
            "rcloud.client",
            "rcloud.packages",
            "rcloud.support",
        },
        update_step,
        target,
        optimize,
    );

    // declare step: dist
    try make_tarball(b, tarball_step);
}

The main build, which relies on a generated build.zig file, is driven by the following function. It uses the build tool fetch-assets provided by r-build-zig to fetch and cache all third-party sources, in parallel and with hash verification. Then it calls into the generated build.zig file, supplying the assets directory as an argument, to generate the build rules to build everything.

fn fetch_assets_and_build(
    b: *Build,
    config_path: []const u8,
    target: ResolvedTarget,
    optimize: OptimizeMode,
) !void {
    // get the fetch-assets tool
    const exe = b.dependency("r-build-zig", .{
        .target = target,
        .optimize = optimize,
    }).artifact("fetch-assets");

    // run fetch-assets tool
    const step = b.addRunArtifact(exe);
    _ = step.addFileArg(b.path(config_path));
    const out_dir = step.addOutputDirectoryArg("assets");

    // supply assets dir to generated build script
    try generated_build.build(b, out_dir);
}

Generating build.zig

This next section declares the steps required to run the external tool which generates another build.zig file. It is triggered by zig build update. The tool accepts a variable number of R package source code directories, which it searches recursively to discover package definitions. It uses those definitions to discover internal and external dependencies. The outer script copies the generated build.zig artifact to a known location in the source tree.

fn generate_build_script(
    b: *Build,
    config_path: []const u8,
    relative_source_package_paths: []const []const u8,
    update_step: *Step,
    target: ResolvedTarget,
    optimize: OptimizeMode,
) !void {
    const exe = b.dependency("r-build-zig", .{
        .target = target,
        .optimize = optimize,
    }).artifact("generate-build");

    const step = b.addRunArtifact(exe);
    _ = step.addArg(config_path);

    const out_dir = step.addOutputDirectoryArg("deps");
    for (relative_source_package_paths) |path| {
        _ = step.addArg(path);
    }

    // copy the generated build.zig file to generated directory
    const uf = b.addUpdateSourceFiles();
    uf.addCopyFileToSource(out_dir.path(b, "build.zig"),
                           "generated/build.zig");

    update_step.dependOn(&uf.step);
}

Generated build rules

The following segment shows the Zig build code generated by the generate-build tool for a particular package. It invokes R CMD INSTALL, supplying a library directory where R can find this package's dependencies and where it will place the built artifacts. It captures stdout and stderr, saving the former and discarding the latter.

It then declares explicit dependencies on prior steps which will have built its required packages. These requirements are determined by parsing the repository metadata about the package.

The final steps copy the artifacts to the output directories. The addFileInput line is discussed in the Caching section.

const stringr = b.addSystemCommand(&.{"R"});
stringr.addArgs(&.{"CMD", "INSTALL", "-l"});
_ = stringr.addDirectoryArg(libdir.getDirectory());
_ = stringr.addFileArg(asset_dir.path(b, "stringr_1.5.1.tar.gz"));
stringr.step.name = "stringr";

const stringr_out = stringr.captureStdOut();
_ = stringr.captureStdErr();
stringr.step.dependOn(&cli.step);
stringr.step.dependOn(&glue.step);
stringr.step.dependOn(&lifecycle.step);
stringr.step.dependOn(&magrittr.step);
stringr.step.dependOn(&rlang.step);
stringr.step.dependOn(&stringi.step);
stringr.step.dependOn(&vctrs.step);

// see "Caching" section
stringr.addFileInput(b.path("config.json"));

const stringr_install = b.addInstallDirectory(.{
    .source_dir = libdir.getDirectory().path(b, "stringr"),
    .install_dir = .{ .custom = "lib" },
    .install_subdir = "stringr",
});

stringr_install.step.dependOn(&stringr.step);
b.getInstallStep().dependOn(&stringr_install.step);
b.getInstallStep().dependOn(&b.addInstallFileWithDir(
    stringr_out,
    .{ .custom = "logs" },
    "stringr.log",
).step);

Building JavaScript

The build script drives external tools to install npm requirements and then invoke a binary tool installed in node_modules. For caching purposes, note the addFileInput calls as well as expectExitCode.

Since neither of these tools depends on its command line arguments (they have none, actually), we need to use addFileInput to tell Zig which files they do depend on, so that the steps are re-run only when those files change.

Both tools generate artifacts inside the source directory, so we first copy our source to the cache and run the tools from the cache directory, to keep our source directories clean.

The grunt tool generates binary artifacts inside our source directories, so we add an installation step which copies those directories to the install location, but only after the grunt step.

fn build_htdocs(b: *Build) void {
    const wf = b.addWriteFiles();

    // copy htdocs source
    _ = wf.addCopyDirectory(b.path("htdocs"), "htdocs", .{});

    // install js requirements
    const npm_ci = b.addSystemCommand(&.{ "npm", "ci" });
    npm_ci.setCwd(wf.getDirectory());
    npm_ci.addFileInput(wf.addCopyFile(b.path("package.json"), "package.json"));
    npm_ci.addFileInput(wf.addCopyFile(b.path("package-lock.json"), "package-lock.json"));
    npm_ci.expectExitCode(0);

    // run grunt
    const grunt = b.addSystemCommand(&.{"node_modules/grunt-cli/bin/grunt"});
    grunt.setCwd(wf.getDirectory());
    grunt.addFileInput(wf.addCopyFile(b.path("Gruntfile.js"), "Gruntfile.js"));
    grunt.addFileInput(wf.addCopyFile(b.path("VERSION"), "VERSION"));
    grunt.addFileInput(wf.getDirectory().path(b, "node_modules/grunt-cli/bin/grunt"));
    grunt.expectExitCode(0);

    // which depends on npm_ci
    grunt.step.dependOn(&npm_ci.step);

    // add an install step for post-grunt htdocs
    const htdocs_install = b.addInstallDirectory(.{
        .source_dir = wf.getDirectory().path(b, "htdocs"),
        .install_dir = .prefix,
        .install_subdir = "htdocs",
    });
    htdocs_install.step.dependOn(&grunt.step);

    // install built htdocs files
    b.getInstallStep().dependOn(&htdocs_install.step);
}

Caching

Caching in the Zig build system is an important part of its operation. It saves development time by only building artifacts when their dependencies have changed. This introduces special considerations when building atypical artifacts.

Zig uses sensible heuristics to determine when an artifact needs to be rebuilt, and it provides an API for the developer to add additional dependencies. The main heuristic when using external tools is that given identical inputs, a tool will produce identical outputs. Therefore, if the inputs haven't changed and the output is still in the cache, there is no need to run the tool again. Zig provides options to override this behaviour, for example to specify that a tool should always be run, regardless of its cache status.
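
For example, a Run step can opt out of caching entirely by setting its has_side_effects flag, so that it runs on every build (a minimal sketch; "some-tool" is a placeholder command):

const always = b.addSystemCommand(&.{"some-tool"});
// Opt this step out of the cache: it runs on every build, regardless of
// whether its inputs have changed.
always.has_side_effects = true;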

Here we see that the step is defined to run an exe, and it accepts a single addFileArg argument, which is our config file. Zig will hash the contents of the file and skip running this step on subsequent builds if the hash is unchanged.

const step = b.addRunArtifact(exe);
_ = step.addFileArg(b.path(config_path));
const out_dir = step.addOutputDirectoryArg("assets");

This heuristic is good for simple cases. However, if a tool should also be re-run if some external files (not provided as arguments) have changed, Zig provides the addFileInput function to identify those files. From the generated build code:

const stringr = b.addSystemCommand(&.{"R"});
stringr.addArgs(&.{"CMD", "INSTALL", "-l"});
// ...
stringr.addFileInput(b.path("config.json"));

This will force the R CMD INSTALL tool to be run again if the contents of config.json are changed.

Limitations and future work

There are many limitations in the current work. For the build process, some of the build caching could be more precise. For example, editing the configuration file will cause all assets to be downloaded again and all packages to be rebuilt. In practice, this is not severe because updates to the configuration file should only happen if dependencies are being added, deleted or updated; and in any of those cases, a complete system rebuild is advisable. However, it would be nice if a full rebuild were optional during development.

Version constraints with two-sided inequalities are not supported. While an analysis of the CRAN package repository reveals that there are no packages using two-sided constraints, in a complex environment such as RCloud it is conceivable that some local packages may use >= constraints and others may use < constraints to avoid bugs introduced by new releases. Reconciling these to find a suitable package version is not currently supported by our system. While this limitation does not pose a practical problem at the time of writing, it is conceivable that other R package repositories may not adopt CRAN's policies and practices.

Initially, I was unable to efficiently use the Zig package manager (zig fetch) to manage R package tarballs, because Zig extracts the contents and discards the archive. Upon reflection, it may be possible to accommodate R's expectations and install dependencies from a source directory. If this is possible, it would remove the need for a separate fetch-assets tool.

Wish lists

Zig wish list

Explicitly support binary assets in build.zig.zon files

Zig currently expects dependencies to be in specific formats. For example, it expects that a .tar.gz dependency will be a source tree tarball. After fetching, it untars the source and hashes its contents. The original tarball is not available via the build system.
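
For reference, a dependency entry in build.zig.zon looks roughly like this (the exact set of fields varies between Zig versions, and the URL and hash below are placeholders):

.{
    .name = "rcloud",
    .version = "0.0.0",
    .dependencies = .{
        .@"r-build-zig" = .{
            // Zig fetches this archive, extracts it, hashes the extracted
            // tree, and discards the original tarball.
            .url = "https://example.org/r-build-zig.tar.gz",
            .hash = "1220...",
        },
    },
    .paths = .{""},
}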

This conflicts with R CMD INSTALL's expectation of a tarball as an input argument. Passing the extracted source tree to R is also not viable, as R performs an in-source build, leaving artifacts inside the directory. Some package builds are not idempotent, so this causes a failure on subsequent builds. We can mitigate this by copying the extracted directory in a separate step, or even re-archiving it to recover a .tar.gz file, but this adds additional complexity.

Of course, this is a limitation of R, not Zig, but if there were a way to specify to Zig that a particular dependency should be considered an opaque binary, it would make these atypical uses more convenient.

Zig's build cache and build.zig.zon file could be a very useful binary asset manager, and it would obviate the need to supply tools like fetch-assets.

Make zig build system more extensible

Currently, a Zig build script can import the build script of another package only if that build script is fully self-contained. A build script may itself import from its own dependencies, but doing so makes it impossible for a different package to import that build script.

The two-step build process with code generation used in this case study was required because of this limitation. Ideally, the top level build script would have been able to import the external library providing the R-specific dependency resolution (and its dependencies), and to apply the Zig build rules to the current Build object in a one-step build process.
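
To make the wish concrete, a one-step flow might look something like the following sketch. The addRBuildRules function and its options are invented names for illustration only; no such API exists today.

const std = @import("std");
// Hypothetical: a build-script dependency that brings its own dependencies
// and exposes a helper which adds R build rules to the caller's graph.
const r_build = @import("r-build-zig");

pub fn build(b: *std.Build) !void {
    // Invented API: resolve R dependencies and add their build rules
    // directly to this Build object, without a separate generation step.
    try r_build.addRBuildRules(b, .{
        .config = b.path("config.json"),
        .packages = &.{ "packages", "rcloud.client" },
    });
}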

R wish list

Expose private functions for dependency information

Several of the internals of R's build planning are in private functions. It would be nice if they could be made part of the public API. The fact that there are several third-party attempts to provide reproducible build systems for R suggests there may be a need for a more complete API.

Pre-flight build plans before starting

Currently, when installing a package at the user's request via install.packages, R develops a build plan which determines the correct installation order of the package and its available transitive dependencies. However, it fails to account for three important sources of failures, two of which are readily correctable.

  1. The package may depend on a package not available in the current set of known repositories.
  2. The package may express a constrained requirement, such as lib >= 1.2.1, which cannot be satisfied given the available packages.
  3. The package may depend on a system requirement that is not present.

The third failure is difficult to detect ahead of actually trying to build the package, so I leave that aside for the moment.

But the first two are knowable, and the library I developed demonstrates one way to correctly identify these failures as part of its build planning process. It would be nice if this functionality were a part of base R.

Links

  • RCloud. An earlier autotools-based system I developed is also still present.
  • r-build-zig. The Zig build generator.
  • r-repo-parse. The R package repository parser.

Additional background

These sections go a bit deeper for readers less familiar with Zig or R who wish to learn more.

Background on the Zig implementation

Zig build system

Zig is currently under active development, and this work was performed using pre-release development versions of 0.14.0.

Like most build systems, Zig's is declarative. However, the full power of the imperative language is available, with some limitations. The user provides a single entry point which receives a pointer to a Build object, and the user adds build targets, rules and dependencies via the extensive Build API.
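
A minimal, self-contained build.zig looks like this (a generic hello-world example for the 0.14 development versions used here, unrelated to RCloud):

const std = @import("std");

pub fn build(b: *std.Build) void {
    // The build runner calls this single entry point with a Build object;
    // the script declares targets, rules and dependencies against it.
    const exe = b.addExecutable(.{
        .name = "hello",
        .root_source_file = b.path("src/main.zig"),
        .target = b.standardTargetOptions(.{}),
        .optimize = b.standardOptimizeOption(.{}),
    });
    b.installArtifact(exe);
}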

Zig's build system makes extensive use of caching to speed up incremental builds and the developer's edit-compile-run experience. Understanding this caching in an atypical case such as this one was a challenge but proved crucial to making things work correctly.

Zig package manager

Zig also provides a package manager, whereby a developer can specify dependencies on third-party packages, and have those dependencies be automatically fetched, verified against a checksum, and built.

At first, this seemed like it would be suitable to fetch external R package dependencies, but ultimately this did not work due to a mismatch between Zig's expectation of a package and R's expectation. So I implemented a separate asset declaration and fetching tool. Future work could take another look at this.

Background on R packages

At the outset, I should emphasise that RCloud's requirements are not typical of R package development. Within open source R projects, it is practically unique in the simultaneous development and delivery of 15 R packages in a single repository. R provides simple and practical tools for end-users to install a given package and all of its dependencies from a set of configured package repositories, with a single command. The tools R provides are sufficient for the vast majority of use cases.

Beyond the need to build a large collection of packages and their dependencies from source, RCloud has additional requirements, including supply chain security and repeatable builds, that go beyond the typical package developer's requirements. This motivates the development of a more general set of build tools.

Base R tools

R provides a command line tool, activated by R CMD INSTALL, which can build and install a single R package from a tarball or local directory, to a specified output location.

This tool requires that any dependencies needed by the package being built be already installed in the output location. Therefore, given a collection of packages to install, the developer must build them and all of their dependencies in the correct order.

Transitive dependencies

Base R does not provide public functions to assist in discovering transitive dependencies or the installation order, but there are private functions defined for this purpose.

This means that given a single package's source code, base R provides no public way for the user to install that package successfully, unless it has no third-party dependencies. And further, it provides no public way for the user to discover the full set of packages needed to successfully build the package in question.

Due to R's dynamic nature, we can access the private functions we need to fulfil some of the requirements above, but not all.

Ultimately, I chose to implement a parser for R package and repository metadata in Zig, and to calculate transitive dependencies that merge and satisfy version constraints (e.g. lib >= 1.2.1), which I describe more in the next section.

I won't cover the details of that work in this case study, but it is crucial to generating the correct build rules to build R packages in the correct order.

Version constraints

Like most package manager ecosystems, R packages can declare their dependencies on external packages with specific constraints on the version. This is used to ensure the depended-upon package has the necessary functionality or bug fix. Version constraints are loosely specified in the official R documentation, which states that any operator may be used.

However, the largest package repository, CRAN, enforces certain policies which result in nearly all packages published in its repository using greater-than-or-equal constraints, such as lib >= 1.2.1. Other repositories have different policies, but they have fewer packages than CRAN.

Version constraints are not enforced as part of a build plan. Instead, they are checked by attempting to install a package and its dependencies, either as part of the installation process or as part of R CMD check (a required step before a package is accepted by repositories like CRAN). Both steps take a significant amount of time, because a package with many dependencies may successfully build or check all of its dependencies before failing on the last one.

My desk research hasn't yet found any discussion of the challenges of satisfying version constraints while building a complex collection of packages. In RCloud's case, several of its dependencies declare different version constraints on transitive dependencies, and these need to be merged. For example, one may declare lib > 1.0.1 and another may declare lib >= 1.2.3; these must be merged by keeping the stricter constraint, lib >= 1.2.3.
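
A sketch of that merge logic in Zig (illustrative only: the type and function names are not those used in r-repo-parse, and R version numbers are not strictly semantic versions):

const std = @import("std");

const Op = enum { gt, gte };

const Constraint = struct {
    op: Op,
    version: std.SemanticVersion,
};

// Keep the stricter of two lower-bound constraints on the same package.
fn stricter(a: Constraint, b: Constraint) Constraint {
    return switch (std.SemanticVersion.order(a.version, b.version)) {
        .lt => b,
        .gt => a,
        // Equal versions: a strict ">" bound excludes more than ">=".
        .eq => if (a.op == .gt) a else b,
    };
}

test "merging lib > 1.0.1 with lib >= 1.2.3 keeps the stricter bound" {
    const a = Constraint{ .op = .gt, .version = try std.SemanticVersion.parse("1.0.1") };
    const b = Constraint{ .op = .gte, .version = try std.SemanticVersion.parse("1.2.3") };
    const merged = stricter(a, b);
    try std.testing.expect(merged.op == .gte and merged.version.minor == 2);
}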

Luckily, I did not encounter any contradictory constraints, nor did I encounter any two-sided constraints (e.g. lib > 1.0 and lib < 2.0). But I know of no mechanism that would correctly handle this case within the R ecosystem.

In practice, given the policies enforced by large R package repositories such as CRAN, the lack of support for two-sided constraints is not a problem. But there do exist smaller package repositories, and they may have or develop different policies than CRAN.

System requirements

Many R packages depend on certain system libraries, such as openssl or icu, and it is up to the developer to identify and install those libraries prior to building. Typically, that is one role of a configure script: to verify that system requirements are present and to complain to the developer if any are missing.

The alternative is to start the build process, wait for it to fail, and attempt to identify from the error what library is missing, install it, and repeat. This is the process I went through to initially build this project.

In practice, once I discovered the initial set of system requirements, I created a shell.nix and flake.nix to easily reproduce those requirements on other hosts.



Contact: @mocom@mastodon.social. Published using C-c C-e P p. If you use this content commercially, kindly make an appropriate donation to Zig Software Foundation in my name.