Developer Guide

This guide is intended for people who want to work on Spack itself. If you just want to develop packages, see the Packaging Guide.

It is assumed that you’ve read the Basic Usage and Packaging Guide sections, and that you’re familiar with the concepts discussed there. If you’re not, we recommend reading those first.

Overview

Spack is designed with three separate roles in mind:

  1. Users, who need to install software without knowing all the details about how it is built.
  2. Packagers, who know how a particular software package is built and encode this information in package files.
  3. Developers, who work on Spack, add new features, and try to make the jobs of packagers and users easier.

Users could be end users installing software in their home directory, or administrators installing software to a shared directory on a shared machine. Packagers could be administrators who want to automate software builds, or application developers who want to make their software more accessible to users.

As you might expect, there are many types of users with different levels of sophistication, and Spack is designed to accommodate both simple and complex use cases for packages. A user who only knows that they need a certain package should be able to type something simple, like spack install <package name>, and get the package they want. If a user wants to ask for a specific version, use particular compilers, or build several versions with different configurations, then that should be possible with a minimal amount of additional specification.

This gets us to the two key concepts in Spack’s software design:

  1. Specs: expressions for describing builds of software, and
  2. Packages: Python modules that build software according to a spec.

A package is a template for building particular software, and a spec is a descriptor for one or more instances of that template. Users express the configuration they want using a spec, and a package turns the spec into a complete build.

The obvious difficulty with this design is that users under-specify what they want. To build a software package, the package object needs a complete specification. In Spack, if a spec describes only one instance of a package, then we say it is concrete. If a spec could describe many instances (i.e., it is under-specified in one way or another), then we say it is abstract.

Spack’s job is to take an abstract spec from the user, find a concrete spec that satisfies the constraints, and hand the task of building the software off to the package object. The rest of this document describes all the pieces that come together to make that happen.
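The difference between abstract and concrete specs can be illustrated with a toy model. The sketch below is not Spack's own spec class: ToySpec, its three fields, find_concrete, and the candidate builds (including their version numbers) are hypothetical names and values chosen only for illustration, and real concretization considers far more than three fields.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical stand-in for a spec; a field left as None is unconstrained."""
    name: str
    version: Optional[str] = None
    compiler: Optional[str] = None

    def is_concrete(self):
        # Concrete: every field is pinned, so exactly one build is described.
        return all(getattr(self, f.name) is not None for f in fields(self))

    def satisfies(self, constraint):
        # True if this spec agrees with every field the constraint sets.
        return all(
            getattr(constraint, f.name) in (None, getattr(self, f.name))
            for f in fields(self)
        )


def find_concrete(abstract, candidates):
    """Return the first concrete candidate satisfying the abstract spec."""
    for candidate in candidates:
        if candidate.is_concrete() and candidate.satisfies(abstract):
            return candidate
    return None


# Invented candidate builds; versions and compilers are placeholders.
CANDIDATES = [
    ToySpec("dyninst", "9.3", "gcc"),
    ToySpec("dyninst", "9.3", "clang"),
]

request = ToySpec("dyninst", compiler="clang")  # abstract: no version given
chosen = find_concrete(request, CANDIDATES)
print(request.is_concrete(), chosen)
```

Here the request pins only the compiler, so it is abstract; the search returns the one candidate that is fully specified and agrees with it.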

Directory Structure

So that you can familiarize yourself with the project, we’ll start with a high-level view of Spack’s directory structure:

spack/                  <- installation root
   bin/
      spack             <- main spack executable

   etc/
      spack/            <- Spack config files.
                           Can be overridden by files in ~/.spack.

   var/
      spack/            <- build & stage directories
         repos/         <- contains package repositories
            builtin/    <- pkg repository that comes with Spack
               repo.yaml   <- descriptor for the builtin repository
               packages/   <- directories under here contain packages
         cache/         <- saves resources downloaded during installs

   opt/
      spack/            <- packages are installed here

   lib/
      spack/
         docs/          <- source for this documentation
         env/           <- compiler wrappers for build environment

         external/      <- external libs included in Spack distro
         llnl/          <- some general-use libraries

         spack/         <- spack module; contains Python code
            cmd/        <- each file in here is a spack subcommand
            compilers/  <- compiler description files
            test/       <- unit test modules
            util/       <- common code

Spack is designed so that it could live within a standard UNIX directory hierarchy, so lib, var, and opt all contain a spack subdirectory in case Spack is installed alongside other software. Most of the interesting parts of Spack live in lib/spack.

Spack has one directory layout and there is no install process. Most Python programs don’t look like this (they use distutils, setup.py, etc.) but we wanted to make Spack very easy to use. The simple layout spares users from the need to install Spack into a Python environment. Many users don’t have write access to a Python installation, and installing an entire new instance of Python to bootstrap Spack would be very complicated. Users should not have to install a big, complicated package to use the thing that’s supposed to spare them from the details of big, complicated packages. The end result is that Spack works out of the box: clone it, add bin to your PATH, and you’re ready to go.

Code Structure

This section gives an overview of the various Python modules in Spack, grouped by functionality.

Build environment

spack.stage
Handles creating temporary directories for builds.
spack.compilation
This contains utility functions used by the compiler wrapper script, cc.
spack.directory_layout
Classes that control the way an installation directory is laid out. Create more implementations of this to change the hierarchy and naming scheme in $spack_prefix/opt.

Spack Subcommands

spack.cmd
Each module in this package implements a Spack subcommand. See writing commands for details.

Unit tests

spack.test
Implements Spack’s test suite. Add a module and put its name in the test suite in __init__.py to add more unit tests.
spack.test.mock_packages
This is a fake package hierarchy used to mock up packages for Spack’s test suite.

Other Modules

spack.url
URL parsing, for deducing names and versions of packages from tarball URLs.
spack.error
SpackError, the base class for Spack’s exception hierarchy.
llnl.util.tty
Basic output functions for all of the messages Spack writes to the terminal.
llnl.util.tty.color
Implements a color formatting syntax used by llnl.util.tty.
llnl.util
In this package are a number of utility modules for the rest of Spack.

Spec objects

Package objects

Most spack commands look something like this:

  1. Parse an abstract spec (or specs) from the command line,
  2. Normalize the spec based on information in package files,
  3. Concretize the spec according to some customizable policies,
  4. Instantiate a package based on the spec, and
  5. Call methods (e.g., install()) on the package object.

The information in Package files is used at all stages in this process.
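The five steps above can be sketched with a toy model. Nothing below is Spack's real API: ToyPackage, REPO, the dictionary-based specs, the simplified "name@version" input format, and the version numbers are all assumptions made for illustration.

```python
class ToyPackage:
    """Hypothetical stand-in for the information held in a package file."""
    versions = ()     # known versions, newest first
    depends_on = ()   # names of dependencies

    def __init__(self, spec):
        self.spec = spec

    def install(self):
        return "installing {name}@{version}".format(**self.spec)


class Elfutils(ToyPackage):
    versions = ("0.2", "0.1")   # invented version numbers


class Libdwarf(ToyPackage):
    versions = ("2.0",)         # invented version number
    depends_on = ("elfutils",)


REPO = {"elfutils": Elfutils, "libdwarf": Libdwarf}


def parse(text):
    # Step 1: turn "name" or "name@version" into an abstract spec.
    name, _, version = text.partition("@")
    return {"name": name, "version": version or None, "deps": []}


def normalize(spec):
    # Step 2: add the dependencies declared by the package.
    deps = [parse(dep) for dep in REPO[spec["name"]].depends_on]
    return dict(spec, deps=[normalize(dep) for dep in deps])


def concretize(spec):
    # Step 3: the policy in this sketch is "newest known version if unset".
    version = spec["version"] or REPO[spec["name"]].versions[0]
    deps = [concretize(dep) for dep in spec["deps"]]
    return dict(spec, version=version, deps=deps)


def run(text):
    spec = concretize(normalize(parse(text)))
    package = REPO[spec["name"]](spec)   # Step 4: instantiate the package
    return package.install()             # Step 5: call a method on it


print(run("libdwarf"))
```

Note that the package classes are consulted in steps 2 through 5, which mirrors the point that package information is used throughout the process.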

Conceptually, packages are overloaded. They contain:

Stage objects

Writing commands

Adding a new command to Spack is easy. Simply add a <name>.py file to lib/spack/spack/cmd/, where <name> is the name of the subcommand. At the bare minimum, two functions are required in this file:

setup_parser()

Unless your command doesn’t accept any arguments, a setup_parser() function is required to define what arguments and flags your command takes. See the Argparse documentation for more details on how to add arguments.
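As a minimal sketch, a setup_parser() for a hypothetical hello command might look like the following. The command name, its flag, and its positional argument are invented for illustration, and the parser is built by hand here only so the example runs on its own; the assumption is that Spack passes in an argparse parser dedicated to the command.

```python
import argparse


def setup_parser(subparser):
    # `subparser` is assumed to be an argparse parser for this one command.
    subparser.add_argument(
        "-v", "--verbose", action="store_true",
        help="print more detail")
    subparser.add_argument(
        "names", nargs="*",
        help="zero or more names to greet")


# Stand-alone demonstration with a hand-built parser.
parser = argparse.ArgumentParser(prog="spack hello")
setup_parser(parser)
args = parser.parse_args(["--verbose", "alice", "bob"])
print(args.verbose, args.names)
```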

Some commands have a set of subcommands, like spack compiler find or spack module refresh. You can add subparsers to your parser to handle this. Check out spack edit --command compiler for an example of this.
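Nested subcommands can be handled with argparse's add_subparsers(). The sketch below uses a hypothetical thing command with find and list subcommands; these names and the thing_command destination are placeholders, not taken from Spack's source.

```python
import argparse


def setup_parser(subparser):
    # One nested parser per subcommand; `dest` records which one was chosen.
    sp = subparser.add_subparsers(metavar="SUBCOMMAND", dest="thing_command")

    find_parser = sp.add_parser("find", help="search for things")
    find_parser.add_argument("paths", nargs="*", help="where to search")

    sp.add_parser("list", help="list known things")


# Stand-alone demonstration with a hand-built parser.
parser = argparse.ArgumentParser(prog="spack thing")
setup_parser(parser)
args = parser.parse_args(["find", "/usr", "/opt"])
print(args.thing_command, args.paths)
```

The command's main function can then branch on the recorded subcommand name to decide which helper to call.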

A lot of commands take the same arguments and flags. These arguments should be defined in lib/spack/spack/cmd/common/arguments.py so that they don’t need to be redefined in multiple commands.

<name>()

In order to run your command, Spack searches for a function with the same name as your command in <name>.py. This is the main method for your command, and can call other helper methods to handle common tasks.
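The lookup by name can be pictured roughly as follows. This is a sketch under stated assumptions, not Spack's actual dispatch code: the hello module is built in memory as a stand-in for a hypothetical lib/spack/spack/cmd/hello.py, get_command is an invented helper, and the two-argument (parser, args) signature is assumed.

```python
import types


def make_hello_module():
    """Build an in-memory stand-in for a hypothetical cmd/hello.py."""
    module = types.ModuleType("hello")

    def hello(parser, args):
        # Main function of the command; shares its name with the module.
        return "hello, " + ", ".join(args)

    module.hello = hello
    return module


def get_command(module, name):
    # Find the function whose name matches the command name.
    function = getattr(module, name, None)
    if not callable(function):
        raise AttributeError(
            "no function %r in module %r" % (name, module.__name__))
    return function


command = get_command(make_hello_module(), "hello")
print(command(None, ["alice", "bob"]))
```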

Remember, before adding a new command, consider whether it is actually necessary. Sometimes the functionality you want can be added to an existing command. Also remember to add unit tests for your command. If the command isn’t used very frequently, changes to the rest of Spack can break it without anyone noticing unless unit tests catch the breakage.

Whenever you add/remove/rename a command or flags for an existing command, make sure to update Spack’s Bash tab completion script.

Unit tests

Unit testing

Developer commands

spack doc

spack test

spack python

spack python is a command that lets you import and debug things as if you were in a Spack interactive shell. Without any arguments, it is similar to a normal interactive Python shell, except you can import spack and any other Spack modules:

$ spack python
Spack version 0.10.0
Python 2.7.13, Linux x86_64
>>> from spack.version import Version
>>> a = Version('1.2.3')
>>> b = Version('1_2_3')
>>> a == b
True
>>> c = Version('1.2.3b')
>>> c > a
True
>>>

You can also run a single command:

$ spack python -c 'import distro; distro.linux_distribution()'
('Fedora', '25', 'Workstation Edition')

or a file:

$ spack python ~/test_fetching.py

just like you would with the normal python command.

spack url

A package containing a single URL can be used to download several different versions of the package. If you’ve ever wondered how this works, all of the magic is in spack.url. This module contains methods for extracting the name and version of a package from its URL. The name is used by spack create to guess the name of the package. By determining the version from the URL, Spack can replace it with other versions to determine where to download them from.

The regular expressions in parse_name_offset and parse_version_offset are used to extract the name and version, but they aren’t perfect. In order to debug Spack’s URL parsing support, the spack url command can be used.
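To make the idea concrete, here is a small stand-alone sketch of version extraction and substitution. It is not the code in spack.url: the helper functions and the short list of archive suffixes are assumptions for illustration, and only a single regular expression is tried (version regex 0 from the spack url parse output shown later in this guide).

```python
import re

# Version regex 0 as reported by `spack url parse`, written with single
# backslashes inside a raw string.
VERSION_REGEX = re.compile(r'^[a-zA-Z+._-]+[._-]v?(\d[\d._-]*)$')


def strip_archive_extension(filename):
    # Only a few common suffixes are handled in this sketch.
    for suffix in (".tar.gz", ".tar.bz2", ".tgz", ".zip"):
        if filename.endswith(suffix):
            return filename[:-len(suffix)]
    return filename


def parse_version(url):
    """Return (version, start, end), with offsets into the full URL."""
    filename = url.rsplit("/", 1)[-1]
    stem = strip_archive_extension(filename)
    match = VERSION_REGEX.match(stem)
    if match is None:
        raise ValueError("no version found in " + url)
    offset = len(url) - len(filename)   # where the file name starts
    return match.group(1), offset + match.start(1), offset + match.end(1)


def substitute_version(url, new_version):
    # Replace only the span in which the version was detected.
    _, start, end = parse_version(url)
    return url[:start] + new_version + url[end:]


URL = "http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz"
print(parse_version(URL)[0])
print(substitute_version(URL, "9.9.9b"))
```

Because the substitution touches only the detected span, any other copy of the version elsewhere in the URL is left as it was.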

spack url parse

If you need to debug a single URL, you can use the following command:

$ spack url parse http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz
==> Parsing URL: http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz

==> Matched version regex  0: r'^[a-zA-Z+._-]+[._-]v?(\\d[\\d._-]*)$'
==> Matched  name   regex  7: r'^([A-Za-z\\d+\\._-]+)$'

==> Detected:
    http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz
                                            ---- ~~~~~
    name:    ruby
    version: 2.2.0

==> Substituting version 9.9.9b:
    http://cache.ruby-lang.org/pub/ruby/2.2/ruby-9.9.9b.tar.gz
                                            ---- ~~~~~~

You’ll notice that the name and version of this URL are correctly detected, and you can even see which regular expressions matched. However, the substitution only replaces the version in the file name: the 2.2 directory component is left unchanged, whereas the URL for version 9.9.9b would be expected to live under 9.9. This particular package may require a list_url or url_for_version function.
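One way to handle the directory component is a url_for_version function that builds the whole URL from the version. The sketch below is a stand-alone function operating on a plain string; in a real package it would presumably be a method on the package class receiving a version object, so treat the signature as an assumption.

```python
def url_for_version(version):
    """Build a download URL whose directory tracks major.minor."""
    # Keep only the first two dot-separated components: "9.9.9b" -> "9.9".
    major_minor = ".".join(version.split(".")[:2])
    return "http://cache.ruby-lang.org/pub/ruby/{0}/ruby-{1}.tar.gz".format(
        major_minor, version)


print(url_for_version("9.9.9b"))
```

With this approach both the directory and the file name are derived from the requested version, so they cannot drift apart.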

This command also accepts a --spider flag. If provided, Spack searches for other versions of the package and prints the matching URLs.

spack url list

This command lists every URL in every package in Spack. If given the --color and --extrapolation flags, it also colors the parts of the string that it detected as the name and version. The --incorrect-name and --incorrect-version flags can be used to print only the URLs that are not parsed correctly.

spack url summary

This command attempts to parse every URL for every package in Spack and prints a summary of how many of them are being correctly parsed. It also prints a histogram showing which regular expressions are being matched and how frequently:

$ spack url summary
==> Generating a summary of URL parsing in Spack...

    Total URLs found:          1739
    Names correctly parsed:    1545/1739 (88.84%)
    Versions correctly parsed: 1603/1739 (92.18%)

==> Statistics on name regular expressions:

    Index  Count  Regular Expression
      0:    323   r'github\\.com/[^/]+/([^/]+)'
      1:      3   r'gitlab[^/]+/[^/]+/([^/]+)'
      2:     16   r'bitbucket\\.org/[^/]+/([^/]+)'
      3:    270   r'pypi\\.(?:python\\.org|io)/packages/source/[A-Za-z\\d]/([^/]+)'
      5:      2   r'\\?package=([A-Za-z\\d+-]+)'
      7:   1124   r'^([A-Za-z\\d+\\._-]+)$'

==> Statistics on version regular expressions:

    Index  Count  Regular Expression
      0:   1257   r'^[a-zA-Z+._-]+[._-]v?(\\d[\\d._-]*)$'
      1:    193   r'^v?(\\d[\\d._-]*)$'
      2:     15   r'^[a-zA-Z+]*(\\d[\\da-zA-Z]*)$'
      3:      7   r'^[a-zA-Z+-]*(\\d[\\da-zA-Z-]*)$'
      4:     32   r'^[a-zA-Z+_]*(\\d[\\da-zA-Z_]*)$'
      5:     24   r'^[a-zA-Z+.]*(\\d[\\da-zA-Z.]*)$'
      6:    108   r'^[a-zA-Z\\d+-]+-v?(\\d[\\da-zA-Z.]*)$'
      7:      1   r'^[a-zA-Z\\d+-]+-v?(\\d[\\da-zA-Z_]*)$'
      8:      9   r'^[a-zA-Z\\d+_]+_v?(\\d[\\da-zA-Z.]*)$'
     10:     14   r'^(?:[a-zA-Z\\d+-]+-)?v?(\\d[\\da-zA-Z.-]*)$'
     11:      2   r'^[a-zA-Z+]+v?(\\d[\\da-zA-Z.-]*)$'
     12:      3   r'^[a-zA-Z\\d+_]+-v?(\\d[\\da-zA-Z.]*)$'
     13:      7   r'^[a-zA-Z\\d+.]+_v?(\\d[\\da-zA-Z.-]*)$'
     16:      2   r'^[a-zA-Z+-]+(\\d[\\da-zA-Z._]*)$'
     17:      1   r'bzr(\\d[\\da-zA-Z._-]*)$'
     18:      2   r'github\\.com/[^/]+/[^/]+/releases/download/[a-zA-Z+._-]*v?(\\d[\\da-zA-Z._-]*)/'
     19:      2   r'\\?ref=[a-zA-Z+._-]*v?(\\d[\\da-zA-Z._-]*)$'
     22:      2   r'\\?package=[a-zA-Z\\d+-]+&get=[a-zA-Z\\d+-]+-v?(\\d[\\da-zA-Z.]*)$'

This command is essential for anyone adding or changing the regular expressions that parse names and versions. By running this command before and after the change, you can make sure that your regular expression fixes more packages than it breaks.

Profiling

Spack has some limited built-in support for profiling, and can report statistics using standard Python timing tools. To use this feature, supply --profile to Spack on the command line, before any subcommands.

spack --profile

spack --profile output looks like this:

$ spack --profile graph dyninst
o  dyninst
|\
| |\
| | |\
| | o |  cmake
| | |\ \
| | o | |  openssl
| | | | o  boost
| | | |/| 
| | |/| | 
| | o | |  zlib
| |  / /
| | o |  ncurses
| | o |  pkg-config
| |  /
o | |  libdwarf
|/ /
o |  elfutils
 /
o  bzip2
         1078759 function calls (1023779 primitive calls) in 2.152 seconds

   Ordered by: internal time
   List reduced from 1061 to 20 due to restriction <20>

...

The bottom of the output lists the most time-consuming functions, slowest first. The profiling support comes from Python’s built-in tool, cProfile.
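A report of the same shape can be produced directly with cProfile and pstats. This is a minimal sketch, assuming an arbitrary busy_work function as the workload; whether Spack invokes the profiler with exactly these calls is not confirmed here.

```python
import cProfile
import io
import pstats


def busy_work(n):
    # Arbitrary workload so the profiler has something to measure.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = busy_work(100000)
profiler.disable()

# Sort by internal time and keep at most 20 entries, matching the
# "Ordered by: internal time" and "<20>" lines in the output above.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("time").print_stats(20)
report = stream.getvalue()
print(report)
```

Sorting by internal time ranks functions by the time spent in their own bodies, excluding time spent in the functions they call.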