Why dep?

There are a lot of software build automation tools out there. Why would you use Dep instead of one of the others?

The Dep Difference

Simply put, it's because of the differences between Dep and other tools.

That is rather a lot of comparisons to make, so let us start with one: Dep vs GNU Make.

Dep vs GNU Make

GNU Make is the most popular build tool out there, and with good reason: it is packed with useful features. But it does have some flaws, and the difference between GNU Make and Dep starts with those flaws. They are:

  1. GNU Make needs the build graph to be given to it in a Makefile. It has no way to determine the build graph on its own
  2. It does not easily handle build rules that produce multiple output files
  3. It does not compare output files for changes
  4. It does not handle subprojects very well (Peter Miller's paper "Recursive Make Considered Harmful" details why)
  5. It does not allow filenames to contain whitespace
  6. Parallelism is rudimentary
  7. It does not do any prioritization of tasks beyond topological sorting
  8. It does not cache any data
  9. It puts too much emphasis on manipulating builtin pattern rules by setting variables
  10. It starts lots of shell processes that just run a single command. It should just run the command
  11. Its language has some unnecessary complexities (the dreaded Tab character, .INTERMEDIATE / .SECONDARY / .PRECIOUS, multiple syntaxes for rules)
  12. Some features are pretty much dead ('::' rules, VPATH, recursively expanded variables)
  13. GNU Make is good at C and C++, but not so good at other programming languages.

With some effort, most of these issues can be worked around, entirely within GNU Make. But working around all of them, at once, reliably, is very hard indeed.

Dep currently addresses most of these issues. We will come back to how in a little while.

Dep vs other new build tools

A lot of new build tools have sprung up recently. Some of them have excellent features, such as:

Amongst the new build tools, another set of features quite often pops up:

We noted that GNU Make has some "unnecessary complexities"; these features of new tools are basically in the same category.

Moving on.

Sometimes, the new tools just get basic things wrong:

Let's get back to basics: When I am compiling a program, I want to run some commands. Compiler commands, to be sure, but otherwise, just commands. What sort of tool helps do that?

Dep vs shell scripts

What sort of tool is excellent at running a bunch of other programs? That would be a shell script.

So why not just use shell scripts?

Well, you can. But when it comes to building software, most build tools have a few upsides when compared to shell scripts:

  1. Build tools can perform incremental builds; that is, they can skip commands that don't need to be run
  2. Build tools can be directed to build just a particular target
  3. Build tools can run commands in dependency order
  4. Ideally, build tools can run some commands in parallel

These apply to build tools in general, and they also apply to Dep. But they don't necessarily apply to the tool's language. In Dep's case, a DepFile is little more than a list of commands.

That is, a DepFile is little more than a shell script.

Going back to GNU Make, we noted earlier that GNU Make needs the build graph to be given to it, explicitly, in a Makefile; it has no way to determine the build graph on its own.

Dep has a way: it determines the build graph by analysing the commands and source files.
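As a rough illustration of what "analysing the commands" can mean, the Python sketch below reads cc-style command lines, works out which files each one reads and writes, and then orders the commands so that producers run before consumers. It is not Dep's implementation. The helper names (inputs_and_outputs, build_order), the file names, and the option handling (-o names the output, -I takes a directory, any other non-option word is an input) are all assumptions made for this example.

```python
import shlex
from graphlib import TopologicalSorter


def inputs_and_outputs(command):
    """Split a cc-style command into (inputs, outputs).

    Assumption for this sketch: '-o FILE' names the output, '-I DIR' takes
    an argument, and every remaining non-option word is an input file.
    """
    words = shlex.split(command)[1:]  # drop the program name
    inputs, outputs = [], []
    i = 0
    while i < len(words):
        word = words[i]
        if word == "-o":
            outputs.append(words[i + 1])
            i += 2
        elif word == "-I":
            i += 2  # skip the include directory
        elif word.startswith("-"):
            i += 1  # option without a separate argument, e.g. -c or -Os
        else:
            inputs.append(word)
            i += 1
    return inputs, outputs


def build_order(commands):
    """Return the commands in an order that respects their file dependencies."""
    producer = {}  # output file -> command that creates it
    reads = {}     # command -> list of input files
    for cmd in commands:
        ins, outs = inputs_and_outputs(cmd)
        reads[cmd] = ins
        for out in outs:
            producer[out] = cmd
    # Each command depends on the commands that produce its inputs.
    graph = {cmd: {producer[f] for f in ins if f in producer}
             for cmd, ins in reads.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    commands = [
        "cc -o app main.o db.o",
        "cc -c -Os -o main.o main.c",
        "cc -c -Os -I /usr/include/postgres -o db.o db.c",
    ]
    for cmd in build_order(commands):
        print(cmd)
```

The link command is listed first but ends up last in the computed order, because both of its inputs are outputs of other commands. Nothing beyond the commands themselves was needed to find that out.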

2: The Dep way

Going back to GNU Make and other new build tools, they tend to provide a variety of ways of specifying build rules, where a build rule is a set of inputs, outputs and commands.

Dep splits the process in two:

This is probably Dep's most novel feature.

A couple of examples might be in order:

Example 1: A compile command

Let's say you wish to set some flags for a particular compile command. In GNU Make, you have some options. You might:

  1. Write out the rule in full:
    db.o: db.c
            cc -c -Os -I /usr/include/postgres -o db.o db.c
  2. Use automatic variables to shorten the rule a bit:
    db.o: db.c
            cc -c -Os -I /usr/include/postgres -o $@ $<
  3. If there are a lot of similar rules, you might write your own pattern rule:
    %.o: %.c
            cc -c -Os -I /usr/include/postgres $< -o $@
  4. Or you might use a built-in pattern rule, and tweak the rule by setting some FLAGS values:
    CFLAGS := -Os
    CPPFLAGS := -I /usr/include/postgres

    And here is the built-in pattern rule that we are using:

    %.o: %.c
            $(CC) -c $(CFLAGS) $(CPPFLAGS) $< -o $@

With Dep, you can:

  1. Write out the command in full:
    cc -c -Os -I /usr/include/postgres -o db.o db.c

    In Dep, most often, a build rule is just a build command. But in order to work, Dep has to extract the input and output information from the command. It does so by using one of its own builtins, a syntax rule:

    syntax -mixed cc -g -f: -I:%includedir -W: -o:%out -c -l:%library -L:%libdir -.* %in

    This syntax rule describes the arguments of the cc command well enough for Dep to obtain everything else it needs, namely the inputs and outputs of the rule. The "%out" keyword indicates that the -o option specifies an output file, and "%in" shows that any non-option argument specifies an input file.

    (Two other keywords, %library and %libdir, indicate the options for link libraries and library folders.)

  2. If there are a lot of similar rules, you might write your own pattern rule:
    cc -c -Os -I /usr/include/postgres -o %.o %.c
  3. If you have a build command that does not clearly specify its inputs and outputs (or is the only command of its kind, and would be simpler to just write out in full), you can add the inputs and outputs to the command itself:
    gcc -c -Os -I /usr/include/postgres -o db.o db.c %in db.c %out db.o
  4. And if you have some other situation that is not solved by pattern rules, you can set and use your own variables:
    CFLAGS=-Os
    CPPFLAGS=-I /usr/include/postgres
    cc -c $CFLAGS $CPPFLAGS -o %.o %.c

    Although this technique is more relevant in the presence of loop and/or condition constructs:

    for platform in linux64 w64 {
      if $platform = linux64 {
        SO=so
      }
      else {
        SO=dll
      }
      cc -o libfoo.$SO foo.o bar.o
    }

    Some more notes:

That's fine for .c files, but it is also important that .h files are included in the build graph. With GNU Make and GCC, one can get the GCC compiler to output lists of header file dependencies, and get GNU Make to read them.

Dep takes a different approach: it reads the #include directives itself. It does so by using another of its builtins, a findincludes rule:

findincludes '#include[ \t]+"(.*)"[ \t]*\r?' *.c *.cpp *.h

Dep does this recursively for each header file found.
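The recursive scan can be sketched in a few lines of Python. This is only an illustration of the idea, not Dep's code: the function name find_includes is hypothetical, headers are assumed to sit relative to the file that includes them (no -I search path), and only quoted #include lines are matched, in the spirit of the pattern above.

```python
import re
from pathlib import Path

# Quoted includes only, modelled on the findincludes pattern shown above.
INCLUDE_RE = re.compile(r'#include[ \t]+"(.*)"')


def find_includes(source, seen=None):
    """Return the set of headers reachable from `source` via #include "..." lines.

    Assumption for this sketch: each header is looked up relative to the
    directory of the file that includes it.
    """
    seen = set() if seen is None else seen
    source = Path(source)
    for line in source.read_text().splitlines():
        match = INCLUDE_RE.search(line)
        if not match:
            continue
        header = source.parent / match.group(1)
        if header in seen or not header.exists():
            continue  # already visited (guards against cycles) or not found
        seen.add(header)
        find_includes(header, seen)  # recurse into the header itself
    return seen
```

Calling find_includes("db.c") would return db.c's direct and indirect headers. The seen set keeps the recursion from looping when two headers include each other.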

Also, instead of tabulating include files as extra source files for a target, Dep models them as what they are: include files for a source. Dep also does this in the presence of differing -I options. This all results in a smaller build graph that is faster to process.

Note that this technique does not take into account conditional compilation, e.g. a #include directive within a #if directive. This sometimes results in some rebuilds that aren't strictly necessary. But for a few reasons, this does not matter terribly much:

Let's quickly make a few comparisons between Dep and GNU Make, and see if Dep addresses GNU Make's issues:

Example 2: Multiple outputs and compared outputs

Let's say you have a command that is a bit more complicated: it produces two output files, one of which changes very rarely, and should be compared for changes before triggering recompiles. With the latest version of GNU Make, you might use this rule:

fooparse.c fooparse.h &: fooparse.y
        bison -o fooparse.c --defines=fooparse.h $<

This constructs a single rule that produces two output files, thanks to GNU Make's new '&:' operator. However, fooparse.h changes rarely, and it would be good if it could be checked for differences before rebuilding anything that depends on it. GNU Make has no good solution for this (there are some partial solutions, but they are all a bit messy).

Here is the equivalent rule in Dep:

bison -o fooparse.c --defines=fooparse.h fooparse.y

Once again, dep uses a builtin syntax rule to determine the inputs and outputs:

syntax bison -W: -t -v -o:%out --defines=%cmpout %in

There is a new placeholder, %cmpout. This is like %out, except that the file is checked for differences before triggering a rebuild of its dependencies. This last bit of information is also stored in .DepFileCache, so that Dep knows not to rebuild any fooparse.h dependencies on subsequent invocations.
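The general idea of a compared output can be sketched in Python as the familiar "move if changed" idiom. This is an assumption about the technique in general, not a description of how Dep implements %cmpout, and the function name install_if_changed is hypothetical. The command is imagined to write to a temporary file first; the real output is replaced only when the contents differ.

```python
import filecmp
import os


def install_if_changed(new_file, target):
    """Replace `target` with `new_file` only when their contents differ.

    Returns True when `target` was updated, meaning anything built from it
    needs a rebuild, and False when the existing file was left untouched.
    """
    if os.path.exists(target) and filecmp.cmp(new_file, target, shallow=False):
        os.remove(new_file)  # identical output: discard it, keep the old file
        return False
    os.replace(new_file, target)  # new or changed output: install it
    return True
```

A build tool could use the return value to decide whether the files that include fooparse.h need recompiling after bison has run.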

But dep caches much more data, and uses this data for many purposes:

Let's quickly compare Dep and GNU Make again:

Example 3: Subprojects

In Make, if you want to maintain and update subprojects along with a main project, you either use recursive make, or you can construct Makefiles that span multiple directories (Peter Miller went into some detail as to how).

In Dep, you add an include statement to your DepFile:

include ../quux/DepFile

Dep determines the relationships between files in subdirectories (and sibling directories) by adjusting their filenames as appropriate. Before Dep runs a build command in another directory, it changes via chdir() to that directory.

In ../quux/DepFile, there might be a reference to a file called quux.h . In the DepFile in the current folder (e.g. foo), there might be a reference to a file called ../quux/quux.h . Dep takes into account the relative filenames from differing folders, and always adjusts them to filenames relative to the working directory. Despite having different filenames, they are the same file. The effect is as if Dep always uses fully qualified names.
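The filename adjustment can be illustrated with a short Python sketch. The function name canonical is hypothetical and this is not Dep's code; it only shows how two different spellings can be reduced to one name relative to the working directory.

```python
import os.path


def canonical(depfile_dir, filename):
    """Resolve `filename`, as written in a DepFile located in `depfile_dir`,
    to a normalized path relative to the working directory."""
    return os.path.normpath(os.path.join(depfile_dir, filename))


if __name__ == "__main__":
    # Working directory is foo; both DepFiles refer to the same header.
    from_quux = canonical("../quux", "quux.h")     # written in ../quux/DepFile
    from_foo = canonical(".", "../quux/quux.h")    # written in ./DepFile
    print(from_quux == from_foo)
```

Once every name is normalized this way, a plain string comparison is enough to tell that the two references point at the same node in the build graph.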

Chapter 4 (Libraries and Folders) of the Tutorial goes into more detail.

Comparing Dep and GNU Make again, let us say that Dep handles subprojects relatively well (point 4).

Status Check

In three examples we have shown how Dep addresses seven of GNU Make's thirteen enumerated issues.

These are: 1 (build graph construction), 2 (multiple outputs), 3 (compared output), 4 (subprojects), 8 (cache), 9 (overuse of implicit builtin rules) and 12 (dead features).

We won't go deeply into the remaining issues; we will just summarize how Dep addresses them. They are:

Dep partly addresses one GNU Make issue:

And Dep does not currently address one GNU Make issue:

Where to from here?

Dep is a work-in-progress, and there is still much to do. In no particular order: