The core pmfile language

1. Introduction

This document describes, concisely but with luck clearly, the core language used to write pmfiles. It does not describe how to use pm or how to deploy it.

Prime Mover is written in Lua; in fact, pmfiles are Lua scripts. It is, however, not necessary to know how to write programs in Lua to use pm.

Nevertheless, sometimes it's useful to embed Lua statements in your pmfiles so that you can do advanced things, such as automatically picking appropriate source files based on the architecture that's being used.

Blocks of text in these boxes contain details that may be useful to people who already know Lua. If you don't know Lua, or don't care, then please ignore them.

2. Syntax

pmfiles are composed of a series of statements. Statements may be comments, directives or assignments, each of which is described below.

Statements do not have termination characters, and may be split over multiple lines; pm will determine whether statements extend across multiple lines by whether they make syntactic sense or not.

Blank lines are ignored, of course.

2.1. Comments

Comments in pm start at a double hyphen (--) and extend to the end of the line.

-- This is a comment.

You may also use block comments, which extend over multiple lines. They start with --[[ and end with --]].

--[[ This is a
block comment. --]]

2.2. Directives

Directives cause pm to do something immediately. There is currently only one supported directive:

2.2.1. include

Causes pm to read in another named pmfile. include takes a single argument, which must be a " or ' quoted string.

include "anotherfile.pm"
include 'lib/c.pm'

The argument must be a constant string. String expansion is not performed.

The file is read in and executed with the same global environment as the caller. This means that included files share globals, but as they are a separate chunk, they will have their own local variables.
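The difference between shared globals and private locals can be seen in plain Lua, without pm. The sketch below uses two chunks compiled from strings to stand in for an included pmfile and the pmfile that includes it; the chunk contents and the names CFLAGS and scratch are made up for illustration, and this is not how include itself is implemented.

```lua
-- Lua 5.1 compiles strings with loadstring; later versions use load.
local compile = loadstring or load

-- Stands in for the included pmfile: sets one global and one local.
local included = compile([[
  CFLAGS = {"-g"}          -- global: visible to the includer
  local scratch = "temp"   -- local: private to this chunk
]])

-- Stands in for the including pmfile, which carries on afterwards.
local includer = compile([[
  return CFLAGS[1], scratch
]])

included()
local flag, leaked = includer()
print(flag, leaked)
```

The global set by the first chunk is visible to the second, while the local is not, so leaked comes back as nil.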

2.3. Assignments

Assignments assign a value to a global property. The bulk of your pmfile will consist of assignments.

The syntax is simple:

name = value

You may assign any value to any name (that does not conflict with a reserved word).

Global properties are Lua global variables; entries in the global environment.

2.4. Names

Names are used across Prime Mover to refer to properties. They may contain any combination of letters, numbers and underscores, but may not start with a number. They are case sensitive. The following names are reserved and may not be used:

and       break     do        else      elseif
end       false     for       function  if
in        local     nil       not       or
repeat    return    then      true      until
while     io        string    table     os
posix     pm        EMPTY     PARENT    REDIRECT

In addition, all names beginning with a double underscore are reserved for internal use by Prime Mover and should not be used.

2.5. Values

Values in pm can be of these four types: strings, lists, booleans and rule instantiations.

Numbers are also supported, but they are equivalent to strings and are only used in rare circumstances.

2.5.1. Strings

Strings are expressed as traditional sequences delimited with double or single quotation marks. The escape character is \, with \n and \t having the usual meaning.

"a string"
'a string'
"mustn't"
'mustn\'t'
"multiple\nlines"

2.5.2. Lists

Lists are comma-separated sequences of strings between {} characters. Lists may be empty, and a trailing comma may be left on.

{}
{"one", "two", "three"}
{"foo",}

The special value EMPTY may be used to represent a particular kind of empty list, used in string expansion.

The first entry of a list can also be one of the special terms PARENT and REDIRECT. These have meaning when doing property lookups.

Prime Mover lists are, obviously, Lua tables with only numeric keys.

2.5.3. Booleans

Booleans consist of the literal words true or false.

true
false

They are normally used to switch on or off features via local properties.

2.5.4. Rule instantiation

A rule instantiation consists of the name of an existing rule, plus a list of modifiers between {} characters.

rulename {
    foo = "1",
    bar = "2",
    "3",
    "4",
}

Unnamed modifiers are numbered from 1 and specify children. Named modifiers specify local properties if the name is in lower case and inherited properties if the name is in upper case. The order of the modifiers is irrelevant, and properties and children may be mixed, but children are always numbered in order of appearance.

Rule names refer to global properties that are themselves rule instantiations.

Rule instantiations are Lua functions that take a single parameter, a literal table containing modifiers to apply to the newly created rule.
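For readers who know Lua, the way a modifier table splits into children, local properties and inherited properties can be sketched with an ordinary function. This is only an illustration: classify is a hypothetical helper, not part of pm, and treating any name without lower-case letters as upper case is an assumption on my part.

```lua
-- Hypothetical helper: split a modifier table the way the text describes.
local function classify(modifiers)
  local result = { children = {}, locals = {}, inherited = {} }

  -- Unnamed modifiers occupy the integer keys 1..n, in order of appearance.
  for index, child in ipairs(modifiers) do
    result.children[index] = child
  end

  -- Named modifiers are string keys; their order does not matter.
  for key, value in pairs(modifiers) do
    if type(key) == "string" then
      if key == key:upper() then
        result.inherited[key] = value   -- upper case: inherited property
      else
        result.locals[key] = value      -- lower case: local property
      end
    end
  end

  return result
end

-- Same call syntax as a rule instantiation: a function applied to a table.
local r = classify { foo = "1", CFLAGS = {"-g"}, "3", bar = "2", "4" }
print(#r.children, r.locals.foo, r.inherited.CFLAGS[1])
```

Note that "3" and "4" become children 1 and 2 even though named modifiers appear between them.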

3. Semantics

3.1. Overview

When Prime Mover's rule engine runs, it looks at the rule that the user specifies (which is default by default), and performs a depth first traversal of all rules it references via its children. At each stage it checks to see if the rule needs to do anything, and if so, it causes the rule to perform its action.

Specifying the behaviour of each rule is done by subclassing an existing rule and modifying its behaviour using the property system. Prime Mover implements both a class-based property inheritance system and a call stack based system. It is possible for a rule to override a property in such a way that it applies to all rules that are called by that rule. This allows a rule to, for example, build two copies of the same application with different flags, by simply invoking the build rule twice with a property set differently.

Most of Prime Mover's useful functionality exists in the default library.

3.2. The rule engine

When pm wishes to build a rule, it applies the following algorithm to that rule.

build all children of the rule
are any of this rule's inputs newer than this rule?
  yes -> execute this rule's command
does this rule have an install property?
  yes -> execute this rule's installation instructions

The timestamp of a rule is considered to be the newest of all the rule's outputs. A missing output is considered to be infinitely old.

The result of this is a depth first traversal of the dependency tree. A rule's dependencies will be built before the rule is, and will always be built in the order supplied when the rule was instantiated.

Because all children under a rule are visited every time the rule is visited, extremely large dependency trees may cause the same rule to be visited a very large number of times, which can be slow.
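The algorithm, and the repeated visiting it causes, can be modelled in a few lines of standalone Lua. This is a toy sketch and not pm's real engine: rules are plain tables, the time field stands in for a file timestamp, and the names build, visits and builds are invented for the example.

```lua
-- Counts how often each rule is visited.
local visits = {}

local function build(rule)
  visits[rule.name] = (visits[rule.name] or 0) + 1

  -- Build all children first, in the order they were supplied.
  local newest_input = -math.huge
  for _, child in ipairs(rule) do
    build(child)
    newest_input = math.max(newest_input, child.time)
  end

  -- A missing output (time == nil) counts as infinitely old.
  local own_time = rule.time or -math.huge
  if newest_input > own_time then
    rule.builds = (rule.builds or 0) + 1   -- stand-in for running the command
    own_time = newest_input
  end
  rule.time = own_time

  -- Installation happens on every visit, whether or not anything was built.
  if rule.install then
    rule.install(rule)
  end
end

local slow_library = { name = "slow_library", time = 5,
                       { name = "lib.c", time = 9 } }
local program1 = { name = "program1", { name = "file1.c", time = 1 }, slow_library }
local program2 = { name = "program2", { name = "file2.c", time = 2 }, slow_library }
local default = { name = "default", program1, program2 }

build(default)
print(visits.slow_library, slow_library.builds)
```

In this model slow_library is visited once per program that refers to it, but its command only runs on the first visit, because after that its timestamp is no older than its input.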

As a result, it is possible to take advantage of the order of execution to prune the dependency tree. For example, consider the following:

slow_library = lib { ... }
program1 = cprogram { cfile "file1.c", slow_library }
program2 = cprogram { cfile "file2.c", slow_library }
program3 = cprogram { cfile "file3.c", slow_library }
default = group { program1, program2, program3 }

When default is built, slow_library will be visited three times. It will only actually be built once, and only if needed, but it will still cause pm to examine its source and object files three times. The above could be rewritten as follows:

slow_library = lib { ..., install = pm.install("libslow.a") }
program1 = cprogram { cfile "file1.c", "libslow.a" }
program2 = cprogram { cfile "file2.c", "libslow.a" }
program3 = cprogram { cfile "file3.c", "libslow.a" }
default = group { slow_library, program1, program2, program3 }

When default is now built, slow_library will be visited once, and then the three uses of it will refer directly to its output. If one of slow_library's source files is changed, then when default is invoked it will get rebuilt automatically, which will cause libslow.a to be refreshed, which will in turn trigger rebuilding of the three programs.

However, if program1 is invoked directly, slow_library will not be built, because pm has not been explicitly told about the dependency.

As a result, I strongly recommend avoiding this idiom unless you really need it.

pm performs aggressive caching on timestamps; it will only look at the actual file on disk the first time it needs to determine the file's timestamp. It will then remember the result. If the file is used as an output of a rule that has been built, then the file will be considered to be up-to-date, regardless of the timestamp that the rule actually wrote to disk.
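The caching behaviour amounts to memoising the timestamp lookup. The following is a minimal sketch under assumptions: new_timestamp_cache, timestamp and mark_built are hypothetical names, and read_timestamp is a function supplied by the caller that stands in for whatever pm uses to query the disk.

```lua
-- read_timestamp(filename) returns a number, or nil if the file is missing.
local function new_timestamp_cache(read_timestamp)
  local known = {}
  local cache = {}

  -- Only the first request for a file touches the disk.
  function cache.timestamp(filename)
    if known[filename] == nil then
      -- A missing file is remembered as infinitely old.
      known[filename] = read_timestamp(filename) or -math.huge
    end
    return known[filename]
  end

  -- After a rule has been built, its output is recorded as current
  -- without looking at what was actually written to disk.
  function cache.mark_built(filename, now)
    known[filename] = now
  end

  return cache
end

-- Usage with a fake disk that counts how often it is consulted.
local reads = 0
local cache = new_timestamp_cache(function(filename)
  reads = reads + 1
  if filename == "a.c" then return 100 end
end)

cache.timestamp("a.c")
cache.timestamp("a.c")
print(reads)
```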

3.3. The intermediate cache

pm stores all intermediate files in a hidden directory.

pm keeps track of the command used to generate each intermediate file, and will only execute each command once. The same source file, built with two different compiler command lines, will result in two different intermediate files. This allows multiple versions of a program to be built side-by-side without any object file collision.

The contents of the intermediate cache should be considered as internal to pm. While the names of the files are human readable, this is merely to aid debugging. No legitimate build process will ever need to directly refer to files in the intermediate cache.

As a result of this, pm needs to be told which files are the result of a build, so that it can copy them out of the intermediate cache and into the user's desired location. This is done using the install property on a node.

Each time a rule is visited, its install property is consulted. If this exists, then the contents are executed. install normally contains a list of commands to be executed; see the documentation for node for more information on the exact syntax.

For example:

myprogram = cprogram {
  cfile "file.c",
  install = pm.install("a.out")
}

When built, this rule will compile file.c as a C program and then install the result to a.out in the current directory.

install is executed every time the rule is visited. This means that if the rule is visited several times, then the file will be installed several times. As a result, care should be taken so that rules with install set are only visited once; following on from the above example, this can lead to confusion:

default = group {
  group { CFLAGS="-DFOO", myprogram },
  group { CFLAGS="-DBAR", myprogram }
}

This is going to build two versions of myprogram, one with FOO defined and one with BAR defined. Each one will be installed as a.out. It is not defined which one wins.

4. String expansion

4.1. Introduction

Just before filenames or command lines are processed by Prime Mover, they undergo string expansion. This causes marked sections of the strings to be replaced with other values; this is equivalent to variable expansion in make. String expansion always occurs in the context of the rule currently being processed; the value of the expanded string will depend on the rule's properties.

Each string expansion consists of the name of a property delimited with % characters. The name may have an optional selector clause, delimited with [] characters, and an optional modifier clause, separated with a : character:

%NAME%
%NAME[selector]%
%NAME:modifier%
%NAME[selector]:modifier%

For example, the following strings are candidates for string expansion:

%OPTIMISATION%
%inputs[1]%
%FILENAME:basename%

The algorithm used to perform string expansion is as follows.

For each %...% clause:
 - fetch the value referred to by the name.
 - if the value is a list, apply the selector.
 - if a modifier is present, apply the modifier.
 - replace the %...% clause with the result.

If the result is a list, and is not one of the special forms described below, then the replacement text consists of all items in the list, quoted and separated with whitespace. If the result is not a list, the replacement text consists of that item, unquoted.

The distinction between single-item lists (which are quoted) and strings (which are not quoted) is important, and is used extensively to build command lines from multiple string expansions. For example, the C plugin uses the CC property to store the command line used to invoke the C compiler:

CC = "%COMPILER% %INCLUDES% -c -o %out[1]% %in[1]%"

When the C plugin expands "%CC%", this will be replaced with the literal string above. If quoting were to happen, the end result would not be a valid command line.

This is known to be inelegant.

String expansion occurs recursively until there are no more %...% clauses to be expanded. This does mean that if a string expansion refers to itself, an infinite loop will result.
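The quoting rule and the recursion can be illustrated with a small standalone function. This is a simplified sketch, not pm's implementation: expand is a hypothetical name, properties are read from a flat table rather than from a rule, and selectors, modifiers and the special forms are left out.

```lua
local function expand(text, properties)
  local changed = true
  -- Keep going until a pass finds no %NAME% clause left to replace.
  -- As in pm, a property that refers to itself will loop forever.
  while changed do
    changed = false
    text = text:gsub("%%([%w_]+)%%", function(name)
      changed = true
      local value = properties[name]
      if value == nil then
        error("property '" .. name .. "' not found")
      end
      if type(value) == "table" then
        -- Lists: every item is quoted, items are separated with spaces.
        local quoted = {}
        for index, item in ipairs(value) do
          quoted[index] = '"' .. item .. '"'
        end
        return table.concat(quoted, " ")
      end
      -- Anything else is inserted unquoted.
      return tostring(value)
    end)
  end
  return text
end

local properties = {
  CC = "%COMPILER% %INCLUDES% -c",
  COMPILER = "gcc",
  INCLUDES = { "-Iinc", "-I." },
}
print(expand("%CC%", properties))
```

Because CC is a string, its contents are inserted unquoted and then expanded again on the next pass; INCLUDES is a list, so its items are quoted.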

There is also a third string expansion form.

%{text}%

When a string expansion of this form is encountered, then text is executed as a chunk of Lua code, and the return value is used as the result. The code uses a read-only copy of the pmfile's global environment with self set to the current rule. Rule properties may be looked up by calling the __index() method on the rule. For example:

"the number is %{return 1 + 1}%"
"CFILE is set to %{return self:__index('CFILE')}%"

When a property is looked up in this way, it uses the same mechanisms as string expansion, but the value is not converted into a string and remains a table or raw value. (This means that {PARENT, ...} and {REDIRECT, ...} are honoured.)

If you want a literal % in a string, then use %%. (Only in version 0.1.4 and above.)

4.2. Property lookup

The first stage of string expansion is to look up the value of the property being referred to. The way this happens depends on the name of the property.

Property names in lower case refer to local properties. Property names in upper case refer to inherited properties.

The algorithm used is as follows:

does the named property exist on the current rule?
  yes -> return it
  no -> try again recursively on the rule's superclass
if still not found and the name refers to an inherited property
  does the named property exist on the current rule's caller?
    yes -> return it
    no -> try again recursively up the rule's call stack
  if still not found
    does the named property exist as a global property?
      yes -> return it
      no -> fail with an error message

To summarise: all properties are first looked up via the rule's class hierarchy; inherited properties are then looked up via the rule's call stack; if all else fails, inherited properties take the value of the appropriately named global property.
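The lookup order can also be written as a short standalone Lua function. This is a sketch under assumptions: rules are modelled as tables with properties, superclass and caller fields (names invented here), each caller is searched through its own class hierarchy, and a name counts as inherited if it contains no lower-case letters.

```lua
-- Search a rule and then its superclasses.
local function lookup_class(rule, name)
  while rule do
    local value = rule.properties[name]
    if value ~= nil then return value end
    rule = rule.superclass
  end
  return nil
end

local function lookup(rule, name, globals)
  -- 1. All properties: the rule's class hierarchy.
  local value = lookup_class(rule, name)
  if value ~= nil then return value end

  if name == name:upper() then
    -- 2. Inherited properties only: walk up the call stack.
    local caller = rule.caller
    while caller do
      value = lookup_class(caller, name)
      if value ~= nil then return value end
      caller = caller.caller
    end
    -- 3. Last resort: the global property of the same name.
    value = globals[name]
    if value ~= nil then return value end
  end

  error("property '" .. name .. "' not found")
end

local base = { properties = { outputs = { "%U%-%I%.o" } } }
local outer = { properties = { CFLAGS = { "-g" }, debug = true } }
local inner = { properties = {}, superclass = base, caller = outer }
local globals = { OPTIMISATION = "-O2" }

print(lookup(inner, "CFLAGS", globals)[1])
```

Here outputs is found through the superclass, CFLAGS through the caller and OPTIMISATION through the globals, while the caller's lower-case debug property is not visible to inner.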

To rephrase:

4.3. Selectors

Selectors are used to extract particular items from a list. It is not legal to use a selector on a value that is not a list.

The syntax is:

[from-to]
[index]

For the first form:

from and to are numeric constants identifying the first and last item from the list. Entries are counted from 1. If values are specified, they must be within the bounds of the list.

from and to may also be omitted, in which case they default to the start and end of the list, respectively. "%name[2-]%" will look up name and return all but the first item.

The selector [-] (with both from and to omitted) will return the entire list, and is equivalent to not having a selector at all.

For the second form:

index is a numeric constant identifying the single index of the item to extract from the list. It is equivalent to [index-index]. The selector will return a single-entry list containing the item, not the item itself.
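Both selector forms can be sketched as one standalone Lua function. The name select_items and the error messages are invented for this illustration; the selector is passed without its surrounding [] characters.

```lua
local function select_items(list, selector)
  -- First form: from-to, where either end may be omitted.
  local from, to = selector:match("^(%d*)%-(%d*)$")
  if not from then
    -- Second form: index, which is shorthand for index-index.
    from = selector:match("^(%d+)$")
    if not from then
      error("malformed selector: " .. selector)
    end
    to = from
  end

  -- Omitted ends default to the start and end of the list.
  local first = (from == "") and 1 or tonumber(from)
  local last = (to == "") and #list or tonumber(to)

  -- Values that were actually specified must lie within the list.
  if (from ~= "" and (first < 1 or first > #list))
      or (to ~= "" and (last < 1 or last > #list)) then
    error("selector out of bounds: " .. selector)
  end

  -- The result is always a list, even when it holds a single item.
  local result = {}
  for index = first, last do
    result[#result + 1] = list[index]
  end
  return result
end

local items = { "a", "b", "c" }
print(table.concat(select_items(items, "2-"), " "))
```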

4.4. Modifiers

Modifiers are functions that can be applied to the result of the string expansion before the replacement takes place. Each modifier is a name referring to a function. They do not take arguments.

It is possible for a pmfile to add its own string modifiers, by defining functions in the pm.stringmodifier table. Each function takes two arguments; the current rule, and the value the modifier is to be applied on. For example:

function pm.stringmodifier.typeof(rule, argument)
    return type(argument)
end
"The type is %NAME:typeof%"

It is strongly recommended that string modifiers work equivalently on raw strings and single-item lists.

The following string modifiers are available:

4.4.1. dirname

Takes a string, or a list with one item. Assumes the argument contains a pathname and returns the directory part of it only.
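To show what working equivalently on raw strings and single-item lists can look like, here is a rough standalone sketch of a dirname-like modifier. It is not pm's own dirname: the stringmodifier table stands in for pm.stringmodifier, dirname_sketch is a made-up name, and returning "." for a path without a directory part and a list for list input are assumptions.

```lua
local stringmodifier = {}   -- stands in for pm.stringmodifier

function stringmodifier.dirname_sketch(rule, argument)
  -- Accept either a raw string or a list with exactly one item.
  local is_list = type(argument) == "table"
  local path = argument
  if is_list then
    assert(#argument == 1, "expected a list with exactly one item")
    path = argument[1]
  end

  -- Everything before the last slash is the directory part.
  local dir = path:match("^(.*)/[^/]*$")
  if dir == nil then
    dir = "."      -- no slash at all (assumed behaviour)
  elseif dir == "" then
    dir = "/"      -- file directly under the root
  end

  -- Hand back the same shape that was passed in.
  if is_list then
    return { dir }
  end
  return dir
end

print(stringmodifier.dirname_sketch(nil, "src/lib/foo.c"))
```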

4.5. Special forms

The following property values are special.

4.5.1. EMPTY

This is a special form of the empty list. It differs from {} in that, when expanded, {} produces "" and EMPTY produces nothing.

For example: a rule may invoke a C compiler as follows:

"gcc -o %O% %I% %CFLAGS%"

%O% expands to the name of the output file, %I% expands to the name of the input file, and %CFLAGS% expands to a list of compiler options. If I="in.c", O="out.o" and CFLAGS={"-g", "-Os"}, then this would produce:

gcc -o out.o in.c "-g" "-Os"

However, if no flags were defined, and CFLAGS defaulted to {}, then this would produce:

gcc -o out.o in.c ""

This is incorrect. The last argument would be treated by gcc as a blank source filename, and an error would result. To prevent this, the default value of CFLAGS should be EMPTY. This would cause the expanded string to be suppressed, resulting in the correct string:

gcc -o out.o in.c

4.5.2. {PARENT, values...}

This form is used to append values to an existing list. It only works on inherited properties.

When the call stack is being traversed while looking for an inherited property, normally traversal stops at the first value found. However, if the value is a list of the special form described here, then traversal continues, and the values specified get appended to the older values.

For example:

CFLAGS = EMPTY
default = rule1 {
  CFLAGS = {"-g"},
  rule2 {
    CFLAGS = {PARENT, "-Os"},
    string = "%CFLAGS%",
    ...
  }
}

When string is expanded in the instantiation of rule2, the result will be {"-g", "-Os"}.

{PARENT, ...} forms may be nested arbitrarily.
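The continued traversal can be modelled in standalone Lua as follows. This is a sketch only: the call stack is reduced to an array of property tables with the innermost rule first, the local PARENT is a marker table standing in for pm's own PARENT, and lookup_inherited is a hypothetical name.

```lua
local PARENT = {}   -- unique marker standing in for pm's PARENT

-- frames[1] is the innermost rule; later entries are its callers.
local function lookup_inherited(frames, name, depth)
  depth = depth or 1
  local frame = frames[depth]
  if frame == nil then
    return {}   -- ran off the top of the stack (assumed: nothing to append to)
  end

  local value = frame[name]
  if value == nil then
    return lookup_inherited(frames, name, depth + 1)
  end
  if type(value) ~= "table" or value[1] ~= PARENT then
    return value   -- ordinary value: traversal stops here
  end

  -- {PARENT, ...}: keep walking, then append our values to the older ones.
  local older = lookup_inherited(frames, name, depth + 1)
  local result = {}
  for _, item in ipairs(older) do
    result[#result + 1] = item
  end
  for index = 2, #value do
    result[#result + 1] = value[index]
  end
  return result
end

local frames = {
  { CFLAGS = { PARENT, "-Os" } },   -- rule2
  { CFLAGS = { "-g" } },            -- rule1
  { CFLAGS = {} },                  -- globals
}
print(table.concat(lookup_inherited(frames, "CFLAGS"), " "))
```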

4.5.3. {REDIRECT, name}

This form, when seen, causes another named property to be looked up instead of the current one. For example:

CFLAGS = {"-g", "-Os"}
CXXFLAGS = {REDIRECT, "CFLAGS"}

Unlike string expansion, this allows {PARENT, ...} to append items to the list. (String expansion forces the result into a string form.)

5. The standard library

The following rules are supplied as part of the Prime Mover core functionality.

5.1. node

The ultimate superclass of all Prime Mover rules.

It is very unlikely that anyone will ever want to instantiate or subclass node; it does not actually do anything useful. However, it provides a number of services that subclasses will use.

Recognised properties

ensure_n_children

A number. If set, then when this rule is instantiated, a check will be made that it has exactly this number of children.

ensure_at_least_one_child

A number. If set, then when this rule is instantiated, a check will be made that it has at least one child.

construct_string_children_with

A rule. If set, then any children that are bare strings and not rules will be replaced with instantiations of the rule, such that a child "child" will be replaced with rule "child".

Defaults to file.

all_children_are_objects

A boolean. If set, then when this rule is instantiated, a check will be made that all children are rules and not bare strings. This check is done after construct_string_children_with is processed.

Defaults to true. You will almost certainly never want to turn this off.

class

A string. Contains the name of this class.

This property has two purposes: firstly, it is used to produce meaningful error messages; and secondly, if this property is set when instantiating a rule, it signals Prime Mover that you are defining a new rule class and not producing a rule that is going to be used directly. This causes the checks described above (and others) to be bypassed.

install

A string, or pm.install(), or a list of strings or pm.install(). Indicates special action to be taken after building this rule.

Strings are treated as shell commands. String expansion is performed normally; %out% expands to a list of the current rule's outputs.

The special value pm.install(src, dest) may be used instead of a string. When executed, this causes pm to perform an optimised copy, and is the preferred way of doing an installation. src and dest are strings specifying the source and destination files; string expansion is performed normally. src may be omitted, in which case it defaults to %out[1]% (the first of the rule's output files).

Strings and pm.install may be combined at will in a list.

Any commands specified in install are executed every time the rule is visited by pm, even if nothing actually needs to be built. Don't use any commands that consume significant amounts of time to execute.

5.2. file

Refers to a static file on the filesystem.

Does nothing when built. Is as old as the file is.

This rule is used in the leaves of the dependency tree to refer to source files. It is illegal for a file to refer to a non-existent file.

Superclass

node

Inputs

One or more constant strings, that are filenames relative to Prime Mover's current directory.

Outputs

The files referred to.

Recognised properties

only_n_children_are_outputs

A number. If set, then specifies the number of children to be used as outputs; any additional children will be ignored.

Obsolete and will go away soon. Use ith instead.

5.3. group

Groups together a number of children. Does nothing when built.

Superclass

node

Inputs

An arbitrary number of children.

Outputs

The same as its inputs.

Example

rule1 = ...builds something...
rule2 = ...builds something else...

-- Building default causes both rule1 and rule2 to be built.
default = group {
    rule1,
    rule2
}

This rule is also commonly used to allow properties to be overridden when calling a child.

group is equivalent to ith with no parameters.

5.4. ith

Selects one or more of its children. Does nothing when built.

Superclass

node

Inputs

An arbitrary number of children.

Outputs

Some or all of its inputs.

Recognised properties

ith responds to the following local properties:

i

A number describing the single child to select. May not be used in combination with from and to.

from

A number describing the first child of a range to select. May not be used in combination with i.

to

A number describing the last child of a range to select. May not be used in combination with i.

from and to default to the first and last child respectively. ith with no parameters is equivalent to group.
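The selection rules can be sketched as a plain Lua function over a flat list of inputs. ith_select and its options table are hypothetical and only mirror the i, from and to properties described above; this is not how ith is implemented.

```lua
local function ith_select(inputs, options)
  options = options or {}

  -- i picks a single item and excludes the range properties.
  if options.i then
    assert(not options.from and not options.to,
           "i may not be combined with from and to")
    return { inputs[options.i] }
  end

  -- from and to default to the first and last item respectively.
  local from = options.from or 1
  local to = options.to or #inputs
  local selected = {}
  for index = from, to do
    selected[#selected + 1] = inputs[index]
  end
  return selected
end

local outputs = { "notes.txt", "one.c", "two.c" }
print(table.concat(ith_select(outputs, { from = 2 }), " "))
```

With no options at all, every input is passed through, which is the group behaviour.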

Example

somerule = simple {
    -- This rule runs a script which outputs three files. However,
    -- the first one is a text file that cannot be compiled.
}

default = cprogram {
    ith {
      -- Use of from causes the first child to be ignored; cprogram{}
      -- will only see the last two.
      from = 2,
      somerule
    }
}

5.5. foreach

Applies a rule to all of its children.

foreach instantiates all of its children with a particular rule, and groups their outputs together like group.

It is useful if you have another rule that is returning an arbitrary number of outputs, each of which needs to be built independently. For example, a script may automatically generate several C files.

Superclass

node

Inputs

Zero or more rules.

Outputs

The accumulated outputs of the instantiated rule r after applying r to all the inputs.

Recognised properties

rule

The rule to apply to foreach's children. Mandatory.

Example

myScript = simple {
    ...this script automatically generates some (maybe a lot) of C files.
}

default = cprogram {

    -- Run myScript, and build all of its results with cfile.
    foreach {
        rule = cfile,
        myScript
    }
}

5.6. simple

Builds its children using a user-supplied shell command.

This rule, and subclasses of this rule, are where most of the work of a pmfile will be executed. The rule cfile, which builds a C file into an object file using gcc, is a subclass of simple.

To use simple, you must specify the outputs and command properties, both of which are described below.

If the command returns an error code, then any output files produced by the rule will be automatically removed.

Superclass

node

Inputs

One or more rules.

Outputs

One or more rules.

Recognised properties

outputs

A list of strings describing the output files this rule makes.

Each string describes the name of a file in the intermediate cache. String expansion is performed, but you may only use %in%, %I% and %U%. %in% expands to the list of input files; %I% expands to the leaf part of the first input file, without the extension; %U% expands to a unique identifier that represents the command being executed. %O% may not be used. Every template must start with %U%. Filenames may contain / characters --- directories are created automatically.

The minimum legal output template is {"%U%"}. This defines one output with a machine-generated name. However, this is unfriendly when debugging, so it is usually recommended to use {"%U%-%I%.x"} where .x is replaced with the appropriate extension for the output file.

command

A string, or a list of strings, containing the commands to be executed when building this rule.

Commands are executed via the standard shell (/bin/sh on most platforms). If multiple commands are provided, then they are glued together with && before being executed. String expansion is performed; %in% expands to the list of input files, and %out% expands to the list of output files (as defined by the outputs property). %I% and %U% may not be used.
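The gluing step is simple enough to show in standalone Lua. This is a sketch: join_commands is a hypothetical name, and the exact spacing around && is an assumption.

```lua
-- Turn the command property into the single string handed to the shell.
local function join_commands(command)
  if type(command) == "string" then
    return command   -- a single command needs no gluing
  end
  -- With &&, a later command only runs if the earlier ones succeeded.
  return table.concat(command, " && ")
end

print(join_commands { "mkdir -p out", "cp in.txt out/in.txt" })
```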

Example

This example uses tr to convert files from upper case to lower case.

uppercasefnord = simple {
  outputs = {"%U%-%I%.txt"},
  command = "tr 'A-Z' 'a-z' < %in[1]% > %out[1]%",
  "fnord.txt"
}

Here is the same example using a rule subclass.

uppercase = simple {
  class = "uppercase",
  outputs = {"%U%-%I%.txt"},
  command = "tr 'A-Z' 'a-z' < %in[1]% > %out[1]%"
}

uppercasefnord = uppercase "fnord.txt"

This example provides a rule that uses head and tail to split a file into three parts.

topandtail = simple {
  class = "topandtail",
  outputs = {
    "%U%-%I%/top.txt",
    "%U%-%I%/middle.txt",
    "%U%-%I%/bottom.txt"
  },
  command = {
    "head -n 20 < %in[1]% > %out[1]%",
    "tail -n +20 < %in[1]% | tail -n 20 > %out[2]%",
    "tail -n 20 < %in[1]% > %out[3]%"
  }
}

This rule uses cc to compile a C source file into an object file.

simple_cfile = simple {
  class = "simple_cfile",
  outputs = {"%U%-%I%.o"},
  command = "cc -c -o %out[1]% %in%"
}

5.7. deponly

Builds its children, but does not return them.

This rule acts like group, but has no outputs. When built, it will build its inputs, and will also cause its caller to rebuild itself.

This is occasionally useful if you have a rule whose command is referring to files outside of Prime Mover's control; for example, a C file will refer to header files. You want the C file to be rebuilt if any of the headers change, but you do not pass the header files directly into the C compiler. deponly allows you to refer to the header files without actually using them.

Superclass

node

Inputs

One or more rules.

Outputs

None.

Example

example_cfile = simple {
    ...command to compile the first child as a source file...

    -- If input.c changes, the rule will rebuild.
    file "input.c",

    -- If header.h changes, the rule will also rebuild, even
    -- though header.h is not used by the rule.
    deponly {
        file "header.h"
    }
}