Designing Software With Complex Configuration
Alcino Cunha
July 2024
1 Introduction
The design phase is key to achieving high-quality software systems. The term design can mean
different things to different people, but for me it is just a synonym for an abstraction or model
of the system to be built. Different levels of formality can be used to define such models, but
to properly reason about the desired properties of the system we need rigorous mathematical
models. These are particularly relevant if the system is mission- or life-critical, and computer
science has a long tradition of applying so-called formal methods in the design of those kinds of
software systems. However, as I will show later, some rigorous notations and tools for describing
and analyzing software designs are now so simple and usable that there is little excuse not to
apply them to the design of most software, critical or not. Maybe that could help prevent many
of the misconceptions and flaws that plague software applications nowadays.
Software systems typically operate in a given configuration. By configuration, I mean a set of
parameters or conditions that affects the behavior of the system and whose value is assumed to
be constant for a while: in reconfigurable systems this while can be a (reasonable) time interval,
but in many cases it is the whole life-span of the system, once configured and deployed.
Just to make the term configuration more precise, here are some examples of configurations in
software systems with different granularity: in a software product line designed against a particular
feature model, the configuration could be the set of features that will be selected in a given
deployment; in the software controlling a railway interlocking system, the configuration could be
the specific railway network (and signaling layout); in a replicated database, the configuration
includes at least the number of replicas; in a distributed protocol, the configuration could be the
network topology where the protocol will run; finally, in a concurrent algorithm, the configuration
could be just the number of executing processes.
Some of these configurations are extremely simple, for example the number of processes in a
concurrent algorithm, but others are quite complex, for example the railway network in interlocking
systems. Designing systems to work with such complex configurations is not trivial, because ideally
we should ensure that the expected properties are satisfied whatever the configuration. And
sometimes it is not even trivial to enumerate all possible configurations, for example, enumerating
all valid feature combinations of a feature model.
In this paper I will discuss if and how existing formal methods can be applied to the design of
software with such complex configurations. I also intend to show that some formal methods are
already cost-effective enough to be applied in the design of any software system, not just critical
ones. For that reason, I will focus on methods that can be applied by any software engineer
with the standard background on logic and discrete math, and will immediately rule out many
heavyweight formal methods, namely those focused on obtaining full proofs (aka theorem provers)
and that usually require expert user input. Instead, I will focus on lightweight formal methods,
that provide automatic analyses, with the tradeoff that in many cases they will only be able to
achieve partial proofs. Notice that these are still vastly superior to testing: first, many of the
systems nowadays are distributed or concurrent, and bugs in those systems are typically due to
very sporadic race conditions that are almost always impossible to catch with normal testing;
second, even in a fully deterministic system, they will be able to verify properties for all possible
configurations up to a given size, as opposed to testing just a few configurations.
To make the presentation more concrete I will use as running example the Echo distributed
protocol first proposed by Chang [5]. This protocol aims to define a spanning tree in an arbitrary
connected network of nodes with a distinguished initiator. Such spanning tree could be used,
for example, to define routing tables for subsequent communication. Chang describes the protocol
roughly as follows. The initiator starts by sending an explorer message to all its neighbors. If a
node receives an explorer and it is the first to arrive at the node, mark the sender as its parent
and send an explorer to all neighbors except the parent. If a node receives an explorer but it is
not the first to arrive or there are no neighbors except the sender, send an echo message back to
the sender. If the received message is an echo, register that an echo has been received from the
sender, and if echos from all neighbors (to which explorers were sent) have arrived send an echo
to the parent, unless the node is the initiator, in which case the protocol is finished. The main
properties of interest in this protocol are (partial) correctness – when the protocol finishes the
parent relationship indeed forms a spanning tree rooted at the initiator – and termination – the
protocol eventually finishes. Recall that these should hold for every possible configuration, in this
case, for every possible connected network and initiator.
In our example, the protocol configuration will be specified by three constants: Node, the
nodes of the network; Initiator, the initiator node; and adj,
the neighbor relationship that defines
the network topology. For the state we have three variables, that will store for each node: its
parent,
if any; the set of neighbors from which echos have already been received; and the set
of unprocessed messages in the respective inbox. This protocol should work correctly even if
messages are delivered out of order, something we will abstract by having the inboxes contain sets
of messages and letting nodes pick any of them to process at each time.
TLA+ is an untyped language and variables can take any value from standard mathematical
types such as booleans, numbers, sets, or functions. By encoding them as functions, TLA+ also
supports records (functions from field names to the respective values) and sequences (functions
from indexes to values). In our example, Node
will be a set of node identifiers, Initiator a single
node, and the remaining constants and variables will be functions associating each Node with the
respective information. To manipulate expressions of each type, TLA+ provides specific syntax
and operators, sometimes resorting to the well-known LaTeX syntax (for example for set operators).
In this document, the latter will be rendered with the standard mathematical notation, while the
standard ASCII notation is used for the remaining. I believe that the syntax and semantics of
most operators is trivial to understand for most readers with the usual background in logic and
discrete math, and so I will use them without further explanation.
After declaring the constants, we can specify the assumptions about the protocol configuration
using the ASSUME keyword.
reachable(n) ==
LET aux[i ∈ Nat] ==
IF i = 1 THEN adj[n]
ELSE aux[i-1] ∪ { x ∈ Node : ∃ y ∈ aux[i-1] : x ∈ adj[y] }
IN aux[Cardinality(Node)]
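Using this auxiliary definition, the assumptions could then be written roughly as follows (a sketch, phrasing connectivity as every other node being reachable from the initiator).

ASSUME /\ Initiator ∈ Node
       /\ adj ∈ [Node -> SUBSET Node]
       /\ ∀ n ∈ Node : n ∉ adj[n]
       /\ ∀ n, m ∈ Node : m ∈ adj[n] ⇔ n ∈ adj[m]
       /\ Node \ {Initiator} ⊆ reachable(Initiator)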
Most of the assumptions are rather trivial to specify: the Initiator must be one of the nodes;
the adj relation is a function from Node to subsets of Node (expression [A -> B] denotes the set
of all functions from A to B, and SUBSET A the power-set of A); the adjacency relation does
not contain self loops; and it is also symmetric (the network is an undirected graph). However, the
assumption that the network is connected is a bit more tricky to formalize. Since TLA+ does not
provide native closure operators, we must define the set of nodes reachable from any node using
recursion: definition reachable(n) first defines recursively an auxiliary function aux that computes
the set of nodes reachable from n in at most i steps, and then determines the set of nodes that
are reachable from n in at most Cardinality(Node) steps using that function. Because of this
definition we need to import the standard modules Naturals (to use the arithmetic operators) and
FiniteSets (to use the Cardinality
operator), using the EXTENDS keyword.
EXTENDS Naturals, FiniteSets
Since the language is untyped it is easy to make mistakes when manipulating variables. It is
customary to include in a TLA+ specification a predicate that checks if the state variables contain
values of the appropriate type. This predicate is frequently named TypeOK and can later be
model-checked to verify its invariance. Although a poor substitute for static type checking, this
can help find some typing errors, but also serves as documentation and it is good practice to
declare it early in the specification. In our example, besides the state variables, a special constant
None is declared to represent the absence of a parent, together with the assumption that it is not a
node identifier.

CONSTANT None
ASSUME None ∉ Node
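In our example, the TypeOK predicate could then be written roughly as follows (a sketch consistent with these declarations).

TypeOK == /\ parent ∈ [Node -> Node ∪ {None}]
          /\ received ∈ [Node -> SUBSET Node]
          /\ inbox ∈ [Node -> SUBSET [from : Node, type : {"Explorer", "Echo"}]]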
In the conjunct of this predicate checking the type of inbox, expression [from : Node, type :
{"Explorer", "Echo"}] denotes the set of all records where field from is a node identifier and type
is one of the two strings.
With TLA, a transition system can be implicitly specified with a temporal formula that specifies
its valid behaviors (or traces), a behavior being an infinite sequence of states starting
in an initial state and where each pair of consecutive states respects the transition relation. This
temporal formula has the shape I ∧ □[A]t , where I is a state predicate that specifies which are
the valid initial states, □ is the standard always LTL operator that enforces the enclosed formula
to be true in all states of a behavior, A is an action that specifies if two consecutive states satisfy
the transition relation, and [A]t ≜ A ∨ t′ = t is a formula that holds if A holds or t has the same
value in the pre- and post-state. Note that, whatever t, [A]t will always allow stuttering steps,
where all variables keep the same value. If t is a constant then t′ = t is a tautology and the system
will be able to evolve freely, with all variables potentially changing their value in an arbitrary
way. However, usually t is a sequence with all state variables, and in this case formula □[A]t will
specify that it is always the case that the system performs an A step or stutters. In TLA, actions
cannot be used freely in formulas, and can only be used inside the operator [ · ]t . This is to ensure
that all TLA formulas are invariant under stuttering, and thus will not have their validity affected
by refinement (for example, adding extra variables to a system). Lamport claims that this is a
fundamental property of a logic to specify systems, usually illustrating his claim with a very simple
example of a clock specification. A clock showing only the hours can be specified with a single h
variable as follows.
0 ≤ h < 24 ∧ □[h′ = (h + 1) % 24]h
Initially we have h with an arbitrary hour and then the system evolves by changing the hour
as expected. If this specification did not allow for stuttering steps, then a clock with hours and
minutes would not satisfy it, because the hour would be forced to always keep changing.
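To make this concrete, a clock with hours and minutes could be specified roughly as follows (a sketch): steps that only advance the minutes leave h unchanged and are thus stuttering steps from the point of view of the hours-only specification above.

0 ≤ h < 24 ∧ 0 ≤ m < 60 ∧ □[ (m < 59 ∧ m′ = m + 1 ∧ h′ = h) ∨ (m = 59 ∧ m′ = 0 ∧ h′ = (h + 1) % 24) ]<<h,m>>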
In a larger system with many different events it is usual for the action A in the specification
I ∧ □[A]t to be a disjunction of more specific actions specifying the different events. That is the
case of the specification of this protocol, where the action that specifies the transition (or next
state) relation (denoted Next below) is a disjunction of all possible events that can occur in all
possible nodes. In this case, two possible events can occur at each node, namely receive an explorer
or an echo message.
vars == << received, parent, inbox >>
Next == ∃ n ∈ Node : receiveExplorer(n) \/ receiveEcho(n)
Spec == Init /\ [][Next]_vars
Here, vars
is a sequence with all the state variables that should remain unchanged in a stuttering
step. The initial state predicate can be specified as follows.
msg(n,t) == [ from |-> n, type |-> t ]
Init == \* no parents
/\ parent = [ n ∈ Node |-> None ]
\* no received echos
/\ received = [ n ∈ Node |-> {} ]
\* initiator starts by sending explorers to all neighbors
/\ inbox = [ n ∈ Node |-> IF n ∈ adj[Initiator]
THEN {msg(Initiator,"Explorer")}
ELSE {} ]
The initial value of the three variable functions is defined by comprehension. Since the protocol
starts with the initiator sending an explorer to all its neighbors, the initial value of function inbox
is specified accordingly.
To give an example of how to use actions to specify an event, consider the definition of
receiveExplorer(n) that specifies the event of node n receiving and processing an explorer message.
receiveExplorer(n) == ∃ m ∈ inbox[n] :
  /\ m.type = "Explorer"
  /\ IF parent[n] = None
     THEN parent' = [ parent EXCEPT ![n] = m.from ]
     ELSE UNCHANGED << parent >>
  /\ IF parent[n] = None /\ adj[n] \ { m.from } # {}
     THEN inbox' = [ a ∈ Node |->
            CASE a ∈ adj[n] \ {m.from} -> inbox[a] ∪ {msg(n,"Explorer")}
              [] a = n -> inbox[a] \ {m}
              [] OTHER -> inbox[a] ]
     ELSE inbox' = [ inbox EXCEPT ![n] = @ \ {m}, ![m.from] = @ ∪ {msg(n,"Echo")} ]
  /\ UNCHANGED << received >>
The specification of an event usually comprises three kinds of conjuncts: guards, formulas specifying
when the event can occur; effects, that specify which variables change and how they change;
and frame conditions, that specify which variables do not change. In this case, the guard states
that the event can occur if there exists an explorer message in the inbox of n. When the event
occurs there is an effect on two of the state variables: if it is the first explorer to be received, the
sender is set as the parent
of the node (the EXCEPT keyword is used to define a function that
is equal to another except for a given domain element, with @ denoting its old value); if it is the
first explorer and there are more neighbors besides the sender an explorer is sent to all of those;
otherwise an echo is sent back; in any of these cases the received message is deleted from the
inbox. Finally, a frame condition specifies that this event does not change the value of received.
The keyword UNCHANGED is used to specify that a variable or a tuple of variables does not
change its value.
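The companion receiveEcho(n) event could be specified in the same style; one possible formulation is roughly the following sketch.

receiveEcho(n) == ∃ m ∈ inbox[n] :
  /\ m.type = "Echo"
  \* register that an echo was received from the sender
  /\ received' = [ received EXCEPT ![n] = @ ∪ {m.from} ]
  \* once echos from all neighbors to which explorers were sent have arrived,
  \* forward an echo to the parent; the processed message is always removed
  /\ IF n # Initiator /\ received[n] ∪ {m.from} = adj[n] \ {parent[n]}
     THEN inbox' = [ inbox EXCEPT ![n] = @ \ {m},
                     ![parent[n]] = @ ∪ {msg(n,"Echo")} ]
     ELSE inbox' = [ inbox EXCEPT ![n] = @ \ {m} ]
  /\ UNCHANGED << parent >>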
Before verifying the expected properties it is important to validate the protocol specification.
TLA+ does not have an interactive simulator that allows the user to validate a specification. The
recommended validation technique is to write false properties that force the TLC model-checker to
provide as counter-examples some expected scenarios. For example, we could specify a property
stating that it is impossible for the protocol to finish, and, if the protocol is well-specified, a
counter-example for this property will be a minimal sequence of steps where the protocol runs to
completion.
Finish == received[Initiator] = adj[Initiator]
MinimalRun == [](~Finish)
To see such a scenario for a complete network with 4 nodes (a, b, c, and d), we first need to
instantiate the constants Node, Initiator, and adj accordingly. Then we can ask TLC to verify
property MinimalRun, and a trace with 19 states is returned. Counter-examples are displayed in
a textual format, highlighting the variables that change value at each state, as seen in Figure 1
(showing only the first 3 states). It is possible to hide or expand the value of a variable, as done for
the inbox in the second state. In this trace the first event consisted of node b reading the explorer
message sent by node a (the initiator): after this event, its inbox is empty and the inboxes of
nodes c and d contain explorers sent both by the initiator a and by b.
Having validated our protocol specification, we can proceed to the verification of the two
expected properties, correctness and termination. The former is an example of a safety property,
and can be specified as follows.
SpanningTree == ∀ n ∈ Node \ {Initiator} : Initiator ∈ ancestors(n)
Correctness == [] (Finish => SpanningTree)
The auxiliary definition SpanningTree states that all nodes must have the initiator as an ancestor.
To determine the ancestors, again we need to resort to a recursive definition similar to
the one that was defined for computing the reachable nodes.
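One possible formulation, mirroring reachable, is the following sketch.

ancestors(n) ==
  LET aux[i ∈ Nat] ==
    IF i = 1 THEN { x ∈ Node : parent[n] = x }
    ELSE aux[i-1] ∪ { x ∈ Node : ∃ y ∈ aux[i-1] : parent[y] = x }
  IN aux[Cardinality(Node)]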
Since the specification of the protocol admits stuttering steps, to verify termination we need
to impose fairness restrictions that force the system to make progress. In this case it suffices to
require weak fairness for the Next transition relation, meaning that while there is at least one node
with unprocessed messages the algorithm cannot stutter forever. TLA+ has specific operators to
specify fairness conditions (WF for weak fairness and SF for strong fairness), but they can be
specified directly using the temporal connectives and the ENABLED operator (that checks if an
action is enabled in a state). The termination property can thus be specified as follows.
Termination == WF_vars(Next) => <> Finish
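For reference, following its standard definition, the weak fairness operator used above unfolds to the temporal formula

WF_vars(Next) ≜ ([] <> ~ENABLED <<Next>>_vars) ∨ ([] <> <<Next>>_vars)

where <<Next>>_vars denotes a Next step that changes the value of vars.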
Figure 1: Minimal run in a complete network with 4 nodes
TLC is extremely efficient, in particular when verifying invariants. For the configuration con-
sisting of a complete network with 4 nodes both properties are checked in around 2s. TLC im-
plements an explicit model-checking technique that traverses the state space, but takes advantage
of multi-core processing to parallelize the search and uses hash tables to summarize states and
reduce the memory footprint. Due to the need to explicitly compute the next states, TLC cannot
however accept any TLA+ specification. In particular, the next state relation must be a disjunc-
tion of actions, each a conjunction of “assignments”: the new value of a (primed) variable can
only be specified with an equality (deterministic assignment) or an inclusion (non-deterministic
assignment), and the assigned expression must only use variables assigned beforehand. Although
these restrictions can make the specification of some systems a bit more difficult, when specifying
concurrent and distributed algorithms they are rarely a limitation.
However, for a complete network with 5 nodes, TLC already takes more than 4m just to
check correctness and around 20m to check termination, even after turning on symmetry breaking
to avoid exploring states that are isomorphic up to a permutation of the node identifiers. For
example, in the complete network with 5 nodes with initiator a it is unnecessary to explore both
states where b is the first to read the explorer or c is the first to read the explorer. By assuming
the set Node is symmetrical only one of those will be explored. In this particular configuration
TLC has to explore around 3 million truly distinct states.
But the main problem with this TLA+ specification is that TLC can only be used to check one
configuration at a time. With up to 4 nodes there are 16 truly different network configurations,
and to check them all the user must manually configure TLC with 16 different instantiations of the
declared constants. Moreover, it is not at all easy to manually enumerate the 16 truly different
(non-symmetrical) networks with up to 4 nodes (we leave that as an exercise to the reader). This
strategy will obviously not scale if we want to verify the protocol with a higher number of nodes
and there is a good chance that a faulty configuration can be missed.
Fortunately, it is rather easy to adapt a TLA+ specification to check all configurations. The basic
idea is to change the declared constants to flexible variables, move the assumptions to constraints
on the initial state, and add frame conditions to all events to force those configuration variables to
remain constant in all behaviors. With this change every configuration will give rise to a different
initial state to be explored by TLC. As mentioned above, TLC requires every variable to have
its value defined either with an equality (corresponding to an assignment) or a membership test
(corresponding to a non-deterministic assignment). In our case, this means that we have to add
an extra constraint to Init
to define the initial value of set Node. Since we want to explore all
networks with up to a given number of nodes, we will add a constant Univ with all possible node
identifiers and then assign to Node an arbitrary subset of Univ. Our specification will look roughly
as follows.
CONSTANT Univ
VARIABLES received, parent, inbox, Node, Initiator, adj
...
...
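Concretely, the new initial state predicate could incorporate the former assumptions roughly as follows (a sketch), and every event would additionally include the frame condition UNCHANGED << Node, Initiator, adj >>.

Init == /\ Node ∈ SUBSET Univ
        /\ Node # {}
        /\ Initiator ∈ Node
        /\ adj ∈ [Node -> SUBSET Node]
        /\ ∀ n ∈ Node : n ∉ adj[n]
        /\ ∀ n, m ∈ Node : m ∈ adj[n] ⇔ n ∈ adj[m]
        /\ Node \ {Initiator} ⊆ reachable(Initiator)
        /\ parent = [ n ∈ Node |-> None ]
        /\ received = [ n ∈ Node |-> {} ]
        /\ inbox = [ n ∈ Node |-> IF n ∈ adj[Initiator]
                                  THEN {msg(Initiator,"Explorer")}
                                  ELSE {} ]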
Unfortunately, when checking the expected properties for a universe of 4 possible node iden-
tifiers (a, b, c, and d), we immediately get a counter-example to property Correctness for the
following configuration, corresponding to the network in Figure 2 (the initiator is shown in green).
Node = {a,b,c,d}
Initiator = a
adj = a :> {b, c} @@ b :> {a, c, d} @@ c :> {a, b, d} @@ d :> {b, c}
In fact, out of the 16 non-symmetrical configurations only 2 of them are buggy, the other one
being similar to the one in Figure 2, but with the edge from d to c removed. Note that, if the
initiator were node b (or, symmetrically, c) then the protocol would work fine. Without an exhaustive
verification of all configurations I reckon it would be very unlikely that a user would choose to
verify the protocol for these two configurations, and the bug would most likely go unnoticed.
So, is Chang’s protocol really faulty? Not really... Somewhere in the paper, when presenting
general definitions it is mentioned that the initiator should be considered visited a priori. In the
protocol description it is not clarified how this should be implemented, but a reasonable assumption
could be to consider the initiator to already have a parent at the beginning. This can be specified
as follows in the Init predicate (we will assume the initiator is its own parent).
Init == /\ ...
\* initiator is explored a priori
/\ parent = [n ∈ Node |-> IF n = Initiator THEN Initiator ELSE None ]
/\ ...
Table 1: Configuration counts for networks of increasing size.

Nodes   Candidates      Non-symmetric   Symmetric
1..3            1570               21           5
1..4          263714              216          16
1..5       168035874             4545          74
1..6    412484896290                ?         531
Checking our protocol again for all networks up to 4 nodes now yields no counter-example,
and verification takes only a few seconds (to check both Correctness
and Termination). Checking
Correctness for all networks up to 5 nodes takes around 13m. This is not surprising as a single
configuration takes around 4m. Checking Termination takes a bit over 1h.
However, if we try to check the protocol for networks with up to 6 nodes, we get stuck: after
one hour TLC still has not finished enumerating the possible initial states (corresponding to
the different configurations). As we can see in Table 1, the problem is that the number of non-
symmetric configurations is just a tiny fraction of all the possible valuations of the configuration
variables. When dealing with non-deterministic assignments, either in initial states or in the
effects of events, TLC just enumerates explicitly all possible candidates until finding those that satisfy
the required conditions. Moreover, unlike in the state graph exploration, this enumeration is
not parallelized. It is surprising that it manages to enumerate all the 74 non-symmetrical
configurations for networks with up to 5 nodes in less than 1m (a testament to TLC's efficiency), but
obviously this technique cannot scale up to more complex configurations. For networks with up
to 6 nodes, this is really just like “looking for a needle in a haystack”. This kind of state-explosion
was precisely the motivation for the development of symbolic model-finding and model-checking
techniques that do not need to explicitly enumerate the state space.
In Alloy everything is a relation, and all tuples of elements contained in a relation should have the same length (the arity of the relation). Unary relations can thus
be used to represent sets, and singleton unary relations can be used to denote specific elements of
the domain.
In Alloy sets are known as signatures and the elements that inhabit them as atoms. The sig
keyword can be used to declare signatures. Signature declarations can have a multiplicity attached
to restrict the number of atoms they can contain. A signature can also extend another signature,
meaning the former is a subset of the latter and disjoint from every other extension. A top-level
signature is one that does not extend another signature. This hierarchy is exploited in the type
system of Alloy to find irrelevance errors, expressions that always denote empty relations and
that are most likely specification mistakes [8]. Relations of arity higher than one are known as
fields and are declared inside the domain signature, with an optional multiplicity attached to the
range. For example, to specify the configurations of the Echo protocol in Alloy we could start by
declaring the following signatures and fields.
sig Node {
adj : set Node
}
one sig Initiator extends Node {}
Signature Node will contain the nodes of the network. To distinguish a specific node as the
Initiator we declare a singleton signature extending Node. The neighbors are represented by the
binary relation adj that associates each node with an arbitrary number of nodes.
Assumptions are specified inside facts. These are specified with Relational Logic, an extension
of FOL that, besides the usual quantifiers (all and some in Alloy syntax) and membership and
equality atomic formulas (in and =), supports a few derived operators that ease the specification
of constraints, as well as closure operators that cannot be expressed in FOL. Due to the “everything
is a relation” motto, the Alloy syntax is quite small and also has a very simple semantics – the full
list of operators is presented in Figure 3. Note that set operators, like union or intersection, can
be applied to relations of arbitrary arity. For example, operator in, that determines if a relation
is a subset or equal to another relation, can also be used to check membership of an element to
a set, since the former will be denoted by a singleton unary relation and the latter by a unary
relation (for example, Initiator in Node is a possible atomic formula that holds trivially in our
model).
The most used Alloy operator is dot join, that allows us to navigate through a relation to obtain
related elements. It can be applied to any pair of relational expressions, as long as the sum of
their arities is greater than two. The semantics of this operator is as follows: if R has arity n
and S has arity m, then

R . S ≡ { (r1, ..., rn−1, s2, ..., sm) : (r1, ..., rn) ∈ R ∧ (s1, ..., sm) ∈ S ∧ rn = s1 }

Essentially, it concatenates all tuples from R with all tuples from S that have the same last and
first atom, respectively, with the nuance that the matching atom is dropped. Since “everything
is a relation” this operator has multiple uses. For example, if f and g are binary relations that
represent functions, f . g is the same as g ◦ f (function composition), and if x is a singleton unary
relation denoting a particular atom of the domain and f is a binary relation denoting a function,
x . f is the same as f (x) (function application) and f . x is the same as f −1 (x) (inverse application).
For example, Initiator.adj is the set of neighbors of the initiator, adj.Initiator is the set of
nodes that have the initiator as neighbor, and adj.Node is the set of nodes that have some
neighbor.
The closure operators work only on binary relations and allow us to express reachability prop-
erties. Although not as powerful as the full recursion supported by TLA+ , they are simpler to
use, allow more succinct specifications, and are more amenable for automatic verification. The
transitive closure operator is defined as follows.

^R ≡ R + R.R + R.R.R + ...

Seeing binary relations as edges in a graph, ^R denotes the relation containing all pairs (a, b)
such that there is a path from a to b through R-edges. For example, Initiator.^adj is the set
of all nodes reachable from the initiator. The reflexive transitive closure operator also includes
iden, the identity relation that maps each atom to itself.

*R ≡ ^R + iden

(a) Binary operators          (b) Unary operators         (c) Constants
+   union                     ~     converse              none  empty set
&   intersection              ^     transitive closure    univ  universe
-   difference                *     reflexive closure     iden  identity relation
->  cartesian product         no    nothing
<:  domain restriction        some  at least one
:>  range restriction         lone  at most one
.   dot join                  one   exactly one
in  subset or equal
=   equality

Figure 3: The Alloy operators and constants.
The assumptions about the network in the Echo protocol can be specified with relational logic
as follows.
fact ValidConfigurations {
// no self loops
no adj & iden
// undirected graph
adj = ~adj
// all nodes reachable from initiator
Node-Initiator in Initiator.^adj
}
As we can see in this example, constraints in Alloy tend to be specified in a very terse style.
The relational logic operators enable a compositional (navigational) style that usually requires
fewer quantifiers, or no quantifiers at all, as was the case here. This no-variables style is
well-known in the algebra of programming [2] community, where it is known as the point-free style
of programming.
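For comparison, the symmetry constraint adj = ~adj written pointwise, with explicit quantifiers, would read something like the following.

// pointwise formulation of the symmetry assumption
all n, m : Node | m in n.adj implies n in m.adj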
As mentioned above, we can ask the Alloy Analyzer to return instances satisfying our assump-
tions. This can be done with run commands. To ensure decidability all the commands must limit
the size of the domain, by setting a scope for all top-level signatures (the maximum number of
atoms they can contain). The default scope is 3, but can be parametrized in each command. The
Analyzer also encodes by default a symmetry breaking mechanism that will try to exclude from
the analysis all isomorphic instances up to renaming of atoms. For example, to get instances of
networks with up to 5 nodes we could execute the following command.
run Example {} for 5
By default, the Analyzer shows instances graphically, with atoms being depicted as nodes
and binary relations as edges between them. Atoms are automatically named according to the
signatures they belong to. This visualization can be customized with themes, for example setting
different colors and shapes for particular signatures. Figure 4 depicts two of the instances returned
by the previous command, with the initiator customized to be shown in green.
We can instruct the Analyzer to search for specific instances. For example, the following
command will directly return a complete network with 4 nodes (the one on Figure 4a).
run CompleteNetwork {
all n : Node | Node-n in n.adj
} for exactly 4 Node
After validating our model with run commands, we can instruct the Analyzer to check ex-
pected assertions, for example, to check that all networks satisfying fact ValidConfigurations
are necessarily connected.
Figure 4: (a) A complete network with 4 nodes; (b) a line network with 5 nodes.
assert Connected {
all n : Node | Node-n in n.^adj
}
check Connected for 4
In this case we checked this assertion for all networks with up to 4 nodes, and no counter-example
was returned, as expected. Although not a full proof, the scoped analysis supported by Alloy is
still very useful in practice, since it exhaustively checks all instances up to a given size and most
false assertions can be refuted with very small counter-examples, something known as the “small
scope hypothesis”.
user when specifying the behavior, even if for most cases the typical pattern of specifying actions
a la TLA will be followed.
In the Electrum Analyzer [3] we added support for iteration over trace instances and counter-
examples, support for bounded model-checking of arbitrary temporal formulas via translation to
SAT, and support for complete model-checking by translation to the symbolic model-checkers
NuSMV and nuXmv. More details about the implementation of these features can be found in [15].
Perhaps due to its simplicity, Electrum ended up being selected by the Alloy board in 2022 to be
the official Alloy 6 version of the language.
Alloy 6 excels precisely in the design of systems with complex configurations and data struc-
tures, that are better described with relations and relational logic. As we have seen in the previous
section, it is rather trivial to specify the configurations of the Echo protocol in Alloy. We will now
show how the new features of Alloy 6 allow us to also specify and verify its behavior.
As in the TLA+ specification, to model the dynamics of the Echo protocol in Alloy 6 we
can declare three mutable relations, each associating a node with: at most one parent; the set
of nodes from which echos have already been received; and the set of unprocessed messages in the
respective inbox. The first two are binary relations. The latter is a ternary relation that associates
nodes with the types of messages in the inbox and the respective senders. An abstract signature
is also declared for denoting the types of messages, extended by two singleton signatures, one
for each type. This “abstract signature with singleton extensions” pattern is the standard way of
declaring an enumerated type in Alloy.
sig Node {
  adj : set Node,
  var parent : lone Node,
  var received : set Node,
  var inbox : Node -> Type
}
one sig Initiator extends Node {}
abstract sig Type {}
one sig Explorer, Echo extends Type {}
To specify the behavior of the protocol we will use the same style as in TLA+: a predicate without
temporal operators to restrict the valid valuations in the initial state, conjoined with an invariant
restricting the valid transitions to a disjunction of all possible events that can occur.
pred Init {
// initiator is explored a priori
parent = Initiator->Initiator
// no received echos
no received
// initiator starts by sending explorers to all neighbors
inbox = Initiator.adj->Initiator->Explorer
}
pred Next {
some n : Node | receiveExplorer[n] or receiveEcho[n]
}
fact Spec {
Init and always (stutter or Next)
}
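The stutter event used in Spec is not detailed above; it can simply be a predicate stating that all mutable relations keep their current values.

pred stutter {
  parent' = parent
  received' = received
  inbox' = inbox
}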
As mentioned above, in Alloy 6 stuttering is not mandatory, but in this case we opted to include
it as a possible event in every state to make the specification similar to the one in TLA+ . As usual
in temporal logic, Alloy 6 requires all traces to be infinite. Adding stuttering as a possible event
at every state is a trivial way to ensure that. However, unlike with TLA+ , we could restrict
stuttering to only occur in particular states, for example, only when the protocol is finished. In
the Init predicate we can see how Alloy’s relational logic simplifies the specification of constraints:
to specify that the initiator starts by sending explorers to all neighbors, the initial state of the
inbox is restricted to contain all triples that result from the cartesian product of Initiator.adj
(the set of neighbors of the initiator) with the singletons Initiator (the sender of the messages)
and Explorer (the type of messages).
The receiveExplorer event can be specified as follows.
pred receiveExplorer[n : Node] {
  some m : n.inbox.Explorer {
    no n.parent implies {
      parent' = parent + n->m
    } else {
      parent' = parent
    }
    no n.parent and some n.adj - m implies {
      inbox' = inbox - n->m->Explorer + (n.adj-m)->n->Explorer
    } else {
      inbox' = inbox - n->m->Explorer + m->n->Echo
    }
  }
  received' = received
}
Again we followed a style very similar to the one in the TLA+ specification, namely specifying the
full next-state value for each mutable relation. Note how relational logic again allows us to write
some constraints more succinctly when compared to TLA+ (in particular the specification of the
new value of inbox in the case a node has to propagate an explorer to all its neighbors). However,
with the Alloy 6 Analyzer we don’t have the same restrictions imposed by TLC when specifying
events. In particular it is not mandatory to always define the value of every mutable relation using
an equality or an inclusion test. For example, many Alloy 6 users actually prefer to specify only
the local effects in a relation and specify the frame conditions in a separate constraint. Instead of
specifying the new value of parent using the restriction parent' = parent + n->m, that considers
the global value of parent at once, we could have specified the local effect on the receiving node as
n.parent' = m and separately specify the frame condition for all other nodes with the restriction
all x : Node - n | x.parent' = x.parent. None of these restrictions would be accepted by TLC.
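In the same style, a possible sketch of the companion receiveEcho event is the following.

pred receiveEcho[n : Node] {
  some m : n.inbox.Echo {
    // register that an echo was received from m
    received' = received + n->m
    // once echos from all neighbors except the parent have arrived,
    // forward an echo to the parent; the processed message is always removed
    (n != Initiator and n.adj - n.parent = n.received + m) implies {
      inbox' = inbox - n->m->Echo + n.parent->n->Echo
    } else {
      inbox' = inbox - n->m->Echo
    }
  }
  parent' = parent
}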
Also, unlike with TLC, we can directly ask for execution examples without having to specify
false properties that produce the expected scenario. In particular, we can ask the Analyzer to
return an arbitrary scenario with the following command.
run Simulation {}
In this command the default scope of 3 is being used, so the returned scenarios will be for networks
with up to 3 nodes. The first returned execution trace is shown in Figure 5a, already customized
with a theme that shows the values of the inbox and received as attributes of the nodes. As
can be seen, the Alloy 6 Analyzer depicts one transition (or step) of the trace at a time, showing
side-by-side in a split-panel the pre- and post-state to make it easier for the user to understand the
event that occurred. On top of this split-panel a depiction of the trace is also shown, identifying
the different states and which segment of the trace is repeated indefinitely. The user can choose to
focus on a particular transition by clicking directly any state or navigating forward and backward.
In this case we can see that the first returned trace consists of an infinite sequence of stuttering
steps in a network with 2 nodes, a valid execution of the protocol according to our specification.
The Alloy 6 Analyzer supports several trace iteration operations that allow the user to explore
interactively different scenarios [4]. When no mutable relations are declared the Analyzer defaults
to the standard Alloy 5 visualizer that only supported a single iteration operation (just asking
for any different instance). In particular, we have the following operations for exploring traces:
New Config, that returns an example trace for a different configuration (a different valuation to
the immutable relations); New Trace, that returns any different trace for the same configuration;
New Init, that returns a trace for the same configuration but with a different initial state; and
New Fork, that returns a new trace for the same configuration that is the same up to the shown
pre-state, but different afterwards. New Fork is particularly useful when exploring scenarios in-
teractively, since it allows the user to see which different events are possible at each state. For
example, in the first returned instance the user could ask for a New Fork to see a non-stuttering
event. The resulting trace is shown in Figure 5b, where the first event now corresponds to the
non-initiator node receiving the explorer message, setting the initiator as its parent, and sending
Figure 5: Scenario exploration: (a) first returned instance; (b) result after New Fork; (c) result after New Config; (d) result after New Fork.
back an echo. Selecting New Config could yield the trace in Figure 5c, where we now have an infinite
stuttering trace for a line network with 3 nodes. Selecting New Fork on this trace returns the trace
in Figure 5d, where the middle node receives the explorer from the initiator and sends an explorer
to the last node in the line.
Using the different scenario exploration operations the user can easily validate the specification
interactively in the early phases of the design. But it is also possible to ask directly for specific
scenarios. For example, the following command asks directly for a trace where the protocol runs
to completion in a complete network of 4 nodes.
pred Finish {
Initiator.received = Initiator.adj
}
run MinimalRunComplete {
all n : Node | Node-n in n.adj
eventually Finish
} for exactly 4 Node, 19..19 steps
By default, all analysis commands are run with the bounded model-checking engine, exploring
traces up to depth 10 in the implicitly specified transition system. This default scope can be
changed by setting the scope for steps. In the above command, since we know a complete run
for this configuration takes 19 states, the scope of steps is set to be exactly that value to make
the analysis run faster. The bounded model-checking analysis always tries to return the shortest
instance or counter-example, so it will explore traces of increasing sizes in sequence, meaning that it
can take a long time to get to a size where the desired instance is possible. Further restrictions can
be imposed when searching for specific scenarios, for example, we can ask the Analyzer to produce
a trace where a particular sequence of events occurred, something not possible with TLA+ .
To check the correctness of the protocol for all configurations with up to 4 nodes we can use
the following command.
pred SpanningTree {
all n : Node - Initiator | Initiator in n.^parent
}
assert Correctness {
always (Finish implies SpanningTree)
}
check Correctness for 4 but 1.. steps
Here we used the special scope 1.. steps to trigger the complete model-checking analysis and
verify all possible traces allowed by the specification (as TLC does). With the symbolic model-
checker nuXmv selected as verification engine, this property is verified in a couple of seconds. For
all configurations with up to 5 nodes it takes around 1m, way faster than the 13m that TLC took to
verify the same set of configurations. For all configurations with up to 6 nodes, the Alloy 6 Analyzer
takes around 30m, while TLC did not even finish enumerating all possible configurations after 1h.
This efficiency is due to the sophisticated symbolic model-checking procedures implemented by
nuXmv to verify invariants, which are better suited for very non-deterministic systems such as this
one (both in terms of different configurations and different interleavings).
To verify termination, we need to impose weak fairness. Unlike with TLA+ , in Alloy 6 there is
no special keyword for that and the user must explicitly encode the required weak fairness.
pred Fairness {
eventually always some inbox implies always eventually Next
}
assert Termination {
Fairness implies eventually Finish
}
check Termination for 4 but 1.. steps
Unfortunately, the symbolic model-checking engines currently supported by Alloy 6 are all quite
slow at verifying this property. For networks with up to 4 nodes the best timing was around 20m
for complete model-checking, way slower than TLC for the same scope. For bigger scopes the
verification timed out at 1h.
5 Final remarks
As we have seen, the formal design of software with complex configurations is not only desirable
but already possible and cost-effective with the presented languages and tools. The semantics of
both Alloy and TLA+ are based on very simple mathematical concepts that most software engineers
should have learnt in their undergrad, and both provide automatic analysis tools that require little
expert input, so I believe they have the potential to be widely adopted.
Although the analysis of multiple configurations is not natively supported by TLA+ , it is rather
trivial to adapt a specification that checks a single configuration to one that checks multiple config-
urations at once. The main problem is that the TLA+ logic and the explicit model-checking tech-
nique of TLC are not best suited to specify and enumerate configurations with complex structure
and constraints. For designing software with simple configurations, such as concurrent algorithms
where only the number of nodes varies, TLA+ and TLC should be an excellent choice, even if TLC
limitations can sometimes be frustrating.
Alloy on the other hand excels at describing and analyzing complex configurations, and the sym-
bolic model-checking techniques implemented in the Alloy 6 Analyzer are already efficient enough
to verify safety properties for many complex configurations at once. The visualization and scenario
exploration operations implemented in the Analyzer also make Alloy substantially better for the
validation task in these contexts. In recent years we successfully applied Electrum and Alloy 6 in
the analysis of some larger case studies of software with complex configurations, for example the
Hybrid ERTMS/ETCS Level 3 railway signaling standard [6] and interactive systems described
with task models [7].
What about other formal software design frameworks? As we have mentioned in the intro-
duction, many are heavyweight formal methods, for example theorem provers, requiring a lot of
expert input and thus only viable to apply in the most safety-critical applications. But even in
those applications, using frameworks like TLA+ or Alloy can help with debugging and validation in
the early phases of the design, making sure one only attempts to do the full proofs after having a
lot of confidence the design is correct. As for lightweight formal methods, the most popular model-
checkers – for example NuSMV, nuXmv, or SPIN – are too low-level to be used directly to describe
software with complex configurations, being better suited to be used as backend analysis tools for
higher-level languages, as is the case with Alloy 6. Among the higher-level formal specification
languages, perhaps the tools in the B method [1] ecosystem are the ones better suited to compete
with TLA+ and Alloy, in particular the ProB tool [14]. This tool already implements some of
the features of the Alloy Analyzer, namely simulation and graphical depiction of counter-examples,
and offers interfaces to both TLA+ and Alloy. Unfortunately, its specification language is far from
minimal and, in my opinion, not as ideal for abstract specification as that of Alloy or TLA+ .
There is still a lot of room for improvement, though. None of the tools was good enough
to check liveness properties for reasonable network sizes, so further research on improving the
scalability of the model-checking techniques is necessary. Also, one should consider extending
these languages to better support the quantitative requirements that are becoming pervasive in
modern software, for example in machine learning systems or security protocols. For Alloy we have
already done some preliminary steps in that direction [20].
References
[1] Jean-Raymond Abrial. The B-book: assigning programs to meanings. Cambridge University
Press, 1996.
[2] Richard Bird and Oege de Moor. Algebra of programming. Prentice-Hall, 1997.
[3] Julien Brunel, David Chemouil, Alcino Cunha, and Nuno Macedo. The Electrum Analyzer:
Model checking relational first-order temporal specifications. In Proceedings of the 33rd
ACM/IEEE International Conference on Automated Software Engineering, pages 884–887.
ACM, 2018.
[4] Julien Brunel, David Chemouil, Alcino Cunha, and Nuno Macedo. Simulation under arbitrary
temporal logic constraints. In Proceedings 5th Workshop on Formal Integrated Development
Environment, volume 310 of EPTCS, pages 63–69, 2019.
[5] Ernest J. H. Chang. Echo algorithms: Depth parallel operations on general graphs. IEEE
Transactions on Software Engineering, 8(4), 1982.
[6] Alcino Cunha and Nuno Macedo. Validating the hybrid ERTMS/ETCS level 3 concept with
Electrum. International Journal on Software Tools for Technology Transfer, 22(3):281–296,
2020.
[7] Alcino Cunha, Nuno Macedo, and Eunsuk Kang. Task model design and analysis with Alloy.
In Proceedings of the 9th International Conference on Rigorous State Based Methods, 2023.
To appear.
[8] Jonathan Edwards, Daniel Jackson, and Emina Torlak. A type system for object models. In
Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software
Engineering, pages 189–199. ACM, 2004.
[9] Marcelo F. Frias, Juan P. Galeotti, Carlos G. López Pombo, and Nazareno M. Aguirre.
DynAlloy: upgrading Alloy with actions. In Proceedings of the 27th International Conference
on Software Engineering, pages 442–451. IEEE, 2005.
[10] Daniel Jackson. Software Abstractions: Logic, Language, and Analysis. MIT Press, revised
edition, 2016.
[11] Leslie Lamport. The temporal logic of actions. ACM Transactions on Programming Languages
and Systems, 16(3), 1994.
[12] Leslie Lamport. Specifying Systems: The TLA+ Language and Tools for Hardware and Soft-
ware Engineers. Addison-Wesley, 2002.
[13] Leslie Lamport. If you’re not writing a program, don’t use a programming language. Bulletin
of EATCS, (125), 2018.
[14] Michael Leuschel and Michael Butler. ProB: an automated analysis toolset for the B method.
International Journal on Software Tools for Technology Transfer, 10:185–203, 2008.
[15] Nuno Macedo, Julien Brunel, David Chemouil, and Alcino Cunha. Pardinus: A temporal
relational model finder. Journal of Automated Reasoning, 66(4):861–904, 2022.
[16] Nuno Macedo, Julien Brunel, David Chemouil, Alcino Cunha, and Denis Kuperberg.
Lightweight specification and analysis of dynamic systems with rich configurations. In Proceed-
ings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software
Engineering, pages 373–383. ACM, 2016.
[17] Joseph P. Near and Daniel Jackson. An imperative extension to Alloy. In Proceedings of the
2nd International Conference on Abstract State Machines, Alloy, B and Z, volume 5977 of
LNCS, pages 118–131. Springer, 2010.
[18] Chris Newcombe. Why Amazon chose TLA+. In Proceedings of the 4th International Con-
ference on Abstract State Machines, Alloy, B, TLA, VDM, and Z, pages 25–39. Springer,
2014.
[19] Amir Pnueli. The temporal logic of programs. In Proceedings of the 18th Annual Symposium
on Foundations of Computer Science, pages 46–57. IEEE, 1977.
[20] Pedro Silva, José N. Oliveira, Nuno Macedo, and Alcino Cunha. Quantitative relational
modelling with QAlloy. In Proceedings of the 30th ACM Joint European Software Engineer-
ing Conference and Symposium on the Foundations of Software Engineering, pages 885–896.
ACM, 2022.