Geometry of Voice Leading in Music
Princeton University
May 13, 2005
determines the kind of efficient voice-leadings it can participate in. Thus, for the first
time, we can precisely specify the way in which harmony and counterpoint are related.
p = 69 + 12log2(f/440) (1)
This creates a linear pitch space in which middle C is 60, an octave has size 12, and a
semitone (the distance between adjacent keys on a piano keyboard) has size 1. To create
circular pitch-class space, we identify all points p and p+12, forming the quotient space
R/12Z. (Here R refers to the set of real numbers and Z to the group of integers; the
notation R/12Z refers to the circular quotient space, whose points are the orbits of 12Z as
it acts on R.) This creates numerical equivalents for the familiar pitch-class letter names:
C=0, C♯/D♭=1, D=2, “D quarter-tone sharp”=2.5, and so on. Note that although we will
consider the most general case of a continuous pitch-class space, in musical situations
one is typically concerned with a lattice of discrete, equally-spaced points in this space,
corresponding to the familiar pitch-classes of Western equal-temperament.
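
As an illustration of equation (1) and the quotient construction, the following minimal sketch converts a frequency to a pitch number and then to a pitch class. The function names are hypothetical, chosen only for this example, and the representative in [0, 12) is one convenient choice among the points of each orbit.

```python
import math

def pitch_from_frequency(f_hz):
    """Equation (1): map a fundamental frequency in Hz to a real-valued pitch number."""
    return 69 + 12 * math.log2(f_hz / 440.0)

def pitch_class(p):
    """Project a pitch onto the circle R/12Z, using the representative in [0, 12)."""
    return p % 12.0

print(pitch_from_frequency(440.0))  # 69.0
print(pitch_class(60))              # 0.0 (middle C)
print(pitch_class(62.5))            # 2.5 ("D quarter-tone sharp")
```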
Formally, a chord is a multiset of pitch-classes, i.e. a set in which duplicates are
allowed. We will denote unordered multisets using curly braces: the C major chord is {0,
4, 7}, and the F-major chord is {0, 5, 9}. The musical term transposition is synonymous
with the mathematical term translation, and corresponds to addition in R/12Z. Two
chords are transpositionally equivalent if they are the same up to some translation in pitch-
class space. Thus the C-major chord and F-major chord are transpositionally equivalent,
since {0, 5, 9} = {7 + 5, 0 + 5, 4 + 5}. Symbolically, we write T5({0,4,7}) = {5, 9, 0}.
The musical term inversion is synonymous with the mathematical term reflection, and
corresponds to subtraction from a constant value in R/12Z. Two chords are inversionally
equivalent if they are the same up to some reflection in pitch-class space. Thus the C
major chord {0, 4, 7} is inversionally equivalent to the C minor chord {0, 3, 7} since {0,
3, 7} = {7 – 7, 7 – 4, 7 – 0}. We use Ix to refer to the reflection that sends 0 to x, writing
I7({0, 4, 7}) = {0, 3, 7}. Musically, transposition and inversion are significant because
they preserve an important aspect of the “quality” or “character” of a chord:
transpositionally-related chords sound extremely similar; inversionally-related chords,
somewhat less so.
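
The two operations can be sketched in a few lines, assuming chords are stored as tuples of numbers and compared after reduction mod 12 and sorting. The helper names (normalize, transpose, invert) are illustrative and not part of the paper's notation.

```python
def normalize(chord):
    """Return a chord as a sorted tuple of pitch classes in [0, 12); duplicates are kept."""
    return tuple(sorted(p % 12 for p in chord))

def transpose(chord, x):
    """T_x adds x to every pitch class."""
    return normalize(p + x for p in chord)

def invert(chord, x):
    """I_x sends p to x - p, so that 0 is sent to x."""
    return normalize(x - p for p in chord)

c_major = (0, 4, 7)
print(transpose(c_major, 5))  # (0, 5, 9), the F major chord
print(invert(c_major, 7))     # (0, 3, 7), the C minor chord
```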
A voice-leading between two chords {a1, a2, …, am} and {b1, b2, …, bn} is a
multiset of ordered pairs (ai, bj), such that every element of each chord is in some pair.
(Regular parentheses denote ordered lists.) A trivial voice-leading contains only pairs of
the form (x, x). We denote voice-leadings using vector notation A→B indicating that the
ith components of the vectors are associated by the voice-leading. Thus the voice-leading
(0, 0, 4, 7)→(11, 2, 5, 7) associates the root of the C major triad with both the third and
fifth of the G7 chord, while associating the third and fifth of the C major chord with the
seventh and root of the G7, respectively. This voice-leading can be interpreted either as a
non-bijective voice-leading between (0, 4, 7) and (11, 2, 5, 7) or as a bijective voice-
leading between (0, 0, 4, 7) and (11, 2, 5, 7).
Music theorists have proposed numerous ways of measuring the size of a voice-
leading. These measures are closely related to familiar mathematical norms, and include
the “taxicab norm,” the Euclidean norm, and a few exotic quasi-norms indigenous to music
theory (see the Materials and Methods). These proposals are at best approximations,
attempts to make explicit composers’ intuitions as embodied in Western musical practice.
For this reason we will not adopt any one method of measuring voice-leading size.
Instead, we will require only a normlike strict weak ordering of voice-leadings, satisfying
a few constraints that ensure it resembles a mathematical measure of “length” (see the
Materials and Methods). To date, every music-theoretical method of measuring voice-
leading size gives rise to a normlike strict weak ordering. It can be shown that for any
normlike strict weak ordering, there will be a minimal voice-leading between arbitrary
chords A→B that has no “voice-crossings”: that is, it is possible to move the elements of
A continuously to their counterparts in B such that no two paths coincide other than at
the endpoints of the process (see the Materials and Methods). Since Western composers
have traditionally avoided “voice-crossing” (Fig. 1[a-c]), this result suggests that
normlike strict weak orderings are at least consistent with observed features of Western
musical practice (see the Materials and Methods). Furthermore, it enables the use of the
standard computer-science technique of “dynamic programming” to identify, in
polynomial time, a minimal voice-leading between arbitrary chords (see the Materials
and Methods).
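
To make two of the measures mentioned above concrete, the sketch below computes the taxicab and Euclidean sizes of a voice-leading written in vector notation. It assumes, for illustration only, that each voice moves along the shortest arc between its two pitch classes; the function names are hypothetical.

```python
import math

def displacement(a, b):
    """Distance between pitch classes a and b on the circle R/12Z (assumption: shortest arc)."""
    d = abs(a - b) % 12
    return min(d, 12 - d)

def displacement_multiset(source, target):
    """Sorted list of per-voice distances for the voice-leading source -> target."""
    return sorted(displacement(a, b) for a, b in zip(source, target))

def taxicab(source, target):
    return sum(displacement_multiset(source, target))

def euclidean(source, target):
    return math.sqrt(sum(d * d for d in displacement_multiset(source, target)))

# The voice-leading (0, 0, 4, 7) -> (11, 2, 5, 7) discussed above.
print(displacement_multiset((0, 0, 4, 7), (11, 2, 5, 7)))  # [0, 1, 1, 2]
print(taxicab((0, 0, 4, 7), (11, 2, 5, 7)))                # 4
```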
space of two-note musical chords is a Möbius strip, a “square” whose left “edge” is
identified, modulo a half-twist, with its right. The orbifold is singular at its circular
boundary, which acts as a mirror (21).
Any voice-leading between dyads can be uniquely associated with a path on
Figure 2. A metric allows us to measure the length of these paths, while a normlike strict
weak ordering allows us to compare two paths without quantifying their “length.” The
paths corresponding to voice-leadings are the images of line-segments in the parent space
Tn. They are either line-segments in the orbifold, or “reflected” line-segments that
“bounce off” the orbifold’s mirror boundary. For example, on Figure 2, the voice-leading
(0, 1)→(1, 0) corresponds to the path that begins at (0, 1), moves in a straight line to (.5,
.5), and gets reflected back along the same line-segment to (0, 1) (Fig. 2). (To see why,
imagine each pitch-class moving continuously to its destination, starting and ending at
the same time.) “Reflected” voice-leadings contain voice-crossings, since the edge
contains all and only those chords with duplicate pitch-classes. The “no-crossing”
principle therefore asserts that there will be a minimal voice-leading between any two
points that does not touch the orbifold’s singular boundary.
Generalizing Figure 2 to higher dimensions is straightforward (see the Materials
and Methods). Given a Euclidean metric, the orbifolds Tn/Sn are simplicial prisms whose
faces are identified, modulo the orthogonal transformation that cyclically permutes the
vertices of one of the faces. (Figure 2 is a 2-dimensional prism [square] whose 1-
dimensional “faces” [left and right “edges”] have been identified modulo the reflection
that exchanges their vertices [a reflection, or 180° “twist” in the third dimension]. Three-
note chords lie on a three-dimensional prism whose faces are equilateral triangles. One
face is rotated 120° before the faces are identified; the resulting figure is the bounded
interior of a twisted triangular 2-torus.) The singular boundary of the orbifold acts as a
mirror and contains chords with duplicate pitch-classes. Chords that divide the octave
into n equal parts are at the center of the orbifold, while chords containing only one pitch-
class lie along its one-dimensional edge (Fig. 2). Voice-leadings represented by line-
segments parallel to the orbifold’s one-dimensional edge are not independent: each voice
moves in the same direction by the same amount. Voice-leadings perpendicular to the
boundary are independent, and preserve the sum of a chord’s pitch-classes. The above
descriptions assume a Euclidean metric; closely analogous statements hold for the
orbifolds with other metrics.
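
The distinction between the two kinds of motion can be sketched numerically. Assuming a Euclidean metric and a list of signed per-voice displacements, the hypothetical helper below splits a voice-leading into a component in which every voice moves by the same amount and a remainder whose displacements sum to zero.

```python
def decompose(displacements):
    """Split signed per-voice displacements into parallel and sum-preserving parts."""
    n = len(displacements)
    mean = sum(displacements) / n
    parallel = [mean] * n                            # every voice moves by the same amount
    perpendicular = [d - mean for d in displacements]  # leaves the sum of pitch classes unchanged
    return parallel, perpendicular

# (0, 4, 7) -> (0, 5, 9) moves the three voices by 0, +1, and +2 semitones.
print(decompose([0, 1, 2]))  # ([1.0, 1.0, 1.0], [-1.0, 0.0, 1.0])
# (4, 11) -> (5, 10) moves the voices by +1 and -1: purely sum-preserving.
print(decompose([1, -1]))    # ([0.0, 0.0], [1.0, -1.0])
```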
A bijective voice-leading from a chord to itself (a permutational voice-leading)
acts as a permutation of that chord’s elements. A chord that has duplicate pitch-classes
will be permutationally invariant (P-invariant), since there will be some nontrivial
permutation of its elements that yields a trivial voice-leading. P-invariant chords lie on
the boundary (or “singular locus”) of the voice-leading orbifold. Chords that are close to
the orbifold’s boundary can be described as nearly P-invariant, since they will have
efficient permutational voice-leadings that reflect off the nearby boundary (Fig. 2).
These voice-leadings are non-minimal, since they are larger than the trivial voice-leading;
they contain voice-crossings, since they touch the orbifold’s boundary.
Nearly P-invariant chords include “chromatic clusters” such as {0, 1}, {0, 1, 2},
and {4, 5, 6, 7}. Such chords, which are considered to be extremely dissonant, are well-
suited for static music in which voices move by small distances within an unchanging
harmonic context (Fig. 1[d]) (23). This sort of “permutational” voice-leading is
characteristic of much late-twentieth century non-tonal composition, particularly the
works of György Ligeti (24). Efficient “permutational” voice-leadings can also be used to
generate independent, relatively efficient voice-leadings A→Tx(A), where x is close to
zero.
Transposition by x semitones is an automorphism of the voice-leading orbifold
that preserves the “size” of voice-leadings according to any normlike strict weak
ordering: for any normlike strict weak ordering, the voice-leading (a1, a2, …, an)→(b1, b2,
…, bn) is the same size as (a1 + x, a2 + x, …, an + x)→(b1 + x, b2 + x, …, bn + x). A
transpositionally invariant (T-invariant) chord is a fixed point of one of these
automorphisms; for n-note chords, such fixed points exist only when nx is congruent to 0
mod 12. Chords lying close to these T-invariant chords can be described as nearly T-
invariant, since there will be multiple transpositions of such chords located near any
single fixed point. These transpositionally-related chords can therefore be connected by
efficient voice-leadings.
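
A direct check of T-invariance can be sketched as follows, assuming chords are compared as sorted tuples of pitch classes mod 12. The function names are illustrative.

```python
def normalize(chord):
    """Sorted tuple of pitch classes in [0, 12); duplicates are kept."""
    return tuple(sorted(p % 12 for p in chord))

def is_t_invariant(chord, x):
    """True if transposing the chord by x semitones gives back the same chord."""
    return normalize(chord) == normalize(p + x for p in chord)

print(is_t_invariant((4.5, 10.5), 6))  # True: a tritone is fixed by T6
print(is_t_invariant((0, 4, 8), 4))    # True: the augmented triad is fixed by T4
print(is_t_invariant((0, 4, 7), 4))    # False: the major triad is only nearly T4-invariant
```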
On Figure 2, the T6-related perfect fifths {4, 11} and {5, 10} lie close to the same
T6-invariant tritone, {4.5, 10.5}. This accounts for the small voice-leading (4, 11)→(5,
10). By transposing the second chord down by a semitone, we can obtain a fairly efficient
voice-leading (4, 11)→(4, 9). This voice-leading occurs between the top and lower-
middle voices of the first two chords in Fig. 1(b); analogous voice-leadings connect the
top and lower-middle voices of the remaining chords in the progression. (The other two
voices are linked by the smooth voice-leading between T5-related tritones, which appears
on Figure 2 as rightward motion.) Similarly, on the three-dimensional voice-leading
orbifold the T4-related C and E major triads lie close to the same T4-invariant augmented
triad; for this reason, they are connected by a small voice-leading (0, 4, 7)→(11, 4, 8).
Again, by transposing the second chord up by semitone, this voice-leading generates a
fairly efficient voice-leading between T5-related major triads; the result is the first voice-
leading shown in Figure 1(a). The remaining voice-leadings in Figure 1(a) can all be
derived from this one by transposition and time-reversal.
T-invariance is due to the evenness with which a chord’s elements are distributed
in pitch-class space. A T-invariant chord either divides the octave into equal parts, and
occupies the center of the orbifold, or is the union of equally-sized chords that themselves
divide the octave evenly (25). (The union of differently-sized chords that evenly divide
the octave is not in general T-invariant.) Likewise, a near T-invariant chord divides the
octave into nearly-equal parts, or is the union of n-note chords that do so. In general, the
more evenly-spaced a chord, the closer it will be to the center of the orbifold, and the
smaller will be its bijective voice-leadings to its T-equivalent forms (see the Materials
and Methods). Indeed, it can be shown that the chord which divides pitch-class space
into n equal parts has the smallest possible minimal bijective voice-leading to all of its
transpositions: for all n-note chords A, the minimal bijective voice-leading between A
and Tx(A) can be no smaller than the minimal bijective voice-leading between E and
Tx(E), where E divides pitch-class space into n equal parts (see the Materials and
Methods). A corollary covers the discrete case of a finite evenly-tempered
pitch-class space (see the Materials and Methods).
This fact has a singularly important musical consequence: “acoustically
consonant” chords tend to be nearly T-invariant. Acoustic consonance is incompletely
understood; however, most music theorists agree that chords approximating the first few
consecutive pitch-classes of the harmonic series will be consonant when played with
harmonic tones (20). Remarkably, the structure of the harmonic series ensures that such
chords will divide the octave into nearly-even parts (Table 1). The relation between
acoustic-consonance and near-evenness has had an enormous impact on the development
of traditional Western music. The near-evenness of traditional Western harmonic
materials implies that these chords are clustered near the center of the voice-leading
orbifold; for this reason, there exist transpositions of these chords that can be linked by
efficient, independent voice-leadings. This is true whether the chords are
transpositionally equivalent (Fig. 1[a-b]) or transpositionally distinct (Fig. 1[c]).
Traditional tonal counterpoint, in its essence, consists in the exploitation of these efficient
voice-leadings. They exist because of the near-evenness of the underlying sonorities, a
property which is in turn attributable to classical composers’ interest in acoustic
consonance.
Finally, inversions (or reflections) are automorphisms of the voice-leading
orbifold that again preserve the “size” of voice-leadings according to any normlike strict
weak ordering. Inversionally invariant (I-invariant) chords are fixed points of some
reflection; such fixed points exist for any Ix. A chord that lies near an I-invariant chord is
nearly I-invariant, since there will be two I-related chords lying close to the same I-
invariant chord; this again permits small voice-leadings between them. For example, the
F♯ “half-diminished seventh” chord {6, 9, 0, 4} and the F “dominant seventh” chord {5,
9, 0, 3} lie near the same (I-invariant) chord {5.5, 9, 0, 3.5}: this permits the efficient
voice-leading (6, 9, 0, 4)→(5, 9, 0, 3), shown in Fig. 1(c). I-invariant chords can be
highly consonant, like (0, 3, 7, 10), or highly dissonant, like (0, 1, 2, 3). However,
composers in most Western styles have considered I-invariant chord pairs to be “similar.”
Consequently, they have frequently exploited efficient voice-leadings between
inversionally-related chords.
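
The inversional relationships described here can be verified with a short sketch, again assuming chords are compared as sorted tuples of pitch classes mod 12. The helper names are hypothetical.

```python
def normalize(chord):
    """Sorted tuple of pitch classes in [0, 12); duplicates are kept."""
    return tuple(sorted(p % 12 for p in chord))

def invert(chord, x):
    """I_x sends p to x - p."""
    return normalize(x - p for p in chord)

def is_i_invariant(chord, x):
    """True if the reflection I_x maps the chord onto itself."""
    return normalize(chord) == invert(chord, x)

print(is_i_invariant((0, 3, 7, 10), 10))  # True
print(is_i_invariant((0, 1, 2, 3), 3))    # True
# I9 maps the F-sharp half-diminished seventh onto the F dominant seventh {5, 9, 0, 3}.
print(invert((6, 9, 0, 4), 9))            # (0, 3, 5, 9)
```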
Thus we see that the geometrical properties of the orbifolds Tn/Sn give rise to a
wide range of related musical practices, each of which exploits different symmetries that
a chord might have. Our discussion suggests multiple avenues of further music-
theoretical inquiry. First, one could investigate in detail the ways in which Western
composers, performers, and improvisers have exploited the three symmetries that can
produce small voice-leading: for example, Schubert was fond of the near T4-invariance of
the major triad (27), Wagner and Debussy exploited the near I-invariance of the
“dominant seventh chord” (Fig. 1[c]), while contemporary jazz harmony frequently
exploits the near T-symmetry of the perfect fifth (Fig. 1[b], top and lower-middle voice).
Second, one could investigate how the mathematical properties described in this paper
have influenced the broader course of music history—examining how the concern for
efficient voice-leading interacted with, and presumably helped motivate, the increasing
“chromaticism” of nineteenth-century music. Third, one could investigate whether
distances in the voice-leading orbifold correlate with perceptual judgments of similarity
among chords—a topic of considerable recent theoretical interest (28). Finally, a clear
understanding of the relation between chord structure and voice-leading may suggest new
techniques to contemporary composers.
NOTES
1. For a glossary of mathematical and musical terms and abbreviations used in this paper,
see Tables S1 and S2.
2. C. Masson, Nouveau Traité des Regles pour la Composition de la Musique (Da Capo,
New York, 1967 [1694]).
3. O. Hostinský, Die Lehre von den musikalischen Klängen (H. Dominicus, Prague,
1879).
9. J. Roeder. A Theory of Voice Leading for Atonal Music. Ph.D. thesis, Yale University
(1984).
17. J. Douthett, P. Steinbach. Journal of Music Theory 42, 241 (1998).
20. I. Satake. Proceedings of the National Academy of Sciences 42, 359 (1956).
26. W. Sethares. Tuning, Timbre, Spectrum, Scale (Springer, New York, 1998).
[Figure 1: staff notation not recoverable from the source; the panel labels and pitch-class labels follow.]
a) a common classical upper-voice I-IV-I-V-I pattern
   (0, 4, 7) → (0, 5, 9) → (0, 4, 7) → (11, 2, 7) → (0, 4, 7)
   I           IV          I           V            I
b) a common jazz-piano “left-hand” voice-leading pattern
   (6, 11, 0, 4) → (5, 9, 11, 4) → (4, 9, 10, 2) → (3, 7, 9, 2)
   D7              G7              C7              F7
c) Wagner, Parsifal (simplified) and Debussy, Prelude to the Afternoon of a Faun
(11, 0, 1) → (0, 1, 11) → (11, 1, 0)
Figure 2. The orbifold T2/S2, drawn using a Euclidean metric. Labelled points
in the space correspond to equal-tempered dyads; the symbols “t” and “e” refer
to 10 and 11, respectively. The left “edge” is identified, with a half-twist, with
the right. The two voice-leadings (0, 1)→(1, 0) and (4, 11)→(5, 10) are shown
on the graph; the first of these is reflected off the figure’s mirror boundary.
Number of notes      Equal-tempered chord providing the           Other chords providing reasonably
                     best approximation to the lowest             good approximations to the lowest
                     pitch-classes of the harmonic series         pitch-classes of the harmonic series

2 (dyad)             fifth (0, 7)
3 (triad)            major (0, 4, 7)                              diminished (0, 3, 6)
                                                                  minor (0, 3, 7)
                                                                  augmented (0, 4, 8)
4 (seventh chords)   dominant (0, 4, 7, 10)                       diminished (0, 3, 6, 9)
                                                                  half-diminished (0, 3, 6, 10)
                                                                  minor (0, 3, 7, 10)
                                                                  major (0, 4, 7, 11)
5 (ninth chords)     dominant ninth (0, 2, 4, 7, 10)              pentatonic (0, 2, 4, 7, 9)
7 (scales)           melodic minor, ascending form                major (0, 2, 4, 5, 7, 9, 11)
                     (0, 2, 4, 6, 7, 9, 10)                       harmonic minor (0, 2, 3, 6, 7, 9, 10)
Table 1. Familiar sonorities used in Western music. The sonorities on the left provide
the best equal-tempered approximations to the first n pitch-classes of the harmonic series.
The commonly-used sonorities on the right also approximate the first n pitch-classes
of the harmonic series. All sonorities divide pitch-class space fairly evenly.
MATERIALS AND METHODS
TABLE OF CONTENTS
1. Comparing voice-leadings
2. Minimal voice-leadings and voice-crossings
3. A polynomial-time algorithm for finding a minimum voice-leading between two chords
4. Derivation of the voice-leading orbifolds
5. Efficient voice-leading and symmetry
6. Evenness and transpositional invariance
Let > be a strict weak order of multisets of nonnegative reals. We will say that
the relation > is normlike if and only if it satisfies two constraints.
{x1, x2, …, xm, c} > {y1, y2, …, yn, c} implies {x1, x2, …, xm} > {y1, y2, …, yn} (Recursion)
{x1 + i, x2, …, xn} ≥ {x1, x2 + i, …, xn} ≥ {x1, x2, …, xn}, for x1 > x2, i > 0 (Distribution)
(NB: since multisets are unordered the numerical subscripts do not have ordinal
significance: x1 is no more “first” than x2 or xn.) The recursion constraint mandates a
predictable relationship between the size of a multiset and the size of its sub-multisets.
The distribution constraint’s first inequality requires that if X is an n-element multiset
whose values sum to x, then {x, 0, 0, …, 0} ≥ X ≥ {x/n, x/n, …, x/n}. Thus, x semitones
of motion in a single “voice” yields at least as large a voice-leading as x semitones of
motion distributed over multiple voices. As we will see below, this constraint is closely
related to the triangle inequality. The distribution constraint’s second inequality requires
that reducing the size of an element in a displacement multiset not make that multiset
larger. If a normlike strict weak order strictly satisfies both of the distribution
constraint’s inequalities, we will say that it strictly satisfies the distribution constraint.
At present, every music-theoretical method of measuring voice-leading size
produces a normlike strict weak order of multisets of non-negative reals. All but one
strictly satisfy the distribution constraint.
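
As a numerical illustration, the sketch below checks one instance of the distribution constraint for two familiar measures applied to a displacement multiset. The function names are hypothetical, and a single instance is of course not a proof; integer values are used so that the comparisons are exact for the taxicab case.

```python
import math

def taxicab(multiset):
    return sum(multiset)

def euclidean(multiset):
    return math.sqrt(sum(x * x for x in multiset))

def distribution_holds(size, x1, x2, rest, i):
    """Check {x1+i, x2, ...} >= {x1, x2+i, ...} >= {x1, x2, ...} for one instance."""
    assert x1 > x2 and i > 0
    concentrated = size([x1 + i, x2, *rest])  # extra motion given to the larger element
    spread = size([x1, x2 + i, *rest])        # extra motion given to the smaller element
    original = size([x1, x2, *rest])
    return concentrated >= spread >= original

for size in (taxicab, euclidean):
    print(size.__name__, distribution_holds(size, 3, 1, [2], 1))
```

In this instance the taxicab measure gives equality in the first inequality (7 and 7), while the Euclidean measure gives a strict inequality.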
C. “Parsimony.” Parsimony generalizes a notion introduced by Richard Cohn and
developed by Jack Douthett and Peter Steinbach (S5, S6). Given two voice-
leadings, α and β, α is smaller (or “more parsimonious”) than β iff there exists
some real number j such that
1) for all real numbers i > j, i appears the same number of times in the
displacement multisets associated with α and β; and
2) j appears fewer times in the displacement multiset of α than β.
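
The two conditions amount to comparing the displacement multisets from their largest values downward. A minimal sketch, assuming each displacement multiset is given as a list of numbers and using a hypothetical function name:

```python
from collections import Counter

def more_parsimonious(alpha, beta):
    """True if displacement multiset alpha is smaller than beta under parsimony."""
    count_a, count_b = Counter(alpha), Counter(beta)
    # Scan values from largest to smallest; the first value whose counts differ is j.
    for value in sorted(set(count_a) | set(count_b), reverse=True):
        if count_a[value] != count_b[value]:
            return count_a[value] < count_b[value]
    return False  # identical multisets: neither is smaller

print(more_parsimonious([1, 1, 1], [2, 0, 0]))  # True
print(more_parsimonious([2, 0, 0], [1, 1, 1]))  # False
```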
THEOREM 1. Let A and B be any two chords, and let our measure of voice-
leading size be a strict weak order satisfying the distribution constraint. There will
exist a minimal voice-leading from A to B, (a1, a2, …, an)→(b1, b2, …, bn), that has
no “voice-crossings” in pitch-class space. That is, there will exist a set of
continuous functions fi(t) such that fi(0) = ai, fi(1) = bi, and fi(t) ≠ fj(t), for all i
≠ j, and all t such that 0 < t < 1. Furthermore, if our order strictly satisfies the
distribution constraint, then every minimal voice-leading between A and B will be
crossing-free.
Figure S3(c) shows a third possibility. m + n > x, since otherwise there would be
no crossing. This implies x – m < n and x – n < m. Therefore {m, n} ≥ {x – m, x – n},
and the uncrossed voice-leading is no larger than the crossed voice-leading. Again, if the
strict weak order strictly satisfies the distribution constraint then the uncrossed voice-
leading is smaller.
The remaining cases are closely analogous to those already considered, and are
left for the interested reader to verify. It remains to be shown that we can follow the
above procedures without creating any new voice-crossings. This is readily seen from
Figure S4. Without loss of generality, we can choose points b1 and b2 in Figure S4 to be
adjacent. We connect every note in the source chord to its destination by a path that has
no unnecessary crossings, as in Figure S4. Figure S4(a) features the crossing (a1,
a2)→(b2, b1), as well as two additional types of voice-crossing: c1→d1, which crosses the
line a1→b2, and c2→d2, which crosses both a1→b2 and a2→b1. Figure S4(b), which
removes the crossing (a1, a2)→(b2, b1), shows that the remaining crossings c1→d1 and
c2→d2 are unaffected. Removing the crossing therefore reduces the total number of
voice-crossings in the voice-leading. The crossings shown in Figure S4, along with those
that can be obtained from this figure by reflection, exhaust the relevant geometrical
possibilities. We conclude that it is possible to remove a voice-leading’s crossings without
making the voice-leading larger. If our normlike strict weak order strictly obeys the
distribution constraint, then removing voice-crossings will always make the voice-leading
smaller.
Theorem 1 is significant because it ties an important musical notion, “voice-
crossing,” to an important mathematical one: the triangle inequality, as represented by its
close cousin, the distribution constraint. It is widely accepted that avoidance of “voice-
crossings” in pitch space is a feature of traditional Western compositional practice (S7).
Theorem 1, which can easily be adapted to cover the case of voice-leadings in non-
circular pitch space, shows that normlike strict weak orderings are compatible with this
feature of classical practice. Moreover, it is easy to show that if a method of comparing
voice-leading size violates the distribution constraint, then there will be at least one
“crossed” voice-leading (in either pitch or pitch-class space) that is preferred to its
uncrossed alternative. Thus the distribution constraint and the principle of avoiding
voice-crossings are equivalent within the limits of the formalism we have developed.
At the same time, the distribution constraint is closely related to the triangle
inequality. This allows us to use the minimal voice-leading between two chords to define
a “distance” between them, thereby underwriting the geometrical approach of the present
paper. Again, Theorem 1 is interesting precisely because it shows that our reference to
the geometrical concept of “distance” requires that we not prefer crossed voice-leadings
to their uncrossed alternatives. Consequently, were classical composers to have favored
voice-crossings, we would not be able to speak of the “distance” between chords
in the relatively straightforward way that we do here. We would be constrained to talk
only about the affine structure of musical chords—roughly, those non-metric properties
that depend only on the existence of “straight lines” in the space.
We conclude this section with a brief sketch of a proof that the distribution
constraint is equivalent to the triangle inequality. Let A and C be chords. The triangle
inequality requires that a bijective voice-leading A→C be no larger than the combined length
of any pair of bijective voice-leadings A→B and B→C that take A to C by way of B in
such a way as to preserve the mappings of the “direct” voice-leading A→C. It is
straightforward to identify the displacement multiset associated with A→B→C when A,
B, and C are collinear: one simply adds the elements of the displacement multisets
associated with A→B and B→C so as to be faithful to the musical voices’ motions. The
displacement multiset associated with non-collinear A→B→C, if defined, is simply the
displacement multiset associated with A→B→D, with A, B, and D collinear and B→C
the same size as B→D. A normlike strict weak ordering does not ensure that there is a
displacement multiset associated with all paths A→B→C; but it does ensure that, if there is, it
is no smaller than that associated with the direct voice-leading A→C.
To see why, suppose there is some crossed voice-leading between chords A and C
that is preferred to the uncrossed voice-leading A→C. There will be a pair of voice-
leadings A→B→C that has the same combined displacement multiset as the crossed
voice-leading but which preserves the mappings of the “direct” voice-leading A→C.
(Here B is the point where the two voices cross as they move linearly from notes in A to
their counterparts in C.) Since the crossed voice-leading is preferred, the combined
voice-leadings A→B→C are smaller than A→C, which violates the triangle inequality.
Conversely, suppose there is a triangle ABC such that the combined voice-leadings
A→B→C are smaller than the “direct” voice-leading A→C. There is a voice-leading
A→D with the same displacement multiset as A→B→C. Since A→B→C form two legs
of a triangle, it is easy to show that the preference for A→D over A→C must violate the
distribution constraint.
form [a1, …, ai]→[b1, … bj] will be the voice-leading that adds the pair (ai, bj) to the
smallest voice-leading of the form [a1, …, ai-1]→[b1 … bj], [a1, …, ai]→[b1 … bj-1], or [a1,
…, ai-1]→[b1 … bj-1].
Thus, once we have fixed the pair (a1, b1) we can recursively compute the minimal
voice-leading between A and B that contains that pair. We do this by creating a matrix
whose cells ei, j record the size of the minimal voice-leading of the form [a1, …, ai]→[b1,
… bj]. It is trivial to fill in the first row and column of the matrix; from there, we can
proceed to fill in the rest. At each step, we need only consider the voice-leadings in a
cell’s upper, left, and upper-left neighbors.
Figure S5 illustrates the technique, identifying the smallest voice-leading between
the C and E major-seventh chords, {4, 7, 11, 0} and {4, 8, 11, 3}, such that the voice-
leading contains the pair (4, 4). In constructing this matrix we have used “smoothness”
(or “taxicab norm”) to measure the voice-leading size. The voice-leading in the bottom-
right cell is the minimal voice-leading between the two chords that contains (4, 4). To
remove this last restriction, we would need to repeat the calculation three more times,
each time cyclically permuting the order of one of the chords so as to fix a different
initial pair. As it happens, however, the voice-leading shown in Figure S5 is the
minimum voice-leading between the respective chords. This follows from the fact that
the voice-leading in the top-left cell (4→4) contributes nothing to the overall size of the
voice-leading; we can therefore add this mapping to any voice-leading without increasing
its size according to the L1 norm.
Figure S5 includes in each cell both the numerical size of the voice-leading and
the voice-leading itself. With the L1 norm (“smoothness”) this is unnecessary: we need to
keep track of the size, but not the voice-leading. To determine the value of cell ei,j we can
simply add the distance between the pair (ai, bj) to the minimum value in the cells ei-1, j,
ei, j-1, and ei-1, j-1. (With the Euclidean metric we can calculate squared distance in this
way, taking the square-root just before output.) Having filled in the matrix, we can
recover the minimal voice-leading by “tracing back” all paths that move from the bottom-
right cell to the top left, moving only north, west, and northwest, such that the size of the
voice-leading decreases as much as possible with each step. The cells in boldface
indicate the path that such a traceback algorithm would take.
Due to the circular structure of pitch-class space, the voice-leading in the lower
right-hand corner of the matrix counts the pair (a1, b1) = (am+1, bn+1) twice; this can easily
be corrected prior to output.
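
The matrix-filling procedure can be sketched as follows. This is a simplified illustration under stated assumptions rather than a reproduction of Figure S5: it uses the taxicab norm, tracks only sizes (no traceback), expects each chord as an ascending list of pitches spanning less than an octave with the fixed pair first, and measures the distance between two notes as the absolute difference of those pitches. The function name is hypothetical.

```python
def minimal_voice_leading_size(a, b):
    """Size of the smallest voice-leading a -> b that contains the pair (a[0], b[0])."""
    # Close the circle by repeating the first note of each chord an octave higher.
    a = list(a) + [a[0] + 12]
    b = list(b) + [b[0] + 12]
    m, n = len(a), len(b)
    e = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            cost = abs(a[i] - b[j])
            if i == 0 and j == 0:
                best = 0
            elif i == 0:
                best = e[0][j - 1]          # first row: only the left neighbor exists
            elif j == 0:
                best = e[i - 1][0]          # first column: only the upper neighbor exists
            else:
                best = min(e[i - 1][j], e[i][j - 1], e[i - 1][j - 1])
            e[i][j] = cost + best
    # The bottom-right cell counts the pair (a[0], b[0]) twice; remove one copy.
    return e[m - 1][n - 1] - abs(a[0] - b[0])

# C major to F major with the pair (0, 0) fixed.
print(minimal_voice_leading_size([0, 4, 7], [0, 5, 9]))  # 3
# The chords {4, 7, 11, 0} and {4, 8, 11, 3}, written as ascending pitches from 4.
print(minimal_voice_leading_size([4, 7, 11, 12], [4, 8, 11, 15]))
```

Because the recursion allows a note to appear in more than one pair, the result may correspond to a non-bijective voice-leading.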
Finally, note that we need only consider n distinct possibilities to find a minimal
bijective voice-leading A→B. Let (a0, a1, … , an-1) order the elements of chord A based on
ascending distance from arbitrarily-chosen element a0. Similarly for (b0, b1, … , bn-1). By
Theorem 1, there will be a minimal bijective voice-leading between A and B of the form
(a0, a1, … , an-1)→(bc, bc+1, … , bc+n-1), where c is an integer and the subscript arithmetic is
reduced mod n.
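
A minimal sketch of this search over n rotations follows. It assumes the taxicab norm, orders each chord by ascending pitch class, and measures each voice along the shortest arc of the pitch-class circle; these choices and the function names are illustrative assumptions.

```python
def arc(a, b):
    """Shortest distance between two pitch classes on the circle R/12Z."""
    d = abs(a - b) % 12
    return min(d, 12 - d)

def minimal_bijective_voice_leading(a, b):
    """Try the n cyclic rotations of b against a; return (size, pairs) for the smallest."""
    a = sorted(p % 12 for p in a)
    b = sorted(p % 12 for p in b)
    best = None
    for c in range(len(a)):
        rotated = b[c:] + b[:c]
        size = sum(arc(x, y) for x, y in zip(a, rotated))
        if best is None or size < best[0]:
            best = (size, list(zip(a, rotated)))
    return best

# C major to E major: recovers the voice-leading (0, 4, 7) -> (11, 4, 8).
print(minimal_bijective_voice_leading((0, 4, 7), (4, 8, 11)))
# (2, [(0, 11), (4, 4), (7, 8)])
```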
space S is a region R of S, such that S is the union of the regions gR, for all g ∈ Γ, and such
that the intersection of any two regions gR and hR, for g ≠ h, has no interior.) By
identifying the appropriate boundary points of this fundamental domain, we will obtain
the orbifold Rn/(Sn × 12Zn).
We first describe a fundamental domain of Sn in Rn. In this region, no two
distinct points (x1, x2, …, xn) and (y1, y2, …, yn) have coordinates that are equivalent
under some permutation: that is, there is no σ(n) such that (x1, x2, …, xn) = (yσ(1), yσ(2),…,
yσ(n)), where σ(n) is some permutation of the integers from 1 to n. We can create such a
region simply by requiring that a point’s coordinates be in nondescending order: i.e.
considering all points (x1, x2, …, xn) such that x1 ≤ x2 ≤ … ≤ xn. We can incorporate the
12Zn action by requiring that xn ≤ x1 + 12, and 0 ≤ Σnxn≤ 12. In Euclidean space, the
resulting fundamental domain is a right hyperprism whose faces are n-1 dimensional
simplexes. To see why, observe that
1. The n inequalities x1 ≤ x2 ≤ … ≤ xn ≤ x1 + 12 define an n-1 simplex in every
plane Σnxn = n.
2. Addition by (c, c, …, c) sends the simplex in the plane Σnxn = n to the simplex in
the plane Σnxn = n + cn.
3. The planes Σnxn = n are perpendicular to the vector (1, 1, …, 1).
The vector (1, 1, …, 1) points in the direction of the “height” coordinate of the prism; the
prism’s “faces” lie in planes perpendicular to the vector (1, 1, …, 1) and therefore contain
chords whose pitch-classes sum to the same value.
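
The inequalities defining the fundamental domain suggest a simple way to find a representative point for any chord. The sketch below, with a hypothetical function name, starts from the pitch classes sorted in [0, 12) and repeatedly moves the top coordinate down an octave to the front of the list, which lowers the sum by 12 while keeping the coordinates in nondescending order.

```python
def fundamental_domain_point(chord):
    """Return coordinates x1 <= ... <= xn <= x1 + 12 with 0 <= sum <= 12 for the chord."""
    x = sorted(p % 12 for p in chord)
    while sum(x) > 12:
        # Lower the highest note by an octave; it becomes the new lowest coordinate.
        x = [x[-1] - 12] + x[:-1]
    return x

print(fundamental_domain_point((0, 4, 7)))   # [0, 4, 7]
print(fundamental_domain_point((7, 11, 2)))  # [-1, 2, 7]
```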
Our construction of the fundamental domain ensures that no two points on a
single plane Σnxn = n can represent the same chord. However, the planes do contain
is an orthogonal transformation that is an automorphism of the prism: it is a rotation
when the prism has an odd number of dimensions, and a rotation-plus-reflection
otherwise. O acts so as to cyclically permute the vertices of the simplex in each plane
of constant coordinate sum.
It remains to be determined how the two simplicial faces of the prism are to be
identified. We cannot identify them in the obvious way, since this would identify point
(x1, x2, …, xn) on the Σxi = 0 face of the prism (where Σxi denotes the sum of the coordinates) with the transpositionally-distinct chord (x1
+ 12/n, x2 + 12/n, …, xn + 12/n) on the Σxi = 12 face. Notice, however, that
(x2, x3, …, xn, x1 + 12) represents the same chord as (x1, x2, …, xn). We therefore need to
identify (x1, x2, …, xn) with O(x1 + 12/n, x2 + 12/n, …, xn + 12/n) = (x2, x3, …, xn, x1 +
12). Colloquially, we apply the transformation O to the Σxi = 12 face before “gluing the
two faces together.” This identification transforms the prism’s “height coordinate” into a
circle: in moving parallel to the vector (1, 1, …, 1) we pass through all and only the
transpositions of a given chord, returning eventually to our starting point. Thus we can
describe the orbifold (R/12Z)n/Sn as the product of an (n−1)-simplex with a circle, modulo
the action that rotates the circle by 360/n degrees while applying the transformation O to
the simplex.
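The gluing rule can be checked numerically. The sketch below assumes the explicit formula O(y1, …, yn) = (y2 − 12/n, …, yn − 12/n, y1 + 12 − 12/n), which is inferred here from the identity O(x1 + 12/n, …, xn + 12/n) = (x2, x3, …, xn, x1 + 12) stated above; the function name apply_O is hypothetical.

```python
from fractions import Fraction


def apply_O(point):
    """Apply the assumed form of O: rotate the coordinates, raise the old
    first coordinate by an octave, then lower every coordinate by 12/n.
    The coordinate sum is unchanged, so each plane maps to itself."""
    n = len(point)
    shift = Fraction(12, n)
    ys = [Fraction(y) for y in point]
    rotated = ys[1:] + [ys[0] + 12]
    return [v - shift for v in rotated]


# A point on the sum-0 face and its image on the sum-12 face.
x = [Fraction(-5), Fraction(1), Fraction(4)]
top = [xi + Fraction(12, len(x)) for xi in x]  # (-1, 5, 8), sum 12
glued = apply_O(top)                           # (1, 4, 7) = (x2, x3, x1 + 12)
```

Under this assumed formula, applying O n times returns every point to itself, consistent with O cyclically permuting the n vertices of the simplex.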
5. Efficient voice-leading and symmetry. Let A be an n-note chord and let (a1, a2, …,
an) be an arbitrary ordering of its elements. The symbol σ(a1, a2, …, an) will refer to the
ordering (aσ(1), aσ(2),…, aσ(n)), where σ(n) is some permutation of the integers from 1 to n.
We will use the notation A→σ(A) to refer to any voice-leading A→A that can be written
(a1, a2, …, an)→σ(a1, a2, …, an).
An arbitrary n-note chord S will be invariant under σ (or σ-invariant) if the chord’s
elements can be labeled so that si = sσ(i) for all i ≤ n.
In what follows, we will use the variable O to refer to a specific permutation σ,
transposition Tx, or inversion Iy. We will say that an n-note chord S is invariant under O
if there is some voice-leading S→O(S) that is trivial. We will generally assume that O
itself is non-trivial: that is, there is at least one chord that is not invariant under O. Thus
we will not be considering the trivial permutation σ(n) = n or the trivial transposition
T0(x) = x.
It is intuitively obvious that the size of a voice-leading A→S, where S is invariant
under some O, sets an upper bound on the size of the minimal voice-leading A→O(A).
This is because we can express the voice-leading A→O(A) as the composition of two
equally-sized voice-leadings A→S and S→O(A). (For any A→S, we can find an equally
large S→O(A), since S is invariant under O and since a normlike strict weak order is
insensitive to the “direction” of the voice-leading.) Write the displacement multiset
corresponding to A→S as {d1, d2, …, dn}. We can conclude that the minimal voice-
leading A→O(A) can have a displacement multiset no larger than {2d1, 2d2, …, 2dn}.
Thus as the size of the voice-leading A→S goes to zero, the minimal voice-leading
A→O(A) must also go to zero.
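A small numerical check of this bound, for the C major triad A = {0, 4, 7}, the T4-invariant augmented triad S = {0, 4, 8}, and O = T4. The sketch assumes that size is measured by the sum of the displacements, which is only one of the measures the argument covers, and it finds minimal voice-leadings by brute force over all bijections; the helper names are hypothetical.

```python
from itertools import permutations


def pc_norm(x):
    """Norm |x|_12Z: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def min_size(a, b):
    """Smallest total displacement over all bijective voice-leadings a -> b."""
    return min(
        sum(pc_norm(q - p) for p, q in zip(a, perm))
        for perm in permutations(b)
    )


A = [0, 4, 7]                        # C major
S = [0, 4, 8]                        # augmented triad, invariant under T4
T4_A = [(p + 4) % 12 for p in A]     # E major: (4, 8, 11)

size_A_S = min_size(A, S)            # one voice moves by one semitone
size_A_T4A = min_size(A, T4_A)       # (0, 4, 7) -> (11, 4, 8)
bound_holds = size_A_T4A <= 2 * size_A_S
```

Here the bound is attained exactly: A→S has size 1 and the minimal A→T4(A) has size 2.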
The converse, however, is less obvious. Suppose we have some bijective voice-
leading A→O(A). Does the size of A→O(A) set an upper bound on the size of the
minimal voice-leading A→S, where S is O-invariant?
Yes, assuming such an S exists. The following theorem uses the size of A→O(A)
to limit the size of A→S, showing that as A→O(A) vanishes so must A→S. Since the
result is proven for any normlike strict weak order, it does not set a very tight (or
interesting) limit on the voice-leading A→S. However, it does establish the general
theoretical point that the size of A→O(A) is dependent on that of A→S.
Lemma 2.1. Let A be a chord with n elements, let x be some element of R/12Z
such that nx is congruent to 0, mod 12Z, and let A→σ(A) be a bijective voice-
leading that acts as a cyclical permutation of A’s elements. Label the pitch-
classes of A so that the voice-leading A→σ(A) can be written
Proof. The value di = |x + (ai+1 – ai)|12Z measures how close the interval ai+1 – ai is to –x.
As Figure S8 shows, we need to move at most ⌊n/2⌋ pitch-classes by |x + (ai+1 – ai)|12Z
semitones in order to make a given interval ai+1 – ai equal to –x. We can do so,
furthermore, without disturbing any of the other di, for i < n-1. Only the intervals ai+1 – ai
and dn-1 = W need be disturbed. Since the voice-leading acts as a circular permutation,
and since nx is congruent to 0, mod 12Z, we need iterate this procedure only n–1 times in
order to obtain a set that is invariant under Tx or σ (if x = 0): once we set n–1 of the
intervals equal to –x, the final “wraparound” interval (labeled W in Figure S8) will
also be equal to –x. Note that since our choice of a1 is arbitrary, we can choose W so as
to minimize the resulting voice-leading A→S.
with displacement multiset {d0, d1, …, dn-1} (subscript arithmetic is mod n). We
can therefore find an S such that S is invariant under Ix and the voice-leading
A→S has displacement multiset no larger than
voice-leading associates ac/2 with x – ac/2, and i ≠ c/2, in which case the voice-leading
associates ai with x – ac-i and ac-i with x – ai.
Case 1. i = c/2. Our voice-leading associates ai with x – ai; the distance between
these two points is |x – 2ai|12Z. Now consider the two minimal-length linear paths in pitch-class
space: the first from ai to x – ai and the second its retrograde, from x – ai to ai.
These paths are reflection symmetrical under Ix: every point ai + ε along the path ai→(x –
ai) is mapped by Ix to the point x – (ai + ε) along the path (x – ai)→ai. Therefore, the
midpoint af = x – af is fixed by the reflection. Consequently, we can move ai by |x –
2ai|12Z/2 semitones to obtain a pitch-class that is invariant under Ix.
Case 2. i ≠ c/2. Let j = c – i. Our voice-leading associates ai with x – aj and aj
with x – ai. Both ai and aj are mapped to pitch-classes |x – (ai + aj)|12Z semitones away.
Consider the two minimal linear paths ai→x – aj and aj→x – ai. If we reverse the
direction of the second path, we obtain two equal-length paths ai→x – aj and x – ai→aj
that are reflection-symmetrical under Ix: every point ai + ε along the path from ai→x – aj
is mapped to the point x – (ai + ε) along the path x – ai→aj. The points halfway along
these paths, af and x – af, are related by Ix. Therefore, we can move each pitch-class by
|x – (ai + aj)|12Z/2 semitones to obtain a pair that is invariant under Ix.
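The halfway construction used in both cases can be sketched as follows. This is a minimal illustration, assuming pitch-classes are represented as exact rationals mod 12 and that "minimal path" means the shortest signed displacement on the pitch-class circle; the names signed_disp and invariant_pair are hypothetical.

```python
from fractions import Fraction


def signed_disp(p, q):
    """Shortest signed displacement from pitch-class p to q, in [-6, 6)."""
    return (Fraction(q) - Fraction(p) + 6) % 12 - 6


def invariant_pair(ai, aj, x):
    """Move ai halfway toward x - aj and aj halfway toward x - ai.

    The two displacements coincide, since both equal the shortest signed
    representative of x - (ai + aj). Passing ai == aj gives Case 1.
    """
    di = signed_disp(ai, x - aj)
    dj = signed_disp(aj, x - ai)
    si = (Fraction(ai) + di / 2) % 12
    sj = (Fraction(aj) + dj / 2) % 12
    return si, sj


# Case 2 with ai = 0, aj = 7, x = 4: each note moves 3/2 semitones.
si, sj = invariant_pair(0, 7, 4)     # (21/2, 11/2)
# Case 1 with ai = 0, x = 4: the note moves 2 semitones to the fixed point 2.
fixed, _ = invariant_pair(0, 0, 4)
```

In the first example the inversion I4 sends 21/2 to 4 − 21/2 ≡ 11/2 (mod 12), so the resulting pair is exchanged by I4 as the argument requires.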
The term ⌊n/2⌋d appears once for even n, twice for odd n.
Case 1. O is a permutation σ. Since any permutation can be decomposed into
cycles, we simply apply Lemma 2.1 to obtain a voice-leading A→S that is no larger than
{d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}, with S invariant under σ.
Case 2. O is a nonzero transposition Tx. By Theorem 1, there exists a crossing-free
voice-leading A→Tx(A) whose displacement multiset consists of values less than d.
Any crossing-free voice-leading can be decomposed into cycles of the form:
Thus we can again apply Lemma 2.1 to obtain the desired voice-leading.
Case 3. O is an inversion Ix. By Theorem 1, there exists a crossing-free voice-leading
A→Ix(A) whose displacement multiset consists of values less than d. By Lemma
2.2, there exists a voice-leading A→S, such that S is invariant under Ix, and with
displacement multiset less than or equal to {d/2, d/2, …, d/2}. By the distribution
constraint, this multiset is less than or equal to {d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}.
THEOREM 3. Let A be any multiset of cardinality n. For all x, the minimal
bijective voice-leading between A and Tx(A) can be no smaller than the minimal
bijective voice-leading between E and Tx(E), where E divides pitch-class space
into n equal parts.
\[ (e_1, e_2, \ldots, e_n) \to (e_1 + c, e_2 + c, \ldots, e_n + c), \quad \text{where } c \text{ is any real number with } c \equiv_{12\mathbb{Z}/n} x \tag{5} \]
(NB: c is congruent to x mod 12Z/n, not mod 12Z.) Choose c so that |c| is as small as
possible. The displacement multiset corresponding to this voice-leading is {|c|, |c|, … ,
|c|}. The sum of the elements of this multiset is n|c|, where n|c| is the smallest positive
real number such that nc ≡12Z nx. By the distribution constraint, this multiset is as small
as any n-note multiset with the same or greater sum.
Now consider any bijective voice-leading between representatives of two n-note
transpositionally-equivalent chords A and Tx(A). Let ΣA refer to the sum of the
components of A. Therefore,
\[ \Sigma(T_x(A) - A) \equiv_{12\mathbb{Z}} nx \tag{6} \]
The real number Σ(Tx(A) – A) is the sum of signed quantities; the sum of the absolute
values of these quantities must therefore be greater than or equal to n|c|, where n|c| is the
smallest positive number such that nc ≡12Z nx. Thus the elements of the displacement
multiset associated with the voice-leading A→Tx(A) sum to at least n|c|. We conclude
that this voice-leading can be no smaller than the minimal voice-leading between E and
Tx(E).
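The lower bound n|c| can be illustrated for three-note chords in twelve-tone equal temperament. The sketch below is a brute-force check under the assumption that size is the sum of the displacements; the function names are hypothetical, and even_chord_size simply computes n|c| with c the representative of x mod 12/n that is nearest zero.

```python
from fractions import Fraction
from itertools import permutations


def pc_norm(x):
    """Norm |x|_12Z: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def min_size(a, b):
    """Smallest total displacement over all bijective voice-leadings a -> b."""
    return min(
        sum(pc_norm(q - p) for p, q in zip(a, perm))
        for perm in permutations(b)
    )


def transpose(chord, x):
    return [(p + x) % 12 for p in chord]


def even_chord_size(n, x):
    """n|c|, where c is congruent to x mod 12/n and |c| is as small as possible."""
    step = Fraction(12, n)
    r = Fraction(x) % step
    return n * min(r, step - r)


E = [0, 4, 8]   # divides the octave into three equal parts
A = [0, 4, 7]   # C major
lower_bound = even_chord_size(3, 5)        # c = 1, so n|c| = 3
size_E = min_size(E, transpose(E, 5))      # attains the bound
size_A = min_size(A, transpose(A, 5))      # cannot fall below it
```

For x = 5 both chords happen to give size 3; for x = 4 the augmented triad maps onto itself (size 0) while the major triad needs size 2.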
There is a useful corollary to Theorem 3 that applies in the discrete case.
COROLLARY. Let Ek (the “chromatic scale”) divide pitch-class space into k >
n equal parts, let A be any n-note subset of Ek, and let M be the “maximally even”
n-note subset of Ek (S8). Then, for any integer i, the minimal bijective voice-
leading between A and T12i/k(A) can be no smaller than the minimal bijective
voice-leading between M and T12i/k(M).
The proof follows the same basic outlines as the proof of Theorem 3. We rely on the fact
that M divides any number of octaves into nearly even parts: given M = (m0, m1, …, mn-1)
and some constant integer c, the distances |mc+i – mi|12Z (subscript arithmetic mod n)
come in “consecutive integer sizes” when measured in units of 12/k (S8). That is, for
every integer c there exists an integer j such that each of the distances |mc+i – mi|12Z is equal to
either 12j/k or 12(j+1)/k. This allows us to find a voice-leading M→T12i/k(M) that is as small as
possible for n-note subsets of Ek. As before, we use the “cyclical” component of the
voice-leading mi→mc+i to neutralize the “transpositional” component of the voice-leading
mi→mi + x.
Now for the formalities. By the argument given above, the minimal voice-leading
A→T12i/k(A) has a displacement multiset whose sum is at least n|c|, where n|c| is the
smallest positive number such that nc ≡12Z 12in/k. What needs to be shown is that there is
a voice-leading M→T12i/k(M), with a displacement multiset summing to n|c|, whose
values are as evenly distributed as possible. Since our voice-leadings are required to
connect subsets of Ek, we can establish maximally-even distribution by showing that the
values of the displacement multiset take on just two distinct values: 12r/k and 12(r+1)/k,
where r is some nonnegative integer.
Let (m0, m1, …, mn-1) order the elements of M in ascending numerical order; form
the infinite sequence $S = \{m_{(j \bmod n)} + 12\lfloor j/n \rfloor\}_{j=-\infty}^{\infty}$. (Again, “⌊x⌋” refers to the greatest
integer ≤ x.) S consists of all of the elements of R congruent mod 12Z to elements of M.
This sequence is ordered in ascending numerical order and indexed such that S-1 = mn-1 –
12, S0 = m0, S1 = m1, and so on. The voice-leadings
with elements summing to nc, where n|c| is the smallest positive number such that nc ≡12Z
nx. When x and Sa+i – mi are both integer multiples of 12/k, the values of this n-tuple are
either constant or can be expressed in the form 12r/k and 12(r+1)/k, where r is some
integer. These values will either be all nonnegative or all nonpositive. The sum of the
elements of this voice-leading’s displacement multiset will therefore be n|c|. The
displacement multiset will contain just two distinct values, 12|r|/k and 12|r+1|/k. This
implies that the displacement multiset is as evenly-distributed as possible, given the
hypothesis that the voice-leading connects subsets of Ek.
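The “consecutive integer sizes” property that the proof relies on can be checked directly. The sketch below is an illustration under assumptions: it builds a maximally even set with the floor formula mi = ⌊ik/n⌋ (one standard construction, not necessarily the one intended in S8), measures ascending intervals in chromatic steps of 12/k rather than with the norm, and uses hypothetical function names.

```python
def maximally_even(n, k):
    """An n-note maximally even subset of the k-note chromatic scale,
    built with the floor formula m_i = floor(i * k / n)."""
    return [(i * k) // n for i in range(n)]


def interval_sizes(m, c, k):
    """Distinct ascending intervals m_(i+c) - m_i, mod k, in chromatic steps
    (subscript arithmetic mod n)."""
    n = len(m)
    return sorted({(m[(i + c) % n] - m[i]) % k for i in range(n)})


diatonic = maximally_even(7, 12)     # [0, 1, 3, 5, 6, 8, 10]
pentatonic = maximally_even(5, 12)   # [0, 2, 4, 7, 9]
thirds = interval_sizes(diatonic, 2, 12)   # [3, 4]
```

For every offset c the intervals take at most two values, and those two values differ by one chromatic step, which is what lets the displacement multiset be distributed as evenly as the lattice Ek allows.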
SYMBOL OR TERM
    DEFINITION
multiset
    A set in which duplications are permitted. Like sets, multisets are unordered.
{a, b, c}
    A multiset with elements a, b, c.
(a, b, c)
    An ordered list. (a, b, c) and (b, c, a) are not the same.
⌊x⌋
    The greatest integer ≤ x.
R
    The real numbers.
Z
    The integers.
nZ, where n is a real number
    The set {ni | i ∈ Z}. Thus 12Z is the set {…, -24, -12, 0, 12, 24, …}, whose elements form a group under addition.
mZn, where m is real and n is an integer
    The set of ordered n-tuples (x1, x2, …, xn) such that each xi ∈ mZ. This set forms a group under vector addition.
A/G, where G is some group of transformations acting on the elements of A
    The quotient space that identifies all points a and ga, where a ∈ A and g ∈ G.
R/12Z
    The circular quotient space in which all real numbers x and x + 12 have been identified. The group 12Z acts by ordinary addition, so that every point x has orbit {…, x – 36, x – 24, x – 12, x, x + 12, x + 24, x + 36, …}.
a ≡nZ b
    Pitch class a is congruent to b mod nZ. Thus there exists an integer c such that a = b + cn.
|a|12Z
    The norm of a pitch-class a. The smallest real number |x| such that x ≡12Z a.
(a1, a2, …, an) ≡12Z (b1, b2, …, bn)
    For all i, ai ≡12Z bi.
Table S1. A glossary of mathematical terms and symbols used in the article.
SYMBOL OR TERM
    DEFINITION
pitch
    Pitch is a fundamental attribute of musical notes. Pitches are typically represented by real numbers such that middle C is 60, the octave has length 12, and semitones have size 1.
pitch-class
    An equivalence class of pitches, consisting of all pitches separated by an integral number of octaves. A220 and A440 both are instances of the same pitch-class A. Pitch-classes can be represented by elements of the quotient space R/12Z.
chord
    A multiset of pitch-classes. It is also possible to consider chords of pitches, which are simply multisets of real numbers.
transposition
    Translation in pitch or pitch-class space. In both pitch and pitch-class space, transposition corresponds to addition by a constant value. If a is a pitch or pitch-class then a + x is the transposition of a by x semitones.
Tx(A)
    The transposition of the chord A by x semitones.
inversion
    Reflection in pitch or pitch-class space. In both pitch and pitch-class space, inversion corresponds to subtraction from a constant value. If a is a pitch or pitch-class, then x – a is an inversion of a. The quantity “x” is called the index number of the inversion.
Ix(A)
    The inversion of chord A with index number x.
voice-leading
    A voice-leading between two multisets {a1, a2, …, am} and {b1, b2, …, bn} is a multiset of ordered pairs (ai, bj), such that every element of each chord is in some pair.
trivial voice-leading
    A trivial voice-leading contains only pairs of the form (x, x).
Table S2. A glossary of musical terms and symbols used in the article.
[Figure S1: circle-of-fifths diagram of the twelve diatonic collections]
Figure S1. The circle of fifths can be interpreted as depicting minimal voice-leadings
between diatonic collections (major scales). Each diatonic collection can be
transformed into its neighbors by voice-leading in which one pitch-class moves by
semitone. For example, the C major scale, containing pitch-classes 0, 2, 4, 5, 7, 9,
and 11 (= e) can be transformed into the G major scale (containing pitch-classes 0, 2, 4,
6, 7, 9, and 11) by moving the pitch class 5 (F) to 6 (Fs). Here as elsewhere, the
letters “t” and “e” refer to the numbers 10 and 11, respectively.
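The claim in the caption can be verified with a few lines of Python. This is a minimal sketch: the diatonic collection is represented as a set of integers mod 12, moving one step around the circle of fifths is modeled as transposition by 7 semitones, and the function names are hypothetical.

```python
C_MAJOR = frozenset({0, 2, 4, 5, 7, 9, 11})


def transpose(collection, x):
    """Transpose a set of pitch-classes by x semitones."""
    return frozenset((p + x) % 12 for p in collection)


def single_note_move(scale, neighbor):
    """Return (old, new) if the two collections differ in exactly one
    pitch-class, otherwise None."""
    removed = scale - neighbor
    added = neighbor - scale
    if len(removed) == 1 and len(added) == 1:
        return next(iter(removed)), next(iter(added))
    return None


G_MAJOR = transpose(C_MAJOR, 7)
move = single_note_move(C_MAJOR, G_MAJOR)   # (5, 6): F moves to F sharp
```

Repeating the check for all twelve transpositions shows the same pattern at every step of the circle: one pitch-class rises by a semitone and the other six are held in common.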
[Figure S2: Tonnetz graph of the major and minor triads]
Figure S2. The Tonnetz. Nineteenth-century theorists such as Hostinsky, Oettingen, and Riemann
explored a geometrical figure that is the “geometrical dual” of the one shown here. The graph displays
efficient voice-leadings among the 24 familiar major and minor triads. Triads connected by horizontal
lines share both “root” and “fifth,” and can be connected by voice-leading in which one note moves by
one semitone. (For example, the C-major triad can be transformed into a C-minor triad by changing
E to Ef.) Triads along the NE/SW diagonal also share two notes and can be connected by single-
semitone voice-leading. (For example, the C-major triad can be transformed into an E-minor triad by
changing C to B.) Triads along a NW/SE diagonal share two notes and can be connected by
voice-leading in which one note moves by two semitones. (For example, the C-major triad can
be transformed into an A-minor triad by changing G to A.) Topologically, the figure is a 2-torus.
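Column labels: 4, 8, 11, 3, 4. Row labels: 4, 7, 11, 0, 4.
Row 1 (note 4):
    Column 1 (note 4): (4)→(4). Size: 0
    Column 2 (note 8): (4, 4)→(4, 8). Size: 4
    Column 3 (note 11): (4, 4, 4)→(4, 8, 11). Size: 9
    Column 4 (note 3): (4, 4, 4, 4)→(4, 8, 11, 3). Size: 10
    Column 5 (note 4): (4, 4, 4, 4, 4)→(4, 8, 11, 3, 4). Size: 10
Row 2 (note 7):
    Column 1 (note 4): (4, 7)→(4, 4). Size: 3
    Column 2 (note 8): (4, 7)→(4, 8). Size: 1
    Column 3 (note 11): (4, 7, 7)→(4, 8, 11). Size: 5
    Column 4 (note 3): (4, 7, 7, 7)→(4, 8, 11, 3). Size: 9
    Column 5 (note 4): (4, 7, 7, 7, 7)→(4, 8, 11, 3, 4). Size: 12
Row 3 (note 11):
    Column 1 (note 4): (4, 7, 11)→(4, 4, 4). Size: 8
    Column 2 (note 8): (4, 7, 11)→(4, 8, 8). Size: 4
    Column 3 (note 11): (4, 7, 11)→(4, 8, 11). Size: 1
    Column 4 (note 3): (4, 7, 11, 11)→(4, 8, 11, 3). Size: 5
    Column 5 (note 4): (4, 7, 11, 11, 11)→(4, 8, 11, 3, 4). Size: 10
Row 4 (note 0):
    Column 1 (note 4): (4, 7, 11, 0)→(4, 4, 4, 4). Size: 12
    Column 2 (note 8): (4, 7, 11, 0)→(4, 8, 8, 8). Size: 8
    Column 3 (note 11): (4, 7, 11, 0)→(4, 8, 11, 11). Size: 2
    Column 4 (note 3): (4, 7, 11, 0)→(4, 8, 11, 3). Size: 4
    Column 5 (note 4): (4, 7, 11, 0, 0)→(4, 8, 11, 3, 4). Size: 8
Row 5 (note 4):
    Column 1 (note 4): (4, 7, 11, 0, 4)→(4, 4, 4, 4, 4). Size: 12
    Column 2 (note 8): (4, 7, 11, 0, 4)→(4, 8, 8, 8, 8). Size: 12
    Column 3 (note 11): (4, 7, 11, 0, 4)→(4, 8, 11, 11, 11). Size: 7
    Column 4 (note 3): (4, 7, 11, 0, 4)→(4, 8, 11, 11, 3). Size: 3
    Column 5 (note 4): (4, 7, 11, 0, 4, 4)→(4, 8, 11, 11, 3, 4). Size: 3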
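The sizes in this table are consistent with a simple dynamic-programming recurrence, sketched below in Python. The recurrence is an inference from the table's entries rather than a statement of the authors' algorithm: each cell is assumed to add the distance |bj – ai|12Z between its row note and column note to the smallest of the cells above, to the left, and diagonally above-left, with size measured as a sum of displacements. The function names are hypothetical.

```python
def pc_norm(x):
    """Norm |x|_12Z: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def voice_leading_table(rows, cols):
    """Fill a table of voice-leading sizes, one cell per (row note, column note).

    Assumed recurrence: cell = distance(row note, column note)
    + min(cell above, cell to the left, cell diagonally above-left).
    """
    table = [[0] * len(cols) for _ in rows]
    for i, a in enumerate(rows):
        for j, b in enumerate(cols):
            previous = []
            if i > 0:
                previous.append(table[i - 1][j])
            if j > 0:
                previous.append(table[i][j - 1])
            if i > 0 and j > 0:
                previous.append(table[i - 1][j - 1])
            table[i][j] = pc_norm(b - a) + (min(previous) if previous else 0)
    return table


sizes = voice_leading_table([4, 7, 11, 0, 4], [4, 8, 11, 3, 4])
final_size = sizes[-1][-1]   # bottom-right cell of the table
```

With the row and column labels above, this recurrence reproduces all twenty-five sizes listed in the table, including the final value 3.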
B
[00] [10] [20] [30] [40] [50] [60] [70] [80] [90] [t0] [e0] [00]
0e 1e 2e 3e 4e 5e 6e 7e 8e 9e te ee [0e]
0t 1t 2t 3t 4t 5t 6t 7t 8t 9t tt et [0t]
09 19 29 39 49 59 69 79 89 99 t9 e9 [09]
08 18 28 38 48 58 68 78 88 98 t8 e8 [08]
07 17 27 37 47 57 67 77 87 97 t7 e7 [07]
06 16 26 36 46 56 66 76 86 96 t6 e6 [06]
05 15 25 35 45 55 65 75 85 95 t5 e5 [05]
04 14 24 34 44 54 64 74 84 94 t4 e4 [04]
03 13 23 33 43 53 63 73 83 93 t3 e3 [03]
02 12 22 32 42 52 62 72 82 92 t2 e2 [02]
01 11 21 31 41 51 61 71 81 91 t1 e1 [01]
00 10 20 30 40 50 60 70 80 90 t0 e0 [00]
A
Figure S6. Ordered dyad-space is a 2-torus. To identify points (a, b) and (b, a), we need to “fold”
the torus along the AB diagonal. The result of this operation is shown in Figure S7.
B
[00]
ee [e0]
tt te [t0]
99 9t 9e [90]
88 89 8t 8e [80]
D 77 78 79 7t 7e [70]
66 67 68 69 6t 6e [60]
55 56 57 58 59 5t 5e [50]
44 45 46 47 48 49 4t 4e [40]
33 34 35 36 37 38 39 3t 3e [30]
22 23 24 25 26 27 28 29 2t 2e [20]
11 12 13 14 15 16 17 18 19 1t 1e [10]
00 01 02 03 04 05 06 07 08 09 0t 0e [00]
A C
Figure S7. The result of “folding” the 2-torus in Figure S6 along its diagonal AB. The resulting figure
is a triangle with two of its sides identified, which is a Möbius strip. To transform Figure S7 into
a more familiar representation of a Möbius strip, cut the figure along the line CD and glue AC to CB.
(To make this identification in Euclidean 3-space, you will need to turn over one of the pieces of paper.)
The result is a “square” with opposite sides identified, as in Figure 2 of the main paper.
[Figure S8: diagram of the notes a0, a1, …, a6 and the intervals d0, d1, …, d6 = W]
Figure S8. The cyclical voice-leading (a0, a1, a2, a3, a4, a5, a6)→(a1, a2, a3, a4, a5, a6, a0) has displacement
multiset {d0, d1, d2, d3, d4, d5, d6 = W}. By moving at most three notes by |x – di| semitones, we can make any
of the di = x without changing the other dn ≠ W. That is, to change d0, we need only move a0; to change
d5 we need only move a6; to change d1 we need only move a0 and a1; and so on. In the case of an arbitrary
cyclical voice-leading, we never need to move more than half of a chordʼs notes by |x – di| semitones to “fix”
any interval.