Distributed Algorithm For Computing Persistent Homology
Distributed Algorithm For Computing Persistent Homology
https://doi.org/10.1007/s00454-023-00549-2
Álvaro Torras-Casas1
Abstract
We set up the theory for a distributed algorithm for computing persistent homology.
For this purpose we develop linear algebra of persistence modules. We present bases of
persistence modules, together with an operation that leads to a method for obtaining
images, kernels and cokernels of tame persistence morphisms. Our focus is on devel-
oping efficient methods for the computation of homology of chains of persistence
modules. Later we give a brief, self-contained presentation of the Mayer–Vietoris
spectral sequence. Then we study the Persistent Mayer–Vietoris spectral sequence
and present a solution to the extension problem. This solution is given by finding
coefficients that indicate gluings between bars on the same dimension. Finally, we
review PerMaViss, an algorithm that computes all pages in the spectral sequence
and solves the extension problem. This procedure distributes computations on sub-
complexes, while focusing on merging homological information. Additionally, some
computational bounds are found which confirm the distribution of the method.
1 Introduction
1.1 Motivation
Persistent homology has existed for about two decades [15]. This tool introduced the
field of Topological Data Analysis which, very soon, was applied to a multitude of
Álvaro Torras-Casas
[email protected]
1 School of Mathematics, Cardiff University, Abacws, Senghennydd Road, Cardiff CF24 4AG, Wales,
UK
123
Discrete & Computational Geometry (2023) 70:580–619 581
problems, see [5, 16]. Among others, persistent homology has been applied to study
the geometric structure of sets of points lying in Rn [12, 15], coverage in sensor net-
works [25], pattern detection [23], classification and recovery of signals [24] and it
has also had an impact on shape recognition using machine learning techniques, see
[1, 13]. All these applications motivate the need for fast algorithms for computing
persistent homology. The usual algorithm used for these computations was introduced
in [15], with some later additions to speed up such as those of [7, 8, 26]. In [21] per-
sistent homology is proven to be computable in matrix multiplication time. However,
since these matrices become large very quickly, the computations are generally very
expensive, both in terms of computational time and in memory required.
In practice, computing the persistent homology of a given filtered complex is
equivalent to computing its matrices of differentials and perform successive Gaus-
sian eliminations; see [14, 15]. In recent years, some methods have been developed
for the parallelization of persistent homology. The first approach was introduced in
[14] as the spectral sequence algorithm, and was successfully implemented in [2].
This consists in dividing the original matrix M into groups of rows, and sending these
to different processors. These processors will, in turn, perform a local Gaussian Elimi-
nation and share the necessary information between them, see [2]. On the other hand, a
more topological approach is presented in [18]. It uses the blow-up complex introduced
in [33]. This approach first takes a cover C of a filtered simplicial complex K , and
uses the result that the persistent homology of K is isomorphic to that of the blow-up
complex K C . This proceeds by computing the sparsified persistent homology for each
cover, and then using this information to reduce the differential of K C efficiently. Both
of these parallelization methods have provided substantial speedups compared to the
standard method presented in [15].
The relation between homology classes and a cover is known as Localized Homol-
ogy, which was introduced in [33]. It would be useful to have a method that leads to
the speedups from [2, 18], while keeping Localized Homology information. Further,
such covers should have no restrictions, such as those used in the mapper algorithm,
see [27]. This last point limits substantially the use of the blowup-complex, since the
number of simplices increases very quickly when we allow the intersections to grow.
In fact, in the extreme case where a complex K is covered by n copies of K , the
blowup complex K C has size 2n |K |.
123
582 Discrete & Computational Geometry (2023) 70:580–619
of cohomology groups in [11], and recently in [31] and [32] spectral sequences are
used for distributing persistent homology computations. However, all of [11, 31, 32]
assume that the nerve of the cover is one-dimensional.
The first problem when dealing with spectral sequences is that we need to be able to
compute images, kernels and quotients as well as their representatives. This question
has already been studied in [10], where the authors give a very efficient algorithm.
However, there are couple of problems that come up when using [10] in spectral
sequences. First, in [10] the authors consider a persistence morphism induced by a
simplicial morphism f : X → Y . However, when working with the Mayer–Vietoris
spectral sequence we consider maps in the second, third and higher pages which are
not induced by a simplicial morphism at all. Furthermore, even when working with
the first page differentials, we cannot adapt the work from [10]; as a simplex from an
intersection is sent to several copies along lower degree intersections. Second, a key
assumption in [10] is that the filtrations in X and Y are both general in the sense that
a simplex in either X or Y is born at a time. However, in spectral sequences generality
hardly ever holds. Indeed, this follows from the fact that a simplex might be contained
in various overlapping covers.
Thus, if we want to compute images, kernels and cokernels, we will need to be
able to overcome these two difficulties first. Also, notice that a good solution should
lead to the representatives, as these are needed for the spectral sequence. It is worth
to mention that in [28] such images, kernels and cokernels where studied in terms
of Smith Normal Forms of presentations associated to persistence modules. Further,
as mentioned in [28], such work was developed with the aim of computing persis-
tence spectral sequences [19] in mind. However, adapting [28] to an algorithm which
computes spectral sequences is still a challenge.
The other difficulty with spectral sequences is the extension problem, which we
explain in Sect. 5.1. Within the context of persistent homology, the extension problem
first appeared in [17, Sect. 6]. There the authors give an approximate result that holds
in the case of acyclic coverings. This allows them to compare the persistent homology
to the lower row of the infinity page in the spectral sequence. This leads to an ε-
interleaving between the global persistent homology and that of the filtered nerve.
Later, the extension problem appeared in the PhD thesis of Yoon [31], and also in
the recent joint work with Ghrist [32]. In Sect. 4.2.3 from Yoon’s Thesis, there is a
detailed solution for the extension problem in the case when the nerve of the cover is
one-dimensional.
In this paper, we set the theoretical foundations for a distributed method on the input
data. In order to do this, we use the algebraic power of the Mayer–Vietoris spectral
sequence. Since the aim is to build up an explicit algorithm, we need to develop
linear algebra of persistence modules, as done through Sect. 3. There, we define
barcode bases and also we develop an operation that allows to determine whether
a set of barcode vectors is linearly independent or not. Using this machinery, we are
able to encapsulate all the information related to a persistence morphism in a matrix
123
Discrete & Computational Geometry (2023) 70:580–619 583
that depends on the choice of two barcode bases. This allows defining a Gaussian
elimination outlined in the box_gauss_reduce algorithm—see Algorithm 1—
which addresses the two issues raised above with regards to [10].
Next in Sect. 4, we give a detailed review of the Mayer–Vietoris spectral sequence
in the homology case. This is followed by Sect. 5.1, where we give a solution to the
extension problem. The solution is given by a careful consideration of the total complex
homology, together with the use of barcode basis machinery developed in Sect. 3.
In Sect. 5.2 we introduce PerMaViss, an algorithm for computing the persistence
Mayer–Vietoris spectral sequence and solving the extension problem. The advantage
of this procedure is that all the simplicial information is enclosed within local matrices.
This has one powerful consequence; this method consists in computing local Gaussian
eliminations plus computing image_kernel on matrices whose order is that of
homology classes. In particular, given enough processors and a ‘good’ cover of our
data, one has that the complexity is about
O(X 3 ) + O(H 3 ),
where X is the order of the maximal local complex and H is the overall number of
nontrivial persistence bars on the whole dataset.1 For more details on this, we refer
the reader to Sect. 5.4.
By using the ideas in this text we developed PerMaViss, a Python library that com-
putes the Persistence Mayer–Vietoris spectral sequence. In the results from [29], one
can see that nontrivial higher differentials come up and also the extension problem can-
not be solved in a trivial way in general. This supports the idea that the spectral sequence
adds more information on top of persistent homology. Finally, we outline future direc-
tions, both for the study of the Persistence Mayer–Vietoris spectral sequence and future
versions of PerMaViss.
2 Preliminaries
Definition 2.1 Given a set X , a simplicial complex K is a subset of the power set
K ⊆ P(X ) such that if σ ∈ K , then for all subsets τ ⊆ σ we have that τ ∈ K .
An element σ ∈ K will be called an n-simplex whenever |σ | = n + 1, whereas a
subset τ ⊆ σ will be called a face. Thus, if a simplex is contained in K all its faces
must also be contained in K . Given a simplicial complex K , we denote by K n the set
containing all the n-simplices from K . Given a pair of simplicial complexes K and L,
if L ⊆ K , then we say that L is a subcomplex of K . Also, given a mapping f : K → L
between two simplicial complexes K and L, we call f a simplicial morphism whenever
1 Notice that in a filtration indexed by integers where we introduce a simplex at a time there are no trivial
bars and the number of these is about (# simplices)/2. Here we are referring to the case where the filtration
is over the real numbers; where many simplices are introduced at the same time. In this case the number of
nontrivial persistent homology bars should be much smaller than the number of simplices.
123
584 Discrete & Computational Geometry (2023) 70:580–619
n
f (K n ) ⊆ l=0 L l for all n ≥ 0. The category composed of simplicial complexes and
simplicial morphisms will be denoted by SpCpx.
We represent the simplices from K as equivalence classes of tuples from X which are
equal up to even permutations, see [22, Sect. 5]. Let F be a field. For each n ≥ 0
we define the free vector space Sn (K ) := F[K n ]. We also consider linear maps
dn : Sn (K ) → Sn−1 (K ) usually called differentials, defined by
n
dn ([v0 , . . . , vn ]) = (−1)i [v0 , . . . , vi−1 , vi+1 , . . . , vn ]. (1)
i=0
0 d1 d2 d3
0 S0 (K ) S1 (K ) S2 (K ) ... (2)
It follows from formula (1) that the composition of two consecutive differentials van-
ishes: dn ◦ dn−1 = 0 for all n ≥ 0. In this case we say that (2) is a chain complex. As
a consequence, we have that Im(dn+1 ) ⊆ Ker(dn ), and we can define the homology
with coefficients in F to be Hn (K ; F) = Ker(dn )/Im(dn+1 ) for all n ≥ 0. In general,
F will be understood by the context and the notation Hn (K ) might be used instead.
On the other hand, we consider the augmentation map ε : S0 (K ) → F defined by
the assignment s → 1F , for any simplex s ∈ S0 (K ). Then, we define the reduced
homology by H0 (K ; F) = Ker(ε)/Im(d1 ) and Hn (K ; F) = Hn (K ; F) for all n > 0.
Consider S∗ (K ), obtained by augmenting (2) by ε and a copy of F in degree −1:
0 ε d1 d2 d3
0 F S0 (K ) S1 (K ) S2 (K ) ...
0 ε d1
0 F S0 (Δm ) S1 (Δm )
d2 d3 dn
S2 (Δm ) ... Sn (Δm ) 0.
N U = {σ : Uσ = ∅} ⊆ Δm .
123
Discrete & Computational Geometry (2023) 70:580–619 585
Definition 2.4 (Čech chain complex) Let K be a simplicial complex and let U =
{Ui }i=0
m be a cover of K . Given s ∈ K , there exists σ (s) ∈ N U with maximal
cardinality |σ (s)|, so that s ∈ Uσ (s) . Then, we define the (n, U)-Čech chain complex
by
|σ (s)|
Č∗ (n, U; F) = f ∗σ (s)
S∗ Δ .
s∈K n
For k ≥ −1, we use the notation (τ )s with s ∈ K n and τ ∈ Sk (Δ|σ (s)| ), to denote an
element in Čk (n, U; F) that is zero everywhere except for τ in the component indexed
by s. Then the image of the k-Čech differential is defined by the assignment δ̌kU ((τ )s ) =
U U
(d N τ )s , where d N denotes the kth differential of
k k S∗ (N U ). By definition, the Čech
complex is a chain complex and is exact. Also, Č−1 (n, U; F) Sn (K ) follows easily.
On the other hand, for each k ≥ 0 we define an isomorphism
ψk : Čk (n, U; F) Sn (Uσ )
σ ∈NkU
σ (s)
by sending (τ )s to (s)τ for any pair of simplices s ∈ K n and τ ∈ f k Δ|σ (s)| . In
particular, we can rewrite the (n, U)-Čech chain complex as a sequence
δ0 δ1 δ2
0 Sn (K ) Sn (Uσ ) Sn (Uσ ) Sn (Uσ ) . . . (3)
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2
where, for any pair σ ∈ NkU and s ∈ (Uσ )n , the differentials δi are defined as follows:
U U
δk ((s)σ ) = ψk ◦ δ̌k ◦ ψk−1 ((s)σ ) = ψk dkN σ s = dkN (σ ) τ · s τ ∈N U ,
k−1
U U U .
and where {dkN (σ )}τ ∈ F is the coefficient of dkN (σ ) in the simplex τ ∈ Nk−1
Remark 2.5 Alternatively, the Čech chain complex can be defined straight away as the
sequence (3), and prove exactness by using cosheaf theory. Namely, given a simplicial
complex K , we consider the topology where the open sets are subcomplexes. Then,
for each integer n ≥ 0, consider the simplicial precosheaf as an assignment Sn : V →
Sn (V ) for each subcomplex V ⊆ K . This precosheaf is in fact a flabby cosheaf. Then,
using 2.5, 4.3, and 4.4 from [4, Sect. VI], one has exactness of the Čech chain complex.
123
586 Discrete & Computational Geometry (2023) 70:580–619
Let R be the category of real numbers as a poset, where homR (s, t) contains a single
morphism whenever s ≤ t, and is empty otherwise. Let F be a field and let Vect denote
the category of F-vector spaces.
V(s≤t)
Vs Vt
fs ft
W(s≤t)
Ws Wt .
Example 2.8 Let s ≤ t from R. We define the interval module F[s,t) by F[s,t) (r ) = F
for all r ∈ [s, t) and F[s,t) (r ) = 0 otherwise. The morphisms F[s,t) (a ≤ b) are the
identity for any two a, b ∈ [s, t) and 0 otherwise.
Given F[s,t) , s and t are the birth and death values respectively. If V(r ) is finite
dimensional for all r ∈ R, then there is an isomorphism V i∈J F[si ,ti ) , as shown
in [6]. This is the barcode decomposition of V.
123
Discrete & Computational Geometry (2023) 70:580–619 587
In this section, we introduce barcode bases and the operation . This framework
allows to introduce the matrix associated to a persistence morphism f : V → W and
a choice of bases. This theory is applied to develop algorithms for computing images
and kernels of persistence morphisms as well as quotients of persistence modules.
In addition, we evaluate the respective computational complexities. Additionally, we
illustrate how to compute homology in the category of persistence modules.
Definition 3.1 (barcode basis) A barcode basis A of a tame persistence module V
N
is a choice of an isomorphism, α : i=1 F[ai ,bi ) → V. A direct summand of α is a
restricted morphism αi : F[ai ,bi ) → V which we call a barcode generator. Often, we
denote a barcode basis A by the set of barcode generators A = {αi }i=1 N .
Ar = {αi : 1 ≤ i ≤ N , αi (r ) = 0}.
all r ∈ R.
Proof Since Vect F is an abelian category and R is a small category, α is an isomorphism
if and only if α(r ) is an isomorphism for all r ∈ R. That is, the kernel Ker(α) →
N
i=1 F[ai ,bi ) vanishes iff Ker(α)(r ) = 0 for all r ∈ R. A similar argument is done
for surjectivity. Then, α(r ) is an isomorphism iff Ar (1F ) is a base for V(r ) and the
result follows.
Next, we use barcode bases to understand persistence morphisms f : V → W. In
particular, fixing a pair of barcode bases A and B for V and W respectively, we show
that there is a unique matrix F associated to f , see Corollary 3.12. However, this
makes no sense unless we are able to perform additions on barcode generators, which,
as shown in the following example, cannot be done in a straight-away manner.
123
588 Discrete & Computational Geometry (2023) 70:580–619
Example 3.3 Consider V F[0,2) ⊕ F[1,3) together with the canonical basis A given
by generators α1 ∼ [0, 2) and α2 ∼ [1, 3). Then, it is not possible to add α1 and α2
since at some points the domains differ; for example at filtration value 0 we have that
α1 (0) has F as domain but α2 (0) has 0 as domain.
To fix this, consider the following set of pairs:
which sends ((γ , (aγ , bγ )), (τ, (aτ , bτ ))) to the pair (γ τ, (max (aγ , aτ ), Bγ τ ))
where we set
min (bγ , bτ ) if bγ = bτ ,
Sγ τ =
sup {r ∈ [max(aγ , aτ ), bγ ) : γ (r ) + τ (r ) = 0} if bγ = bτ ,
and
max(bγ , bτ ) if bγ = bτ ,
Bγ τ =
Sγ τ if bγ = bτ .
One can check that γ τ : F[max(aγ ,aτ ),Bγ τ ) → V is a well-defined persistence mor-
phism.
By definition, is commutative and γ τ (r ) = 0 iff r ∈ [max(aγ , aτ ), Bγ τ ). For
brevity, given (γ , (aγ , bγ )) ∈ PVect(V), we refer only to the first component γ .
By abuse of notation, we say “given a persistence vector γ ∈ PVect(V)” or “given
a subset of persistence vectors S ⊆ PVect(V)”. Also, elements Z r ∈ Z behave
nontrivially with respect to ; for example, given a persistence vector γ ∼ [aγ , bγ )
and considering c > bγ we have that γ Z c = Z c .
123
Discrete & Computational Geometry (2023) 70:580–619 589
for all r ∈ R with A ≤ r . Since L(r ) = 0 iff r ∈ [A, B L ) and R(r ) = 0 iff r ∈ [A, B R )
we must have B L = B R and the equality L = R holds.
Now we are ready to introduce the key for characterizing barcode bases.
Example 3.8 Suppose that {α1 ∼ [0, 2), α2 ∼ [0, 1)} is a barcode basis of V
F[0,2) ⊕ F[0,1) . Then, {α1 , α2 } is linearly independent as will follow from Propo-
sition 3.11. On the contrary, α1 and α1 α2 are not linearly independent since
(−α1 ) (α1 α2 ) = α2 is associated to [0, 1) but −α1 ∼ [0, 2) and α1 α2 ∼ [0, 2).
Notice that λ and {1s }s∈R are compatible, in the sense that 1r (cγ ) = c1r (γ ) for
all γ ∈ PVect(V), all r ∈ R and all c ∈ F. Also, given γ , τ ∈ PVect(V) and
s, r ∈ R, it follows that 1s (γ
) 1r (τ ) = 1max(s,r ) (γ τ ). Thus, persistence vectors
on V correspond to a tuple PVect(V), , λ, {1s }s∈R .
Definition 3.10 Given T ⊆ PVect(V), we say that T generates PVect(V) iff for any
γ ∈ PVect(V)\Z, there exists S ⊆ T together with coefficients kγ ∈ F\{0} for all
γ ∈ S, and some s ∈ R such that
α = 1s k γ
γ ∈S
γ .
123
590 Discrete & Computational Geometry (2023) 70:580–619
It follows that S ⊂ Baα since adding elements from B \ Baα would have no effect or
would cut the startpoint to a value greater than aα . Also, notice that if β(bα ) = 0 then
β ∈ / S, since otherwise f would not be natural as a persistence morphism. Thus, S
must be a subset of
123
Discrete & Computational Geometry (2023) 70:580–619 591
Corollary 3.12 Let V and W be a pair of persistence modules together with their
respective barcode bases A and B. Given a persistence morphism f : V → W, there
is a unique associated matrix F = (kβ,α )β∈B,α∈A which is well defined in the sense
that whenever kβα = 0 then β ∈ B(α). Conversely, assume that F is well defined,
then there exists a unique persistence morphism f : V → W whose associated matrix
is F.
We end this section by introducing different orders of barcode bases. These orders are
important to introduce Gaussian eliminations in the barcode basis context.
Definition 3.13 Let V be a persistence module with barcode base A. Also, let αi , α j ∈
A with αi ∼ [ai , bi ) and α j ∼ [a j , b j ). We consider two orders in A:
The standard order: αi < α j if either ai < a j or ai = a j and bi > b j .
The endpoint order: αi < α j if either bi < b j or bi = b j and ai < a j .
If V is tame, it is straightforward to extend these orders to total orders for A.
Consider two finite barcode bases A = {αi }i=1 n and B = {β j }mj=1 for V and W,
respectively. Additionally, suppose that A is ordered according to the standard order
while B is ordered using the endpoint order. We assume such orders are total; e.g.,
even if αr , αs ∼ [a, b) for r = s, either αr < αs or αs < αr holds. Then, we consider
f (A)B = ( f (α1 ), . . . , f (αn )), the matrix of f in the bases A and B. In Sect. 3.3, we
transform f (A)B performing left to right column additions until obtaining a reduced
matrix, i.e., with unique column pivots,
n−1
IB = f (α1 ) f (α2 ) k2,1 f (α1 ) . . .
f (αn )
j=1
kn, j f (α j ) (4)
for suitable ki, j ∈ F and 1 ≤ j < i ≤ n. This IB has the property that its non-zero
columns form a basis I for Im( f ).
Definition 3.14 Given S ⊆ B, consider a vector V = (kβ )β∈B such that kβ = 0 iff
β ∈ S. The pivot of V is the greatest element from S in the endpoint order. We also
refer to the pivot of S or the pivot of β∈S kβ β.
123
592 Discrete & Computational Geometry (2023) 70:580–619
α1 α3 β3
5
α2 β2
4
β1
3
2
1
0
Ker(f ) V Im(f ) W
Fig. 1 Decomposition of barcodes in image, kernel, domain and codomain of f : V → W. The colors
correspond to the different generators associated to
I and K
Consider again the matrix IB from (4). By linearity, the jth column from I is
f α j i=1 k j,i αi ; thus, its preimage is p j = α j i=1 k j,i αi and we define the
j−1 j−1
with canonical barcode bases (α1 , α2 , α3 ) and (β1 , β2 , β3 ) ordered respectively using
the standard and endpoint orders. Let f : V → W be given by the |B| × |A| matrix
f (A)B :
⎛ ⎞ ⎛ ⎞
α1 α2 α3 α1 α2 (−α1 ) α3
⎜ β1 1 0 0 ⎟ ⎜ β1 1 −1 0 ⎟
f (A)B = ⎜
⎝ β2 1 1 1 ⎠
⎟ IB = ⎜ ⎝ β2 1
⎟ .
0 1⎠
β3 0 0 1 β3 0 0 1
Then notice that the first two columns from f (A)B share the same pivot β2 , while
the third’s column pivot is β3 . We subtract the first column to the second, leading to
123
Discrete & Computational Geometry (2023) 70:580–619 593
the matrix IB above which has unique pivots for each column. From IB we obtain
I = 11 (β1 β2 ), −11 (β1 ), 12 (β2 β3 ) ,
which leads to the barcode decomposition Im( f ) F[1,4) ⊕ F[1,3) ⊕ F[2,5) . At the
same time, we obtain a corresponding set of preimages PI = {α1 , α2 (−α1 ), α3 }
From this we deduce the set of kernel generators GK and order it by the standard
barcode order GK = {13 (α2 (−α1 )), 14 (α1 ), 15 (α3 )}. Thus, we consider the matrix
GKA for the kernels,where the rows correspond to the endpoint order on A, and reduce
it:
⎛ ⎞
13 (α2 (−α1 )) 14 (α1 ) 15 (α3 )
⎜ α2 1 0 0⎟
GKA = ⎜ ⎝ α1
⎟
−1 1 0⎠
α3 0 0 1
⎛ ⎞
13 (α2 (−α1 )) 14 (α2 ) 15 (α3 )
⎜ α2 1 1 0⎟
KA = ⎜ ⎝ α1
⎟
−1 0 0⎠
α3 0 0 1
Notice that the second and third columns from K are trivial since 14 (α2 ) = Z 4 and
15 (α3 ) = Z 5 . Since K contains a single nontrivial element we have obtained a basis
for the kernel K = {13 (α2 (−α1 ))}. Thus, ker( f ) F[3,5) .
123
594 Discrete & Computational Geometry (2023) 70:580–619
Algorithm 1 box_gauss_reduce
Input: A, B, f (A)B , where B follows the endpoint order
Output: I, PIA .
Id|A|
1: Set M = and store pivots of columns in a list called lpivots.
f (A)B
2: With the same order, store birth values of elements from A into lbirths.
3: With the same order, store death values of elements from B into ldeaths.
4: for p = |A| + |B| ≥ . . . ≥ |A| + 1, with p decreasing, do
5: Find all 1 ≤ i ≤ |A| such that lpivots[i] == p; store into a new ordered list called lpiv_idx.
6: while L = length(lpiv_idx) > 1 do
7: Find 1 ≤ m < L such that lbirths[lpiv_idx[m]] is minimal. For multiple choices, take the
smallest m.
8: for j = L ≥ . . . ≥ m + 1 do
9: idx ← lpiv_idx.pop()
10: lbirths[idx] ← max(lbirths[idx],
lbirths[lpiv_idx[m]])
M[ p, idx]
11: M[:, idx] ← M[:, idx] − M[:, lpiv_idx[m]]
M[ p, lpiv_idx[m]]
12: for k = |A| + 1 ≤ . . . ≤ p do
13: If ldeaths[k] <= lbirths[idx], set M[k, idx] ← 0.
14: end for
15: lpivots[idx] ← pivot from M[:, idx].
16: end for
17: end while
18: end for
PIA
19: Denote by PIA and IB the matrices such that M = .
IB
20: Compute S i = {β j ∈ B : IB [ j, i] = 0} for all 1 ≤ i ≤ |A|.
21: Set I = 1ai β ∈S i M[ j, i]β j 1≤i≤|A| , where ai = lbirths[i].
j
22: return I, PIA
and
Proposition 3.17 Algorithm 2 computes K I bases for the kernel and image of f .
2 Here we use the Numpy notation for matrices, where for a matrix M, the (i, j)-entry is denoted by
M[i,j] and the jth column is denoted by M[:,j]. Also, we use indexing starting at 1 instead of 0 to be
consistent with the rest of the article. In addition, we make use of the pop() function, which simultaneously
returns and deletes the last element from a list.
123
Discrete & Computational Geometry (2023) 70:580–619 595
Algorithm 2 image_kernel
Input: A, B, f (A)B ; A and B follow the startpoint and endpoint orders resp.
Output: K, I, P
I
1: I, PIA ← box_gauss_reduce (A, B, f (A)B ).
2: Compute T i = α j ∈ A : PIA [ j, i] = 0 for all 1 ≤ i ≤ |A|.
3: Set PI = α ∈T i PIA [ j, i]α j 1≤i≤|A| .
j
4: Set GK to contain all 1ci ( pi ) for all pi ∈ PI with f ( pi ) ∼ [ai , ci ).
5: Order GK with standard order, while A with the endpoint order.
6: Write the matrix of coordinates, GKA , of GK in terms of A.
7: K, _ ← box_gauss_reduce (GK, A, GKA )
8: We get rid of zero elements to obtain K and
I from K and I.
9:
return K, I and P I (where P I are the preimages of I taken from PI).
Proof First of all, by Proposition 3.16, we know that I and K are linearly independent,
as these are both sets of persistence vectors with different pivots. Thus, all we need to
show is that both sets generate PVect (Im( f )) and PVect (Ker( f )), respectively.
Let us prove that
I generates PVect (Im( f )). First, wewill show that PI generates
PVect(V). Consider γ ∈ PVect(V) and write γ = 1a i∈I ki αi for coefficients
ki ∈ F\{0} with i ∈ I for some subset I ⊆ {1, 2, . . . , |A|}. Then, consider the
maximum index m from I and compute γ = γ (−km pm ). Here it is key to recall
that the preimage pm ∈ PI iswritten as αm i=1 m−1
km,i αi for coefficients km,i ∈ F
for 1 ≤ i < m. Now, γ = 1a i∈J ki αi for coefficients ki ∈ F\{0} with i ∈ J for
some subset J ⊆ {1, 2, . . . , m − 1}. Repeating this argument, eventually, we write
γ in terms of GI. This implies that f (γ ) can be expressed in terms of I. Thus, I
generates PVect (Im( f )).
Now, let us show that K generates PVect (Ker( f )). In fact, it will be enough to show
that GK generates PVect (ker( f )). This is because K is obtained from reducing GKA in
a similar manner as I was obtained by reducing f (A)B . Consequently, by replicating
the argument which proved that PI generates PVect(V), it follows that K generates
PVect (ker( f )). So let us prove our claim. Suppose that γ : F[a,b) →V lies in the kernel;
i.e., f (γ ) = Z a . As PI generates PVect(V), we have that γ = 1a i∈I ki pi for
coefficients ki ∈ F\{0} with i ∈ I for some subset I ⊆ {1, 2, . . . , |A|}. Applying f ,
we obtain the equality f (γ ) = Z a = i∈I ki f ( pi ) and notice that f ( pi )(a) = 0 for
all i ∈ I ; otherwise linear independence of I, and in particular that of I a (1F ), would
be contradicted. However, if f ( pi )(a) = 0 for all i ∈ I , then 1ci ( pi ) ∈ GK for some
ci ≤ a and all i ∈ I . Altogether we obtain that γ must be generated by GK.
Notice that Algorithm 1 for the cases from Proposition 3.17 is simply a Gaussian
elimination where the procedure differs from the standard method. In the former case,
the input A in Algorithm 1 is given ordered in the startpoint order. However, this
hypotheses is not assumed in Sect. 3.4. Next, we give a computational bound.
Proposition 3.18 Algorithm 2 takes at most O(N 2 |A|) time, where N = max(|A|, |B|).
Proof Let us start by measuring the computational complexity of Algorithm 1,
box_gauss_reduce. First, the for loop from line 4 iterates |B| times. Next, for
each iteration of line 4 the pop() function from line 9 cannot be executed more
than L times, where L ≤ |A|. Then, lines 9 to 15 have a computational cost of
123
596 Discrete & Computational Geometry (2023) 70:580–619
V
Γ G : F[aΓ ,cΓ ) → ,
H
123
Discrete & Computational Geometry (2023) 70:580–619 597
V G is associated to [A, r ) for some value r ∈ [A, C). This implies that 1r (V G ) is in
PVect(H), where we define V G := Γ ∈S kΓ Γ G .
Next, take the greatest Γ ∈ S whose endpoint is C. Thus, there must exist R ∈ R
such that r ≤ R < C and 1 R (Υ ) = Z R for all Υ ∈ S such that Υ > Γ . Equivalently,
1 R (Υ G ) ∈ PVect(H) for all Υ ∈ S such that Υ > Γ . Since 1 R (V G ) is in PVect(H),
we conclude that 1 R (Γ G ) can be written in terms of generators from ιH (H) R and
elements ΓG with Γ < Γ ; where, from the third bullet point after Definition 3.1,
ιH (H) R denotes the set of generators γ ∈ ιH (H) such that γ (R) = 0. Now, consider
the matrix I[G]G G
A whose columns correspond to the coordinates of Υ in terms of A
for all Υ ∈ I[G]. Then, consider the block matrix M = (ι (H)A | I[G]G
H R
A ). In matrix
G
terms, our hypotheses on 1 R (Γ ) means that its corresponding column from M can
be reduced by left to right column additions on M up to a pivot whose associated
interval endpoint is smaller or equal to R.
Now, denote N = (ιH (H)A R | ιG (G) ). It is not difficult to notice that M is the
A
result of applying left-to-right column additions to N . Consequently, denoting by CΓ
the column from N that corresponds to Γ G , the column CΓ can be reduced by left
columns in N up to a pivot whose associated interval endpoint is smaller or equal to R.
There are two options:
d1 d2 dn
0 V0 V1 V2 ... Vn ,
123
598 Discrete & Computational Geometry (2023) 70:580–619
H0 (T2 ) = F F ⊕ F = H0 (U ) ⊕ H0 (V ), H2 (T2 ) = F 0 = H2 (U ) ⊕ H2 (V ).
To amend this, one has to look at the information given by the intersection U ∩ V .
This information comes as identifications and new loops. For example, U and V are
connected through U ∩ V . Also, the loop going around each cylinder U and V is
identified in U ∩ V . These identifications are performed by taking the quotient
In := coker Hn (U ∩ V ) → Hn (U ) ⊕ Hn (V )
for all n ≥ 0. Where the previous morphism is the Čech differential δ1n : Sn (U ∩ V ) →
Sn (U ) ⊕ Sn (V ). Additionally, the 1-loops in the intersection merge to the same loop
when included in each cylinder U or V . This situation creates a 2-loop or “void”, see
Fig. 2. Thus we have the n-loops detected by the kernel
L n := Ker Hn−1 (U ∩ V ) → Hn−1 (U ) ⊕ Hn−1 (V )
for all n ≥ 0. Notice that n-loops are found by n − 1 information on the intersection.
Putting all together, we have that
H0 (T2 ) ∼
= I0 ∼
= F, H1 (T2 ) ∼
= I1 ⊕ L 1 ∼
= F⊕ F, H2 (T2 ) ∼
= L2 ∼
= F.
and there are expressions for the different ratios between consecutive filtrations,
F 1 (Hn (U ∪ V ))
F 0 (Hn (U ∪ V )) = In , = Ln.
F 0 (Hn (U ∪ V ))
123
Discrete & Computational Geometry (2023) 70:580–619 599
U U ∩V V U U ∩V V
Identifications: ∼ ∼
H0 H1
Loops: ∼
= S
1 ⇒ ‘Inside’
with morphism dnTot = (d, d, d − δ1 ) for all n ≥ 0. Notice that the first two morphisms
do not change components, whereas the third encodes the “merging” of information.
This last morphism is represented by red arrows on the diagram:
Totn+1 (S∗ ) ∼
= Sn+1 (U ) ⊕ Sn+1 (V ) ⊕ Sn (U ∩ V )
Tot
dn+1 dn+1 dn
δ1
Totn (S∗ ) ∼
= Sn (U ) ⊕ Sn (V ) ⊕ Sn−1 (U ∩ V )
dnTot dn dn−1
δ1
Totn−1 (S∗ ) ∼
= Sn−1 (U ) ⊕ Sn−1 (V ) ⊕ Sn−2 (U ∩ V )
where the rectangle of red arrows is commutative. In particular, this implies that
dnTot ◦ dn+1
Tot = 0 for all n ≥ 0. Computing the homology with respect to the total
Hn (Tot∗ (S∗ )) ∼
= In ⊕ L n ∼
= Hn (K ).
123
600 Discrete & Computational Geometry (2023) 70:580–619
can extend the intuition from the previous subsection, by recalling the definition of the
(n, U)-Čech chain complex given on the preliminaries. Stacking all these sequences
on top of each other, and also multiplying differentials in odd rows by −1, we obtain
a diagram:
δ0 δ1 δ2
0 S2 (K ) S2 (Uσ ) S2 (Uσ ) S2 (Uσ ) ...
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2
d d d d
δ0 δ1 δ2
0 S0 (K ) S0 (Uσ ) S0 (Uσ ) S0 (Uσ ) ...
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2
0 0 0 0
for all p, q ≥ 0, and also S p,q := 0 otherwise. We denote δ̄ = (−1)q δ, the Čech
differential multiplied by a −1 on odd rows. The reason for this change of sign is
because we want S∗,∗ to be a double complex, in the sense that the following equalities
hold:
δ̄ ◦ δ̄ = 0, d ◦ d = 0, δ̄ ◦ d + d ◦ δ̄ = 0. (5)
Since S∗,∗ is a double complex, we can study the associated chain complex S∗Tot ,
commonly known as the total complex. This is formed by taking the sums of anti-
diagonals
SnTot := S p,q
p+q=n
for n ≥ 0. The differentials on the total complex are defined by d Tot = d + δ̄,
which satisfy d Tot ◦ d Tot = 0 from (5), see Fig. 3 for a depiction of this. Later, in
Proposition 4.1, we prove that Hn (K ) ∼= Hn (S∗Tot ) for all n ≥ 0. The problem still
123
Discrete & Computational Geometry (2023) 70:580–619 601
S∗,∗ S∗,∗
β0 β0
d d
α0 β1 0 β1
δ̄ δ̄
d S4Tot d Ker(d Tot )4
α1 β2 0 β2
δ̄ δ̄
S3Tot d d
α2 β3 0 β3
δ̄ δ̄
d d
α3 β4 0 β4
δ̄ δ̄
Fig. 3 S∗,∗ represented as a lattice for convenience. On the left, the total complex S Tot associated to S∗,∗ .
Here (β0 , . . . , β4 ) ∈ S4Tot maps to (α0 , . . . , α3 ) ∈ S3Tot , where d(βi ) + δ̄(βi+1 ) = αi for all 0 ≤ i ≤ 3.
On the right, the kernel Ker(d Tot )4 , where d(βi ) + δ(βi+1 ) = 0 for all 0 ≤ i ≤ 3
remains difficult, since computing Hn (S∗Tot ) directly might be even harder than com-
puting Hn (K ). The key is that the Mayer–Vietoris spectral sequence allows us to break
apart the calculation of Hn (S∗Tot ) into small, computable steps.
Let us start by computing the kernel Ker(dnTot ), which is depicted in Fig. 3.
Recall that in this section we are working with vector spaces and linear maps.
Let s = (sk,n−k )0≤k≤n ∈ SnTot be in Ker(dnTot ). Then, the equations d(sk,n−k ) =
−δ̄(sk+1,n−k−1 ) hold for all 0 ≤ k < n. This leads to subspaces GK p,q ⊆ S p,q com-
posed of elements s p,q ∈ S p,q such that d(s p,q ) = 0 and such that there exists
a sequence s p−r ,q+r ∈ S p−r ,q+r with d(s p−r ,q+r ) = −δ̄(s p−r +1,q+r −1 ) for all
0 < r ≤ p. Notice that GK p,q is a subspace of S p,q since both d and δ̄ are lin-
ear. This is depicted in Fig. 4. There are (non-canonical) isomorphisms,
Ker dnTot ∼= GK p,q . (6)
p+q=n
It turns out that (6) only holds when we are working with vector spaces. Later, we
work with a more general case where we have to solve nontrivial extension problems.
By (6), recovering all GK p,q leads to the kernel of d∗Tot . However, computing
GK p,q still requires a large set of equations to be checked. A step-by-step way of
computing these is by adding one equation at a time. For this, we define the subspaces
GZrp,q ⊆ S p,q where we add the first r equations progressively. That is, we start
setting GZ0p,q = S p,q . Then we define GZ1p,q to be elements s p,q ∈ S p,q such that
d(s p,q ) = 0, or equivalently GZ1p,q = Ker(d) p,q . For r ≥ 2, we define GZrp,q to be
formed by elements s p,q ∈ Ker(d) p,q such that there exists a sequence s p−k,q+k ∈
S p−k,q+k with d(s p−k,q+k ) = −δ̄(s p−k+1,q+k−1 ) for all 1 ≤ k < r . Altogether,
p+1 p
GK p,q = GZ p,q ⊆ GZ p,q ⊆ . . . ⊆ GZ0p,q = S p,q
123
602 Discrete & Computational Geometry (2023) 70:580–619
GK0,3
β0 β0 GZ02,1 GZ12,1
d d GK1,2 = S2,1 = Ker(d)2,1
0 0 0 β1 α1 β2 α1 β2
δ̄ δ̄
d d
0 0 0 0 α2 0
δ̄ δ̄
d d
0 0 0 0 α0 β1
δ̄ δ̄ GZ22,1
0 β2
β0 β0 β0 0
d d
GZ32,1
0 β1 0 β1 0 β1
δ̄ δ̄ = GK2,1
d GK2,1 d
0 β2 0 β2 0 β2
δ̄ δ̄
d d GK3,0
0 0 0 β3 0
δ̄ δ̄
Fig. 4 On the left, in cyan the four direct summands of Ker(d Tot )4 . The corresponding GKr ,3−r are framed
to indicate that these are subspaces of Sr ,3−r for all 0 ≤ r ≤ 3. On the right, in orange the subspaces GZr2,1 ,
eventually shrinking to GK 2,1 . For convenience, we denote α2 = d(β2 ), α1 = δ̄(β2 ) and α0 = δ̄(β1 )
for all p, q ≥ 0. For intuition see Fig. 4, and also Fig. 6 for a depiction of GZ23,1
on a lattice. We also write GZrp,q = Ker(d) ∩ (δ̄ −1 ◦ d)r −1 (S p−r +1,q+r −1 ) for all
r ≥ 1, where by (δ̄ −1 ◦ d)r we denote composing r times δ̄ −1 ◦ d. In particular, since
GZrp,q = GZ p,q for all r ≥ p + 1, we use the convention GZ∞
p+1 p+1
p,q := GZ p,q = GK p,q .
Now, we explain the notation GK p,q and the isomorphism (6). We start defining a
vertical filtration F ∗ on S∗,∗ by the following subcomplexes for all r ≥ 0:
S p,q whenever p ≤ r ,
F r (S∗,∗ ) p,q :=
0 otherwise.
F r SnTot ∩ Ker(d Tot )n for all r ≥ 0. We define the associated modules of Ker(d Tot )n
to be the quotients G p Ker(d Tot )n = F p Ker(d Tot )n /F p−1 Ker(d Tot )n , which can be
checked to be isomorphic with GK p,q for all p + q = n. This follows by considering
123
Discrete & Computational Geometry (2023) 70:580–619 603
F 2 (S∗,∗ ) Sp, ∗
δ̄
S∗,∗
d
F 3 (S ∗,∗ )
morphisms
G p Ker d Tot n GK p,q ,
(7)
[(s0,n , s1,n−1 , . . . , s p,q , 0, . . . , 0)] s p,q ,
which are well defined since s p,q does not change for representatives of the same class.
In fact, this morphism is injective since two classes with the same image will be equal
by definition of G p Ker(d Tot )n . On the other hand, the definition of GK p,q ensures
surjectivity. In particular, as we are working with vector spaces, we have that
Ker dnTot ∼= G p Ker d Tot n ∼
= GK p,q ,
p+q=n p+q=n
Z rp,q := z ∈ F p S Tot
p+q : d
Tot
(z) ∈ F p−r S Tot
p+q−1
for all r ≥ 0. We can think of these as kernels of d Tot up to some previous fil-
tration. Then, by definition, we have that Z 0p,q = F p S Tot
p+1 ∞
p+q and Z p,q = Z p,q =
F Ker(d p+q ). Using a morphism analogous to (7), one can check that the quotients
p Tot
+1 /Z r r +1
Z rp,q p−1,q+1 are isomorphic to GZ p,q for all p + q = n. This is depicted in Fig. 6.
Thus, computing these quotients increasing r ≥ 0 leads to Ker(d Tot ). With a little
more work, we can do the same for computing the homology.
The Mayer–Vietoris spectral sequence leads to Hn (S∗Tot ) after a series of small,
computable steps. This is done similarly as we did before for computing Ker(d Tot ).
However, in this case we need to take quotients by the images of d Tot . First, notice that
the vertical filtration F ∗ transfers to homology Hn (S∗Tot ) by the inclusions F p S∗Tot ⊆
S∗Tot for all p ≥ 0. That is, we have filtered sets
F p Hn S∗Tot := Im Hn F p S∗Tot −→ Hn S∗Tot ,
123
604 Discrete & Computational Geometry (2023) 70:580–619
1
Z2,2
S∗,∗ S∗,∗
2
Z3,1
F 2 S3Tot d
δ̄
d
δ̄
d GZ23,1 ∼ 2 /Z 1
= Z3,1 2,2
δ̄
d
which induce a filtration on Hn (S∗Tot ). For this filtration, the associated modules are
defined by the quotients G r Hn (S∗Tot ) = F r Hn (S∗Tot )/F r −1 Hn (S∗Tot ) for all r ≥ 0. As
we are working over a field, we recover the homology by taking direct sums:
n
Hn S∗Tot ∼= G r Hn S∗Tot . (8)
r =0
Previously, we defined the sets Z rp,q which are kernels “up to filtration”. In an analo-
gous way, we define boundaries “up to filtration” by setting3
for all r ≥ 0 and p, q ≥ 0. These are images of d Tot coming from a previous filtration.
In particular, the equalities d Tot (Z rp,q ) = B rp−r ,q+r −1 and d Tot (B rp,q ) = 0 hold.
Additionally, for all p, q ≥ 0, there is a sequence of inclusions,
Hence, we define the first page of the spectral sequence as the quotient
Z 1p,q GZ1p,q
E 1p,q := ∼
= ,
Z 0p−1,q+1 + B 0p,q Im(B 0p,q → GZ1p,q )
for all p, q ≥ 0. Recall that Ker(d) p,q = GZ1p,q = Z 1p,q /Z 0p−1,q+1 and also one
can see that Im (B 0p,q → GZ1p,q ) is isomorphic to Im(d) p,q . Then we deduce that
E 1p,q ∼
= Hq (S p,∗ , d). On this page d Tot induces differentials d 1 : E 1p,q → E 1p−1,q .
123
Discrete & Computational Geometry (2023) 70:580–619 605
GZ32,1 IB02,1
GZ22,1 IB12,1 2
E1,2
GZ12,1 IB22,1 2
E3,1
I.e., noticing that d Tot (Z 1p,q ) = B 1p−1,q ⊂ Z 1p−1,q and also d Tot (Z 0p−1,q+1 + B 0p,q )
= d Tot (Z 0p−1,q+1 ) + 0 = B 0p−1,q we have that d 1 : E 1p,q → E 1p−1,q is well defined.
Next, we compute the second page. First, notice that
Ker(d 1 ) Z 2p,q
E 2p,q := H p,q (E ∗,∗
1
, d 1) = = .
Im(d 1 ) Z 1p−1,q+1 + B 1p,q
The second page has differential d 2 induced by the total complex differential d Tot .
Figure 7 illustrates this principle. Doing the same for all pages we obtain the definition
of the r -page terms
r −1 r −1
Z rp,q
E rp,q := H p,q (E ∗,∗ ,d ) = −1 −1
,
Z rp−1,q+1 + B rp,q
GZrp,q
E rp,q := −1
.
Im (B rp,q → GZrp,q )
123
606 Discrete & Computational Geometry (2023) 70:580–619
Z∞ GK p,q
E∞
p,q ∼
p,q = = .
Z∞
p−1,q+1 + B∞
p,q Im (B ∞
p,q→ GK p,q )
since B ∞ ∞
p−1,q+1 ⊆ B p,q . Therefore, computing the spectral sequence is a way of
approximating the associated module G p Hn (S∗Tot ). By (8), adding up these leads to
Hn (S∗Tot ). Also, since E ∞ ∼ p ∗
p,q = G Hn (S∗ ), we say that E p,q converges to Hn (S )
Tot Tot
and we denote this as E ∗p,q ⇒ Hn (S Tot ). As the rows from S p,q are exact, the following
result is standard; see for example [3, Prop. 8.8] for a similar proof.
5 Persistent Mayer–Vietoris
One can translate the method from Sect. 4 to PMod. The reason for this is that PMod is
an abelian category, since Vect is an abelian category and R is a small category. The
theory of spectral sequences can be developed for arbitrary abelian categories. For an
introduction to this, see [30, Chap. 5].
Consider a filtered simplicial complex, K , together with a cover of K by filtered
subcomplexes, U = {Ui }i∈I , so that K = i∈I Ui . Then, the persistence Mayer–
Vietoris spectral sequence is given by
E 1p,q = PHq (Uσ ) ⇒ PHn (K ),
σ ∈Δmp
123
Discrete & Computational Geometry (2023) 70:580–619 607
U V U V U V
r =0 r = 0.5 r = 0.6
Fig. 8 As the radius increases, more edges are added. At radius r = 0.5 a circle will be across the two
covers U and V . Later on, at radius r = 0.6 this circle will be split into two
∞
E1,0
∞
E0,1
PH1 (K)
where p + q = n. However, unlike the case of vector spaces, we might have that
∞ ∞ ∼
p+q=n E p,q PHn (K ). All that we know is that E p,q = G PH p+q (K ) for all
p
as the extension problem, which we solve in Sect. 5.1. Furthermore, we even obtain
more information; as pointed out in [31], the knowledge of which subset J ⊂ I
detects a feature from PHn (K ) can potentially add insight into the information given
by ordinary persistent homology. The following example illustrates this.
Example 5.1 Consider the case of a point cloud X covered by two open sets as in
∞ )r associated
Fig. 8. From Sects. 3 and 4, we know how to compute the ∞-page (E ∗,∗
to any value r ∈ R. In particular, when we take r = 0.5, then the combination of
U and V detects a 1-cycle. On the other hand, when r = 0.6 this cycle splits into
two smaller cycles which are detected by U and V individually. Notice that if we
want to come up with a persistent Mayer–Vietoris method then we need to be able to
track this behaviour. That is, we need to know how cycles develop as r increases. In
∞ ∼ F
particular, the barcode from PH1 (X ) is broken down into E 1,0 = [0.5,0.6) and also
∞ ∼
E 0,1 = F[0.6,1.0) ⊕ F[0.6,1.0) , see Fig. 9.
Figure 10 illustrates a filtered complex covered by three regions where, as in Exam-
ple 5.1, there is a nontrivial extension problem.
Recall the definition of the total complex, vertical filtrations and associated mod-
ules from Sect. 4. Through this section we study the extension problem, that is, we
recover Hn (S∗Tot ) from the associated modules G p (Hn (S∗Tot )). Also, we assume that
123
608 Discrete & Computational Geometry (2023) 70:580–619
r =0 r ∼ 0.208 r = 0.5
Fig. 10 A one loop is detected at value r ∼ 0.208 which goes through three covers. Later, at radius r = 0.5,
this loop splits into three loops, each included in one of the three covers
the spectral sequence collapses after a finite number of pages. Consider the persistence
module
V = V(n) := Hn (S∗Tot ),
0 = F −1 V ⊂ F 0 V ⊂ . . . ⊂ F n V = V.
We define the associated modules of (V, F ∗ ) as the quotients Gk = F k V/F k−1 V for
all 0 ≤ k ≤ n. This gives rise to short exact sequences,
ι pk
0 F k−1 V FkV Gk 0, (9)
F k (r ) : Gk (r ) → F k V(r ), (10)
and there exists a sequence of βi ∈ Si,n−i (r ) such that d(βi ) = −δ̄(βi+1 ) for all
123
Discrete & Computational Geometry (2023) 70:580–619 609
where [ · ]Tot
n denotes the n-homology class of the total complex. Notice that if we
already computed G from the Mayer–Vietoris spectral sequence, then there is no need
to do any extra computations to obtain these morphisms F k (r ). All we need to do is
to store our previous results. Adding over all 0 ≤ k ≤ n we obtain the isomorphism
F(r ) = nk=0 F k (r ) : n
k=0 G (r ) → V(r ). This last morphism is an isomorphism
k
since all its summands are injective, their images have mutual trivial intersection, and
the dimensions of the domain and codomain coincide.
Recall that G has induced morphisms G(r ≤ s) from V(r ≤ s) for all values r ≤ s
in R. Given a basis G for G, we would like to compute a basis B for V from this
information. Notice that this is not a straightforward problem since (10) does not imply
that one has an isomorphism F : G → V. A point to start is to define the image along
each generator in G. That is, for each barcode generator gi ∼ [ai , bi ) in G, we choose an
image at the start F(ai )(gi (ai )). Then, we set F(r )(gi (r )) := V(ai < r )◦F(ai )(gi (ai ))
for all ai < r < bi . This leads to commutativity of F along each generator gi .
Nevertheless this is still far from even defining a morphism F : G → V.
The solution to the problem above is to define a new persistence module G. We
|G |
define G(s) := G(s) for all s ∈ R. Then, if G = {gi }i=1 is a barcode basis for G, by
for all s ∈ R. Now, given gi ∼ [ai , bi )
Proposition 3.2, G s (1F ) will be a basis of G(s)
a generator in G, we define the morphism G(r ≤ s) by the recursive formula
⎧ |G |
⎪
⎪
⎪
⎪ i ≤ s)(g j (bi )(1F )) if r ∈ [ai , bi ), bi ≤ s,
ci, j G(b
⎨
≤ s)(gi (r )(1F )) :=
G(r j=1
⎪
⎪ gi (s)(1F ) if r , s ∈ [ai , bi ),
⎪
⎪
⎩
0 otherwise,
is
where ci, j ∈ F for all 1 ≤ i, j ≤ |G|. We want to define ci, j in such a way that G
isomorphic to V. For this we impose the commutativity condition
i ≤ bi )(gi (ai )(1F )) = F(bi )−1 ◦ V(ai ≤ bi ) ◦ F(ai )(gi (ai )(1F )),
G(a
|G |
ci, j g j (bi )(1F ) = F(bi )−1 ◦ V(ai ≤ bi ) ◦ F(ai )(gi (ai )(1F )). (11)
j=1
This determines uniquely the coefficients ci, j for all 1 ≤ i, j ≤ |G|. Notice that
respects the filtration on V, since the right hand side in (11) is a composition of
G
then ci, j = 0 for
filtration preserving morphisms. In particular, if gi ∈ PVect (F k G),
all 1 ≤ j ≤ |G| such that g j ∈
/ PVect (F k G).
123
610 Discrete & Computational Geometry (2023) 70:580–619
Also, for all 0 ≤ q ≤ n we define the subset I q ⊆ {1, . . . , |G|} of indices 1 ≤ j ≤ |G|
such that g j ∈ PVect(Gq ). Then, the coefficients ci, j for j ∈ I k \ {i} are determined
by the following equality in Gk (bi ) (where we use p k from the sequence (9))
p k (bi ) [
gi (bi )(1F )]Tot
n = ci, j g j (bi )(1F ).
j∈I k \{i}
Thus, we have
⎛⎡ ⎤Tot ⎞
⎜ ⎟
p k (bi ) ⎝⎣
gi (bi )(1F ) − g j (bi )(1F )⎦ ⎠ = 0.
ci, j
j∈I k \{i} n
gi (bi )(1F ) − ci, j
g j (bi )(1F ) − d Tot Γ (12)
j∈I k \{i}
is contained in F k−1 SnTot (bi ). How do we compute Γ ? We start by searching for the
first page r ≥ 2 such that
⎡ ⎤r
⎣βki (bi )(1F ) − ci, j βk (bi )(1F )⎦
j
= 0, (13)
j∈I k \{i} k,n−k
where [ · ]rk,n−k denotes the class in the r -page in position (k, n − k). Notice that this
r must exist since we assumed that (13) vanishes at the ∞-page. Consequently, there
r −1
exists Γk+r −1 ∈ E k+r −1,n−k−r +2 (bi ) such that
⎡ ⎤r −1
⎣βki (bi )(1F ) − ci, j βk (bi )(1F )⎦
j
− d r −1 (Γk+r −1 ) = 0
j∈I k \{i} k,n−k
r −1
on E k,n−k (bi ). Repeating for all pages leads to Γk+t ∈ E k+t,n−k−t+1
t (bi ) for all
0 ≤ t ≤ r − 1, such that
r −1
d t
j
βki (bi )(1F ) − ci, j βk (bi )(1F ) − (Γk+t ) = 0, (14)
j∈I k \{i} t=0
123
Discrete & Computational Geometry (2023) 70:580–619 611
where d t(Γk+t ) ∈ Sk,n−k (bi ) is a representative for the class d t (Γk+t ) ∈ E k,n−k
t (bi ).
Notice that (14) holds independently of the representatives, since if we changed some
term, then the other representatives would adjust to the change. In particular, we have
that the k component of (12) vanishes, whereas the k − 1 component will be equal to
j
βk−1
i
(bi )(1F ) − ci, j βk−1 (bi )(1F ) − δ̄(Γk ).
j∈I k \{i}
Next we proceed to find coefficients ci, j ∈ F so that in Gk−1 (bi ) we get the equality
⎡ ⎤∞
⎣βk−1
i
(bi )(1F ) − ci, j βk−1 (bi )(1F ) − δ̄(Γk )⎦
j
Then we proceed as we did on Gk . Doing this for all parameters 0 ≤ r ≤ k, there are
coefficients ci, j ∈ F, and an element Γ ∈ Sn+1
Tot (b ) so that
i
gi (bi )(1F ) =
g j (bi )(1F ) + d Tot Γ.
ci, j
0≤r ≤k j∈I r
∼
Proposition 5.2 G = V.
Proof Since each F(s) is an isomorphism, and also we have commutative squares:
≤s)
G(r
)
G(r
G(s)
F (r ) F (s)
V(r ) V(s)
V(r ≤s)
This gives G= ∼ V, but we still need to compute a barcode basis. In fact, this can be
done by considering a quotient. Define A gi ∈G F[ai ,∞) where gi ∼ [ai , bi ) for
all gi ∈ G; here the A = {αi }1≤i≤|G | denotes the canonical base for A. Consider the
coefficients ci, j for 1 ≤ i, j ≤ |G| from the construction of G and define the sets of
123
612 Discrete & Computational Geometry (2023) 70:580–619
5.2 P ER M AV ISS
F p S Tot
E 0p,q =
p+q ∼
= S p,q = Sq (Uσ )
F p−1 S Tot
p+q σ ∈N pU
d 0p,q ∼
= dq : S p,q → S p,q−1 .
In particular, for each simplex σ ∈ NqU , the morphism d 0p,q restricts to a local differ-
ential
Thus, we can compute a basis, Eσ,q1 , for the persistent homology PH (U ) as well as a
q σ
basis for the image Im(dq+1σ ). Putting all of these together, we get a basis for E 1 as
p,q
1 =
the union E p,q U E 1 . Further, for each generator α ∈ E 1 ⊆ PVect(E 1 ),
σ ∈N p σ,q p,q p,q
123
Discrete & Computational Geometry (2023) 70:580–619 613
Z 1p,q
E 1p,q = .
Z 0p−1,q+1 + B 0p,q
$ %1
d 1p,q (α) = d Tot (0, . . . , 0, α p , 0, . . . , 0) p−1,q = [(0, . . . , 0, δ̄ p (α p ), 0, . . . , 0)]1p−1,q .
for Im((dq+1 )τ ) and write (δ̄ p (α p ))τ in terms of these; the preimages lead to Γτ .
U , we get a subset J 1
Repeating this for all τ ∈ N p−1 p−1,q ⊆ (E p−1,q ) together with
1 aα
coefficients cβ ∈ F\{0} for all β ∈ J p−1,q and an element Γ p−1 ∈ PVect(E 0p−1,q+1 )
1 1
so that
⎛ ⎞
⎜ ⎟
δ̄ p (α p ) = d 0p−1,q+1 (Γ p−1 ) 1aα ⎝
β∈
J 1
cβ1 β ⎠.
p−1,q
This leads to d 1p,q (α) = 1aα β∈J p−1,q
1 cβ1 β . Repeating this procedure for all gener-
ators α ∈ E p,q
1 leads to an associated matrix D 1 for d 1 . Using image_kernel,
p,q p,q
we compute bases for the kernel and image, together with the corresponding preim-
ages. Next, we compute a base E p,q 2 for the second page term E 2p,q by applying
box_gauss_reduce to compute the quotient Ker(d 1p,q )/Im(d 1p+1,q ). This also
leads to first page representatives α ∈ PVect(E 1p,q ) for all α ∈ E p,q 2 . Finally,
Ep,q
1 together with the computed coordinates of α in terms of E p,q 1 . This leads to
[α ]1p,q ∈ Ker(d 1p,q ), we might find α p−1 ∈ E 0p−1,q+1 such that d 0p−1,q (α p−1 ) =
123
614 Discrete & Computational Geometry (2023) 70:580–619
focus on the case that k < p + 1. Let α ∈ E p,q k with α ∼ [a , b ) together with a
α α
α ∈ Ep,q
representative k with α = (0, . . . , 0, α p−k+1 , . . . , α p , 0, . . . , 0) so that
$ %k
d k (α) = d Tot (
α ) p−k,q+k−1 = [(0, . . . , 0, δ̄ p−k+1 (α p−k+1 ), 0, . . . , 0)]kp−k,q+k−1 .
⎛ ⎞ ⎛ ⎞
⎜ ⎟
1aα ⎝
J
β∈
r −1
cβr −1 β ⎠ = d rp−k+r
−1
−1,q+k−r +1 (Γ p−k+r −1 ) 1aα
⎝
β∈J
r
cβr β ⎠.
p−k,q+k−1 p−k,q+k−1
proceed to obtain a ‘good’ total complex representative. There exists T (α) ⊆ E p,q k
$ %
together with coefficients cβα for all β ∈ T (α) such that α = 1aα β∈T (α) cβα β p,q .
k+1
Then we define α = 1aα β∈T (α) cα β , and notice that α = [ α ]k+1
p,q as well as
β
d Tot (
α ) ∈ F p−k (S Tot ). We denote
α = (0, . . . , 0, α p−k+1 , . . . , α p , 0, . . . , 0). Now,
by hypotheses
$ %k
d kp,q (α) = d Tot (
α ) p−k,q+k−1
= [(0, . . . , 0, δ̄ p−k+1 (α p−k+1 ), 0, . . . , 0)]kp−k,q+k−1 = 0.
p−1 $ %k−1
d p−1,q+1 (γ p−1 ) = d Tot (
α ) p−k,q+k−1 .
123
Discrete & Computational Geometry (2023) 70:580–619 615
k−1
By writing γ p−1 in terms of E p−1,q+1 and using their stored representatives, we may
get
γ p−1 ∈ S p+q such that γ p−1 = [
Tot γ p−1 ]k−1 and also d Tot (
γ p−1 ) ∈ F p−k S Tot
p+q−1 .
In particular,
$ Tot %k−1
d ( α (−
γk−1 )) =0
and we set α ← α (− γk−1 ). Hence, by induction, we can repeat this proce-
dure for all 1 ≤ r ≤ k. Eventually, we should obtain a representative α =
(0, . . . , 0, α p−k , . . . , α p , 0, . . . , 0) such that d Tot (
α ) ∈ F p−k−1 S Tot
p+q−1 . We denote
the new set of representatives as Ep,q k+1 .
After computing all pages of the spectral sequence, we still have to solve the exten-
sion problem. Recall that a solution was given in Sect. 5.1; here we only give some
algorithmic guidelines. We start from a basis E p,q ∞ , with total complex representa-
∞
tives E p,q . Since we assume that the spectral sequence is bounded, it collapses at an
L > 0 page. Then, for each generator α ∈ E p,q L , with α ∼ [a , b ), we have a corre-
α α
sponding representative
α ∈ E p,q . Consider
L γ ← 1bα (
α ); we perform changes to γ
similarly as in Sect. 5.1. We start by computing the classes [ γ ]rp,q for all 1 ≤ r ≤ L.
We do this by using Algorithm 1 in parallel, as done on the 1-page. This leads to a
1 ⊆ (E 1 )bα together with coefficients c1 ∈ F\{0} for all β ∈ J 1 and
subset J p,q β
p,q
p,q
γ ]0p,q = dq+1 (Γ p ) 1bα β∈J p,q
Γ p ∈ S p,q+1 , so that [ 1 cβ β . The same happens for
1
⎛ ⎞
−1
γ ]r −1 ) = d rp+r
1bα ([ −1,q−r +2 (Γ p+r −1 ) 1bα
⎝
J
β∈
c β ⎠.
r
p,q
r
β
γ ⎝−
γ ←
E
β∈
c
L
L ⎠
ββ .
p,q
123
616 Discrete & Computational Geometry (2023) 70:580–619
Let Ds be the maximum simplex dimension in K , and let L be the number of pages.
Denote by dim(N U ) the dimension of the nerve. Let
X = max {# q − simplices in Uσ }.
q≥0
σ ∈N U
we might select the columns associated to (E p,q1 )aα from the columns of E 1 which
p,q
does not affect the complexity of Algorithm 1. On the other hand, we need to execute
image_kernel on at most dim(N U ) · Ds elements on the first page. Notice that
for each of these, we first compute a basis for the images and kernels. Afterwards, we
perform the quotients using box_gauss_reduce which takes a complexity of at
most O(H 3 ). Also, we need to add the complexity of the Čech differential. An option
for computing this, is to compare simplices in different covers by their vertices; two
simplices are the same iff they share the same vertex set. This would take less than
O(|N U | Ds X 2 H ) operations. Thus the overall complexity becomes
& ' & '
|N U | dim(N U ) · Ds
O(X H ) +
2
O(|N U |Ds X 2 H ) + O(H 3 ) .
P P
k-Page Now, we proceed for the complexity of the page k ≥ 2. This is the same as
for the 1 page, with the addition of Gaussian eliminations of higher pages. These take
at most O(H 3 ) time. Denoting by L the infinity page, we have the new term
& '
dim(N U ) · Ds
O(L H 3 )
P
123
Discrete & Computational Geometry (2023) 70:580–619 617
Extension problem If the spectral sequence collapses at L > 0, then the complexity
of extending all generators in E p,q
L is bounded by that of computing the L page about
Ds times.
Overall complexity Altogether, we have a complexity bounded by that of computing
the first page plus that of computing the L page L + Ds times. Here the L comes from
computing the L page L times and Ds from the extension problem. Thus, the overall
complexity is bounded by
& ' & '
|N U | dim(N U ) · Ds
O(X 3 ) + (L + Ds ) O(|N U |Ds X 2 H ) + O(H 3 ) .
P P
Notice that in general Ds , L and dim(N U ) are much smaller than H and X . Thus, for
covers such that |N U | X , and assuming we have enough processors, the complexity
can be simplified to the two dominating terms O(X 3 ) + O(H 3 ). Notice that this last
case is satisfied for those covers whose mutual intersections are generally smaller than
each cover. Also, in this case H is approximately of the order of nontrivial barcodes
over all the input complex. This shows that PerMaViss isolates simplicial data, while
only merging homological information. It is worth to notice that in general H , being
the number of nontrivial bars, is much smaller than the size of the whole simplicial
complex.
6 Conclusion
We started by developing linear algebra for persistence modules. In doing so, we intro-
duced bases of persistence modules, as well as associated matrices to morphisms. Also,
we presented Algorithm 2, which computes bases for the image and the kernel of a
persistence morphism between any pair of tame persistence modules. Then a general-
ization of traditional persistent homology was introduced in Sect. 3.5. This theory has
helped us to define and understand the Persistent Mayer–Vietoris spectral sequence.
Furthermore, we have provided specific guidelines for a distributed algorithm, with
a solution to the extension problem presented in Sect. 5.1. The PerMaViss method
presented in Sect. 5.2 isolates simplicial information to local matrices, while merging
only homological information between different covers. Thus, the complexity of this
method is dominated by the size of a local complex plus the order of barcodes over
all the data. A first implementation of these results can be found in [29]. Coding an
efficient implementation from the pseudo-code given in this paper, and benchmark-
ing its performance compared to other methods, will be a matter of future research.
123
618 Discrete & Computational Geometry (2023) 70:580–619
Another interesting direction of research is how to merge this method with existing
algorithms, such as those from [7, 8, 21, 26]. Especially it would be interesting to
explore the possible interactions of discrete Morse theory and this approach, see [11].
Additionally, it will be worth exploring, both theoretically and practically, which are
the most suitable covers for different applications. Finally, we would also like to study
the additional information given by the covering. This will add locality information
from persistent homology.
Acknowledgements The author would like to thank his supervisor Dr. Ulrich Pennig who suggested this
topic and has been very helpful and supportive in the development of these ideas. Also, the author would
like to express his gratitude to EPSRC for the Grant EP/N509449/1 support with Project Number 1941653,
without which the author would not have been able to write this work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included
in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your intended use is not permitted
by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
1. Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Han-
son, E., Motta, F., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent
homology. J. Mach. Learn. Res. 18, # 8 (2017)
2. Bauer, U., Kerber, M., Reininghaus, J.: Clear and compress: computing persistent homology in chunks.
In: Topological Methods in Data Analysis and Visualization III (Davis 2013), pp. 103–117. Springer,
Berlin (2014)
3. Bott, R., Tu, L.W.: Differential Forms in Algebraic Topology. Graduate Texts in Mathematics, vol. 82.
Springer, New York (1982)
4. Bredon, G.E.: Sheaf Theory. Graduate Texts in Mathematics, vol. 170. Springer, New York (1997)
5. Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46(2), 255–308 (2009)
6. Chazal, F., de Silva, V., Glisse, M., Oudot, S.: The Structure and Stability of Persistence Modules.
SpringerBriefs in Mathematics. Springer, Cham (2016)
7. Chen, C., Kerber, M.: Persistent homology computation with a twist. In: 27th European Workshop on
Computational Geometry (Morschach 2011), pp. 197–200 (2011). https://eurocg11.inf.ethz.ch/docs/
Booklet.pdf
8. Chen, C., Kerber, M.: An output-sensitive algorithm for persistent homology. Comput. Geom. 46(4),
435–447 (2013)
9. Chow, T.Y.: You could have invented spectral sequences. Not. Am. Math. Soc. 53(1), 15–19 (2006)
10. Cohen-Steiner, D., Edelsbrunner, H., Harer, J., Morozov, D.: Persistent homology for kernels, images,
and cokernels. In: 20th Annual ACM-SIAM Symposium on Discrete Algorithms (New York 2009),
pp. 1011–1020. SIAM, Philadelphia (2009)
11. Curry, J., Ghrist, R., Nanda, V.: Discrete Morse theory for computing cellular sheaf cohomology.
Found. Comput. Math. 16(4), 875–897 (2016)
12. Delfinado, C.J.A., Edelsbrunner, H.: An incremental algorithm for Betti numbers of simplicial com-
plexes on the 3-sphere. Comput. Aided Geom. Des. 12(7), 771–784 (1995)
13. Di Fabio, B., Landi, C.: Persistent homology and partial similarity of shapes. Pattern Recognit. Lett.
33(11), 1445–1450 (2012)
123
Discrete & Computational Geometry (2023) 70:580–619 619
14. Edelsbrunner, H., Harer, J.L.: Computational Topology: An Introduction. American Mathematical
Society, Providence (2010)
15. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discrete
Comput. Geom. 28(4), 511–533 (2002)
16. Ghrist, R.: Elementary Applied Topology. Createspace (2014)
17. Govc, D., Skraba, P.: An approximate nerve theorem. Found. Comput. Math. 18(5), 1245–1297 (2018)
18. Lewis, R., Morozov, D.: Parallel computation of persistent homology using the blowup complex. In:
27th ACM Symposium on Parallelism in Algorithms and Architectures (Portland 2015), pp. 323–331.
ACM, New York (2015)
19. Lipsky, D., Skraba, P., Vejdemo-Johansson, M.: A spectral sequence for parallelized persistence (2011).
arXiv:1112.1245
20. McCleary, J.: A User’s Guide to Spectral Sequences. Cambridge Studies in Advanced Mathematics,
vol. 58. Cambridge University Press, Cambridge (2001)
21. Milosavljević, N., Morozov, D., Škraba, P.: Zigzag persistent homology in matrix multiplication time.
In: 27th Annual Symposium on Computational Geometry (Paris 2011), pp. 216–225. ACM, New York
(2011)
22. Munkres, J.R.: Elements of Algebraic Topology. Addison-Wesley, San Francisco (2018)
23. Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case
studies of spatial point patterns, sphere packing and colloids. Physica D 334, 99–117 (2016)
24. Robinson, M.: Topological Signal Processing. Mathematical Engineering. Springer, Heidelberg (2014)
25. de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebr. Geom. Topol.
7(1), 339–358 (2007)
26. de Silva, V., Morozov, D., Vejdemo-Johansson, M.: Dualities in persistent (co)homology. Inverse Probl.
27(12), # 124003 (2011)
27. Singh, G., Mémoli, F., Carlsson, G.: Topological methods for the analysis of high dimensional data
sets and 3D object recognition. In: Eurographics Symposium on Point-Based Graphics (Prague 2007),
pp. 91–100. Eurographics Association (2007)
28. Skraba, P., Vejdemo-Johansson, M.: Persistence modules: algebra and algorithms (2013).
arXiv:1302.2015
29. Torras Casas, Á.: PerMaViss: persistence Mayer Vietoris spectral sequence (2020). https://doi.org/10.
5281/zenodo.3613870
30. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge Studies in Advanced Mathematics,
vol. 38. Cambridge University Press, Cambridge (1994)
31. Yoon, H.R.: Cellular Sheaves and Cosheaves for Distributed Topological Data Analysis. PhD thesis,
University of Pennsylvania (2018). https://repository.upenn.edu/edissertations/2936
32. Yoon, H.R., Ghrist, R.: Persistence by parts: multiscale feature detection via distributed persistent
homology (2020). arXiv:2001.01623
33. Zomorodian, A., Carlsson, G.: Localized homology. Comput. Geom. 41(3), 126–148 (2008)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.
123