0% found this document useful (0 votes)
64 views40 pages

Distributed Algorithm For Computing Persistent Homology

This document presents a distributed algorithm for computing persistent homology using the Mayer–Vietoris spectral sequence. It introduces a method for obtaining images, kernels, and cokernels of persistence morphisms and discusses the PerMaViss algorithm, which efficiently computes the persistence Mayer–Vietoris spectral sequence while addressing the extension problem. The work emphasizes the importance of linear algebra in persistence modules and provides computational bounds to support the distribution of the method.

Uploaded by

vjrd fytug
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
64 views40 pages

Distributed Algorithm For Computing Persistent Homology

This document presents a distributed algorithm for computing persistent homology using the Mayer–Vietoris spectral sequence. It introduces a method for obtaining images, kernels, and cokernels of persistence morphisms and discusses the PerMaViss algorithm, which efficiently computes the persistence Mayer–Vietoris spectral sequence while addressing the extension problem. The work emphasizes the importance of linear algebra in persistence modules and provides computational bounds to support the distribution of the method.

Uploaded by

vjrd fytug
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 40

Discrete & Computational Geometry (2023) 70:580–619

https://doi.org/10.1007/s00454-023-00549-2

Distributing Persistent Homology via Spectral Sequences

Álvaro Torras-Casas1

Received: 11 August 2020 / Revised: 27 October 2022 / Accepted: 12 November 2022 /


Published online: 20 September 2023
© The Author(s) 2023

Abstract
We set up the theory for a distributed algorithm for computing persistent homology.
For this purpose we develop linear algebra of persistence modules. We present bases of
persistence modules, together with an operation  that leads to a method for obtaining
images, kernels and cokernels of tame persistence morphisms. Our focus is on devel-
oping efficient methods for the computation of homology of chains of persistence
modules. Later we give a brief, self-contained presentation of the Mayer–Vietoris
spectral sequence. Then we study the Persistent Mayer–Vietoris spectral sequence
and present a solution to the extension problem. This solution is given by finding
coefficients that indicate gluings between bars on the same dimension. Finally, we
review PerMaViss, an algorithm that computes all pages in the spectral sequence
and solves the extension problem. This procedure distributes computations on sub-
complexes, while focusing on merging homological information. Additionally, some
computational bounds are found which confirm the distribution of the method.

Keywords Spectral sequences · Distributed persistent homology · Mayer–Vietoris

Mathematics Subject Classification 55-04 · 55N35 · 55T99

1 Introduction

1.1 Motivation

Persistent homology has existed for about two decades [15]. This tool introduced the
field of Topological Data Analysis which, very soon, was applied to a multitude of

Editor in Charge: Kenneth Clarkson

Álvaro Torras-Casas
[email protected]

1 School of Mathematics, Cardiff University, Abacws, Senghennydd Road, Cardiff CF24 4AG, Wales,
UK

123
Discrete & Computational Geometry (2023) 70:580–619 581

problems, see [5, 16]. Among others, persistent homology has been applied to study
the geometric structure of sets of points lying in Rn [12, 15], coverage in sensor net-
works [25], pattern detection [23], classification and recovery of signals [24] and it
has also had an impact on shape recognition using machine learning techniques, see
[1, 13]. All these applications motivate the need for fast algorithms for computing
persistent homology. The usual algorithm used for these computations was introduced
in [15], with some later additions to speed up such as those of [7, 8, 26]. In [21] per-
sistent homology is proven to be computable in matrix multiplication time. However,
since these matrices become large very quickly, the computations are generally very
expensive, both in terms of computational time and in memory required.
In practice, computing the persistent homology of a given filtered complex is
equivalent to computing its matrices of differentials and perform successive Gaus-
sian eliminations; see [14, 15]. In recent years, some methods have been developed
for the parallelization of persistent homology. The first approach was introduced in
[14] as the spectral sequence algorithm, and was successfully implemented in [2].
This consists in dividing the original matrix M into groups of rows, and sending these
to different processors. These processors will, in turn, perform a local Gaussian Elimi-
nation and share the necessary information between them, see [2]. On the other hand, a
more topological approach is presented in [18]. It uses the blow-up complex introduced
in [33]. This approach first takes a cover C of a filtered simplicial complex K , and
uses the result that the persistent homology of K is isomorphic to that of the blow-up
complex K C . This proceeds by computing the sparsified persistent homology for each
cover, and then using this information to reduce the differential of K C efficiently. Both
of these parallelization methods have provided substantial speedups compared to the
standard method presented in [15].
The relation between homology classes and a cover is known as Localized Homol-
ogy, which was introduced in [33]. It would be useful to have a method that leads to
the speedups from [2, 18], while keeping Localized Homology information. Further,
such covers should have no restrictions, such as those used in the mapper algorithm,
see [27]. This last point limits substantially the use of the blowup-complex, since the
number of simplices increases very quickly when we allow the intersections to grow.
In fact, in the extreme case where a complex K is covered by n copies of K , the
blowup complex K C has size 2n |K |.

1.2 Literature Review

Since distribution is an important issue in persistent homology, it is worth exploring


which classical tools of algebraic topology could be used in this context. A well-known
tool for distributing homology computations is the Mayer–Vietoris spectral sequence
[3, Sect. 8], see [9] for a quick introduction to spectral sequences and [20] as a general
reference. Since the category of persistence modules and persistence morphisms is
an abelian category, the process of computing a spectral sequence should be more or
less straightforward. However, implementing this in practice is a difficult task. Fur-
thermore, this approach has been proposed in [19], although without a solution to the
extension problem. Later, spectral sequences were used for distributing computations

123
582 Discrete & Computational Geometry (2023) 70:580–619

of cohomology groups in [11], and recently in [31] and [32] spectral sequences are
used for distributing persistent homology computations. However, all of [11, 31, 32]
assume that the nerve of the cover is one-dimensional.
The first problem when dealing with spectral sequences is that we need to be able to
compute images, kernels and quotients as well as their representatives. This question
has already been studied in [10], where the authors give a very efficient algorithm.
However, there are couple of problems that come up when using [10] in spectral
sequences. First, in [10] the authors consider a persistence morphism induced by a
simplicial morphism f : X → Y . However, when working with the Mayer–Vietoris
spectral sequence we consider maps in the second, third and higher pages which are
not induced by a simplicial morphism at all. Furthermore, even when working with
the first page differentials, we cannot adapt the work from [10]; as a simplex from an
intersection is sent to several copies along lower degree intersections. Second, a key
assumption in [10] is that the filtrations in X and Y are both general in the sense that
a simplex in either X or Y is born at a time. However, in spectral sequences generality
hardly ever holds. Indeed, this follows from the fact that a simplex might be contained
in various overlapping covers.
Thus, if we want to compute images, kernels and cokernels, we will need to be
able to overcome these two difficulties first. Also, notice that a good solution should
lead to the representatives, as these are needed for the spectral sequence. It is worth
to mention that in [28] such images, kernels and cokernels where studied in terms
of Smith Normal Forms of presentations associated to persistence modules. Further,
as mentioned in [28], such work was developed with the aim of computing persis-
tence spectral sequences [19] in mind. However, adapting [28] to an algorithm which
computes spectral sequences is still a challenge.
The other difficulty with spectral sequences is the extension problem, which we
explain in Sect. 5.1. Within the context of persistent homology, the extension problem
first appeared in [17, Sect. 6]. There the authors give an approximate result that holds
in the case of acyclic coverings. This allows them to compare the persistent homology
to the lower row of the infinity page in the spectral sequence. This leads to an ε-
interleaving between the global persistent homology and that of the filtered nerve.
Later, the extension problem appeared in the PhD thesis of Yoon [31], and also in
the recent joint work with Ghrist [32]. In Sect. 4.2.3 from Yoon’s Thesis, there is a
detailed solution for the extension problem in the case when the nerve of the cover is
one-dimensional.

1.3 Original Contribution

In this paper, we set the theoretical foundations for a distributed method on the input
data. In order to do this, we use the algebraic power of the Mayer–Vietoris spectral
sequence. Since the aim is to build up an explicit algorithm, we need to develop
linear algebra of persistence modules, as done through Sect. 3. There, we define
barcode bases and also we develop an operation  that allows to determine whether
a set of barcode vectors is linearly independent or not. Using this machinery, we are
able to encapsulate all the information related to a persistence morphism in a matrix

123
Discrete & Computational Geometry (2023) 70:580–619 583

that depends on the choice of two barcode bases. This allows defining a Gaussian
elimination outlined in the box_gauss_reduce algorithm—see Algorithm 1—
which addresses the two issues raised above with regards to [10].
Next in Sect. 4, we give a detailed review of the Mayer–Vietoris spectral sequence
in the homology case. This is followed by Sect. 5.1, where we give a solution to the
extension problem. The solution is given by a careful consideration of the total complex
homology, together with the use of barcode basis machinery developed in Sect. 3.
In Sect. 5.2 we introduce PerMaViss, an algorithm for computing the persistence
Mayer–Vietoris spectral sequence and solving the extension problem. The advantage
of this procedure is that all the simplicial information is enclosed within local matrices.
This has one powerful consequence; this method consists in computing local Gaussian
eliminations plus computing image_kernel on matrices whose order is that of
homology classes. In particular, given enough processors and a ‘good’ cover of our
data, one has that the complexity is about

O(X 3 ) + O(H 3 ),

where X is the order of the maximal local complex and H is the overall number of
nontrivial persistence bars on the whole dataset.1 For more details on this, we refer
the reader to Sect. 5.4.
By using the ideas in this text we developed PerMaViss, a Python library that com-
putes the Persistence Mayer–Vietoris spectral sequence. In the results from [29], one
can see that nontrivial higher differentials come up and also the extension problem can-
not be solved in a trivial way in general. This supports the idea that the spectral sequence
adds more information on top of persistent homology. Finally, we outline future direc-
tions, both for the study of the Persistence Mayer–Vietoris spectral sequence and future
versions of PerMaViss.

2 Preliminaries

2.1 Simplicial Complexes

Definition 2.1 Given a set X , a simplicial complex K is a subset of the power set
K ⊆ P(X ) such that if σ ∈ K , then for all subsets τ ⊆ σ we have that τ ∈ K .
An element σ ∈ K will be called an n-simplex whenever |σ | = n + 1, whereas a
subset τ ⊆ σ will be called a face. Thus, if a simplex is contained in K all its faces
must also be contained in K . Given a simplicial complex K , we denote by K n the set
containing all the n-simplices from K . Given a pair of simplicial complexes K and L,
if L ⊆ K , then we say that L is a subcomplex of K . Also, given a mapping f : K → L
between two simplicial complexes K and L, we call f a simplicial morphism whenever

1 Notice that in a filtration indexed by integers where we introduce a simplex at a time there are no trivial
bars and the number of these is about (# simplices)/2. Here we are referring to the case where the filtration
is over the real numbers; where many simplices are introduced at the same time. In this case the number of
nontrivial persistent homology bars should be much smaller than the number of simplices.

123
584 Discrete & Computational Geometry (2023) 70:580–619

n
f (K n ) ⊆ l=0 L l for all n ≥ 0. The category composed of simplicial complexes and
simplicial morphisms will be denoted by SpCpx.

We represent the simplices from K as equivalence classes of tuples from X which are
equal up to even permutations, see [22, Sect. 5]. Let F be a field. For each n ≥ 0
we define the free vector space Sn (K ) := F[K n ]. We also consider linear maps
dn : Sn (K ) → Sn−1 (K ) usually called differentials, defined by


n
dn ([v0 , . . . , vn ]) = (−1)i [v0 , . . . , vi−1 , vi+1 , . . . , vn ]. (1)
i=0

Setting Sn (K ) = 0 for all n < 0, we put all of these in a sequence

0 d1 d2 d3
0 S0 (K ) S1 (K ) S2 (K ) ... (2)

It follows from formula (1) that the composition of two consecutive differentials van-
ishes: dn ◦ dn−1 = 0 for all n ≥ 0. In this case we say that (2) is a chain complex. As
a consequence, we have that Im(dn+1 ) ⊆ Ker(dn ), and we can define the homology
with coefficients in F to be Hn (K ; F) = Ker(dn )/Im(dn+1 ) for all n ≥ 0. In general,
F will be understood by the context and the notation Hn (K ) might be used instead.
On the other hand, we consider the augmentation map ε : S0 (K ) → F defined by
the assignment s → 1F , for any simplex s ∈ S0 (K ). Then, we define the reduced
homology by  H0 (K ; F) = Ker(ε)/Im(d1 ) and  Hn (K ; F) = Hn (K ; F) for all n > 0.

Consider S∗ (K ), obtained by augmenting (2) by ε and a copy of F in degree −1:

0 ε d1 d2 d3
0 F S0 (K ) S1 (K ) S2 (K ) ...

Then, computing homology on 


S∗ (K ) we obtain the reduced homology groups.

Definition 2.2 (standard m-simplex) Given m > 0, we define Δm = P ({0, 1, . . . , m}),


which will be called the standard m-simplex. This leads to a chain complex 
S∗ (Δm )

0 ε d1
0 F S0 (Δm ) S1 (Δm )
d2 d3 dn
S2 (Δm ) ... Sn (Δm ) 0.

By [22, Thm. 8.3], 


S∗ (Δm ) is exact, that is, 
Hn (Δm ) = 0 for all n ≥ 0.

finite set U = {Ui }i=0


Definition 2.3 Let K be a simplicial complex. A  m of subcom-
m
plexes from K , is a cover of K whenever K = i=0 Ui . Let σ ∈ Δm and denote
Uσ = i∈σ Ui . The nerve of U is defined by

N U = {σ : Uσ = ∅} ⊆ Δm .

123
Discrete & Computational Geometry (2023) 70:580–619 585

In particular, given a simplex σ ∈ N U , there is an injection f σ : Δ|σ | → N U which


induces an injection of chain complexes f ∗σ :  S∗ (Δ|σ | ) → 
S∗ (N U ) whose image
σ  |σ |
f ∗ ( S∗ (Δ )) is exact.

Definition 2.4 (Čech chain complex) Let K be a simplicial complex and let U =
{Ui }i=0
m be a cover of K . Given s ∈ K , there exists σ (s) ∈ N U with maximal
cardinality |σ (s)|, so that s ∈ Uσ (s) . Then, we define the (n, U)-Čech chain complex
by

   |σ (s)| 
Č∗ (n, U; F) = f ∗σ (s) 
S∗ Δ .
s∈K n

For k ≥ −1, we use the notation (τ )s with s ∈ K n and τ ∈  Sk (Δ|σ (s)| ), to denote an
element in Čk (n, U; F) that is zero everywhere except for τ in the component indexed
by s. Then the image of the k-Čech differential is defined by the assignment δ̌kU ((τ )s ) =
U U
(d N τ )s , where d N denotes the kth differential of 
k k S∗ (N U ). By definition, the Čech
complex is a chain complex and is exact. Also, Č−1 (n, U; F) Sn (K ) follows easily.
On the other hand, for each k ≥ 0 we define an isomorphism


ψk : Čk (n, U; F) Sn (Uσ )
σ ∈NkU

σ (s)
by sending (τ )s to (s)τ for any pair of simplices s ∈ K n and τ ∈ f k Δ|σ (s)| . In
particular, we can rewrite the (n, U)-Čech chain complex as a sequence

δ0 δ1 δ2
0 Sn (K ) Sn (Uσ ) Sn (Uσ ) Sn (Uσ ) . . . (3)
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2

where, for any pair σ ∈ NkU and s ∈ (Uσ )n , the differentials δi are defined as follows:

 U    U 
δk ((s)σ ) = ψk ◦ δ̌k ◦ ψk−1 ((s)σ ) = ψk dkN σ s = dkN (σ ) τ · s τ ∈N U ,
k−1

U U U .
and where {dkN (σ )}τ ∈ F is the coefficient of dkN (σ ) in the simplex τ ∈ Nk−1

Remark 2.5 Alternatively, the Čech chain complex can be defined straight away as the
sequence (3), and prove exactness by using cosheaf theory. Namely, given a simplicial
complex K , we consider the topology where the open sets are subcomplexes. Then,
for each integer n ≥ 0, consider the simplicial precosheaf as an assignment Sn : V →
Sn (V ) for each subcomplex V ⊆ K . This precosheaf is in fact a flabby cosheaf. Then,
using 2.5, 4.3, and 4.4 from [4, Sect. VI], one has exactness of the Čech chain complex.

123
586 Discrete & Computational Geometry (2023) 70:580–619

2.2 Persistence Modules

Let R be the category of real numbers as a poset, where homR (s, t) contains a single
morphism whenever s ≤ t, and is empty otherwise. Let F be a field and let Vect denote
the category of F-vector spaces.

Definition 2.6 A filtered simplicial complex is a functor K : R → SpCpx, with K s ⊆


K t for all s ≤ t. Define PHn (K ), the n-persistent homology of K as the composition
Hn (K ) : R → Vect.

Definition 2.7 A persistence module V is a functor V : R → Vect. That is, to any


r ∈ R, V assigns a vector space in Vect which is denoted either by V(r ) or Vr .
Additionally, to any pair s ≤ t, there is a linear morphism V(s ≤ t) : Vs → Vt .
These morphisms satisfy V(s ≤ s) = IdVs for any s ∈ R and the relation V(r ≤ t) =
V(s ≤ t) ◦ V(r ≤ s) for all r ≤ s ≤ t in R. Given two persistence modules V and W,
a persistence morphism is a natural transformation f : V → W. Thus, for any pair
s ≤ t, there is a commutative square

V(s≤t)
Vs Vt
fs ft
W(s≤t)
Ws Wt .

We denote by PMod the category of persistence modules and persistence morphisms.

By naturality of f we refer to the commutative square above. A persistence morphism


f : V → W is an isomorphism whenever f t is an isomorphism for all t ∈ R, which
we denote by V W.

Example 2.8 Let s ≤ t from R. We define the interval module F[s,t) by F[s,t) (r ) = F
for all r ∈ [s, t) and F[s,t) (r ) = 0 otherwise. The morphisms F[s,t) (a ≤ b) are the
identity for any two a, b ∈ [s, t) and 0 otherwise.

Given F[s,t) , s and t are the birth and death values respectively. If V(r ) is finite
dimensional for all r ∈ R, then there is an isomorphism V i∈J F[si ,ti ) , as shown
in [6]. This is the barcode decomposition of V.

Definition 2.9 V ∈ PMod is tame iff it has a finite barcode decomposition: V


N
i=1 F[ai ,bi ) .

Throughout this work all persistence modules are assumed to be tame.

Definition 2.10 A chain of persistence modules is a sequence of persistence modules


Vk and differentials δk : Vk → Vk−1 such that δk−1 ◦ δk = 0 for all k ∈ Z. Elements
from PVect(Vk ), for all k ∈ Z, are sometimes called chains.

123
Discrete & Computational Geometry (2023) 70:580–619 587

3 Homology of Persistence Modules

3.1 Barcode Bases

In this section, we introduce barcode bases and the operation . This framework
allows to introduce the matrix associated to a persistence morphism f : V → W and
a choice of bases. This theory is applied to develop algorithms for computing images
and kernels of persistence morphisms as well as quotients of persistence modules.
In addition, we evaluate the respective computational complexities. Additionally, we
illustrate how to compute homology in the category of persistence modules.
Definition 3.1 (barcode basis) A barcode basis A of a tame persistence module V
N
is a choice of an isomorphism, α : i=1 F[ai ,bi ) → V. A direct summand of α is a
restricted morphism αi : F[ai ,bi ) → V which we call a barcode generator. Often, we
denote a barcode basis A by the set of barcode generators A = {αi }i=1 N .

Within the context of Definition 3.1, we make some notational remarks.


• Given a barcode generator αi ∈ A, we write αi ∼ [ai , bi ) to indicate that the
domain of αi is F[ai ,bi ) and say that αi is associated to the interval [ai , bi ).
• Given αi ∈ A with αi ∼ [ai , bi ), we have linear transformations αi (r ) : F[ai ,bi ) (r )
→ V(r ) for all r ∈ R. In particular, since F[ai ,bi ) (r ) is either 0 or F, αi (r ) is
uniquely determined by αi (r )(1F ) ∈ V(r ) for r ∈ [ai , bi ). In addition, notice that
αi (r ) = 0 for all r ∈ [ai , bi ) since otherwise α would not be injective.
• For any given r ∈ R, we define a subset of A,

Ar = {αi : 1 ≤ i ≤ N , αi (r ) = 0}.

In this case, if αi ∈ Ar and αi ∼ [ai , bi ), then ai ≤ r < bi by naturality of αi .


Also, evaluating all the elements from Ar on 1F leads to a basis Ar (1F ) for V(r ),
where

Ar (1F ) = {αi (r )(1F ) : αi ∈ Ar }.


N
Proposition 3.2 Given a persistence morphism α : i=1 F[ai ,bi ) → V, consider A =
{αi }i=1 . Then, A is a barcode basis for V if and only if Ar (1F ) is a basis for V(r ) for
N

all r ∈ R.
Proof Since Vect F is an abelian category and R is a small category, α is an isomorphism
if and only if α(r ) is an isomorphism for all r ∈ R. That is, the kernel Ker(α) →
N
i=1 F[ai ,bi ) vanishes iff Ker(α)(r ) = 0 for all r ∈ R. A similar argument is done
for surjectivity. Then, α(r ) is an isomorphism iff Ar (1F ) is a base for V(r ) and the
result follows. 

Next, we use barcode bases to understand persistence morphisms f : V → W. In
particular, fixing a pair of barcode bases A and B for V and W respectively, we show
that there is a unique matrix F associated to f , see Corollary 3.12. However, this
makes no sense unless we are able to perform additions on barcode generators, which,
as shown in the following example, cannot be done in a straight-away manner.

123
588 Discrete & Computational Geometry (2023) 70:580–619

Example 3.3 Consider V F[0,2) ⊕ F[1,3) together with the canonical basis A given
by generators α1 ∼ [0, 2) and α2 ∼ [1, 3). Then, it is not possible to add α1 and α2
since at some points the domains differ; for example at filtration value 0 we have that
α1 (0) has F as domain but α2 (0) has 0 as domain.
To fix this, consider the following set of pairs:

PVect(V) = (γ , (aγ , bγ )) | γ : F[aγ ,bγ ) → V where γ (r ) = 0 iff r ∈ [aγ , bγ ) .

Given (γ , (aγ , bγ )) ∈ PVect(V), we call the first component, γ , a persistence vector


and say that γ is associated to the pair (aγ , bγ ), which we denote as γ ∼ [aγ , bγ ).
Notice that in our definition of PVect(V) we include (Z r , (r , r )) where Z r : 0 → V
is the zero morphism; here we can distinguish Z r and Z s since they are associated to
different pairs for r = s. We define Z ⊆ PVect(V) to be the subset of zero element
pairs (Z r , (r , r )) for all r ∈ R. We show that PVect(V) has many properties analogous
to those of vector spaces.
Definition 3.4 (barcode sum) We define the barcode sum as the assignment

 : PVect(V) × PVect(V) −→ PVect(V)

which sends ((γ , (aγ , bγ )), (τ, (aτ , bτ ))) to the pair (γ  τ, (max (aγ , aτ ), Bγ τ ))
where we set

min (bγ , bτ ) if bγ = bτ ,
Sγ τ =
sup {r ∈ [max(aγ , aτ ), bγ ) : γ (r ) + τ (r ) = 0} if bγ = bτ ,

and

max(bγ , bτ ) if bγ = bτ ,
Bγ τ =
Sγ τ if bγ = bτ .

We define γ  τ for each r ∈ [max (aγ , aτ ), Bγ τ ) by




⎨γ (r ) + τ (r ) for r ∈ [max(aγ , aτ ), Sγ τ ),
γ  τ (r ) = γ (r ) for r ∈ [bτ , bγ ) if bτ < bγ ,


τ (r ) for r ∈ [bγ , bτ ) if bγ < bτ .

One can check that γ  τ : F[max(aγ ,aτ ),Bγ τ ) → V is a well-defined persistence mor-
phism.
By definition,  is commutative and γ  τ (r ) = 0 iff r ∈ [max(aγ , aτ ), Bγ τ ). For
brevity, given (γ , (aγ , bγ )) ∈ PVect(V), we refer only to the first component γ .
By abuse of notation, we say “given a persistence vector γ ∈ PVect(V)” or “given
a subset of persistence vectors S ⊆ PVect(V)”. Also, elements Z r ∈ Z behave
nontrivially with respect to ; for example, given a persistence vector γ ∼ [aγ , bγ )
and considering c > bγ we have that γ  Z c = Z c .

123
Discrete & Computational Geometry (2023) 70:580–619 589

Proposition 3.5  is associative in PVect(V).

Proof Consider three persistence vectors γ ∼ [aγ , bγ ), τ ∼ [aτ , bτ ), and ρ ∼


[aρ , bρ ) from PVect(V). We show L = R for L:=(γ  τ )  ρ and R:=γ  (τ  ρ).
Notice that both L and R will share the same startpoint A = max (aγ , aτ , aρ ); thus
L and R are associated to a pair of intervals [A, B L ) and [A, B R ) respectively. Addi-
tionally, we have that L(A) = γ (A) + τ (A) + ρ(A) = R(A). Thus, by naturality, it
follows that

L(r ) = V(A ≤ r )(L(A)) = V(A ≤ r )(R(A)) = R(r ),

for all r ∈ R with A ≤ r . Since L(r ) = 0 iff r ∈ [A, B L ) and R(r ) = 0 iff r ∈ [A, B R )
we must have B L = B R and the equality L = R holds. 


Definition 3.6 (scalar multiplication) Define λ : F × PVect(V) → PVect(V) to send


(c, γ ), with γ ∼ [aγ , bγ ), to either cγ , if c = 0 (where cγ (r ) = c · γ (r ) for all
r ∈ R), or to Z aγ , if c = 0.

Now we are ready to introduce the key for characterizing barcode bases.

Definition 3.7 Let T ⊆ PVect(V) \ Z. We say that T is linearly independent or


the elements from T are linearly independent iff for any nonempty subset S ⊆ T
and any coefficients kγ ∈ F \ {0} with γ ∈ S, the sum γ ∈S kγ γ is associated to
[maxγ ∈S (aγ ), maxγ ∈S (bγ )), where γ ∼ [aγ , bγ ) for all γ ∈ S.

Example 3.8 Suppose that {α1 ∼ [0, 2), α2 ∼ [0, 1)} is a barcode basis of V
F[0,2) ⊕ F[0,1) . Then, {α1 , α2 } is linearly independent as will follow from Propo-
sition 3.11. On the contrary, α1 and α1  α2 are not linearly independent since
(−α1 )  (α1  α2 ) = α2 is associated to [0, 1) but −α1 ∼ [0, 2) and α1  α2 ∼ [0, 2).

In Proposition 3.11 we show that a barcode base is linearly independent. However,


we would like that a barcode base also generates the set PVect(V). For this, we need
to introduce a further ingredient.

Definition 3.9 (barcode cuts) Let s ∈ R. We define 1s : PVect(V) → PVect(V) as


1s (α) = α  Z s , for all α ∈ PVect(V).

Notice that λ and {1s }s∈R are compatible, in the sense that 1r (cγ ) = c1r (γ ) for
all γ ∈ PVect(V), all r ∈ R and all c ∈ F. Also, given γ , τ ∈ PVect(V) and
s, r ∈ R, it follows that 1s (γ
 )  1r (τ ) = 1max(s,r ) (γ  τ ). Thus, persistence vectors
on V correspond to a tuple PVect(V),  , λ, {1s }s∈R .

Definition 3.10 Given T ⊆ PVect(V), we say that T generates PVect(V) iff for any
γ ∈ PVect(V)\Z, there exists S ⊆ T together with coefficients kγ ∈ F\{0} for all
γ ∈ S, and some s ∈ R such that
 
α = 1s k γ
γ ∈S
γ .

123
590 Discrete & Computational Geometry (2023) 70:580–619

Proposition 3.11 A ⊆ PVect(V) is a barcode basis for V iff it generates PVect(V)


and is linearly independent.
Proof Assume first that A is a barcode basis. By Proposition 3.2, Ar (1F ) is a basis
for V(r ) for all r ∈  F. Then, for any γ ∈ PVect(V) with γ ∼ [aγ , bγ ), we must
have γ (aγ )(1F ) = α∈Aaγ kα α(aγ )(1F ) for some coefficients kα ∈ F and all
α ∈ Aaγ . By naturality of γ , this implies γ = 1aγ α∈Aaγ kα α , which proves
that A generates PVect(V). On the other hand, assume that A is not linearly inde-
pendent. Then there exist some nonempty subset S ⊆ A together with coefficients
kα ∈ F\{0} such that α∈S kα α is associated to an interval [maxα∈S (aα ), B) with
B < maxα∈S (bα ). However, this implies that A B (1F ) is not linearly independent in
V(B), since α∈S ∩A B kα α(B)(1F ) = 0, but kα ∈ F\{0} for all α ∈ S ∩ A B = ∅,
reaching a contradiction. Thus, A must be linearly independent.
Now, suppose that A generates PVect(V) and is linearly independent. We prove that
A is a barcode base by using Proposition 3.2. That is, we only need to show that Ar (1F )
is a basis for V(r ) for all r ∈ R. Thus, let us show that Ar (1F ) generates V(r ) for all
r ∈ R. Let g ∈ V(r ) with g = 0 and define the persistence vector γ : F[r ,s) → V
by setting γ (r )(1F ) = g, where s = sup {a : V(r ≤ a)(g) = 0}. Thus, by generation
of A, there exists
 some subset S ⊆ A together with somecoefficients kα ∈ F such
that γ = 1r α∈S kα α . In particular, g = γ (r )(1F ) = α∈S kα α(r )(1F ) and the
claim follows. To show that Ar (1F ) is linearly independent, we consider any non-
empty subset S ⊆ Ar together with coefficients kα ∈ F\{0} for all α ∈ S. Then
Γ = α∈S kα α is associated
 to [maxα∈S (aα ), maxα∈S (bα )) which must contain r ,
and so Γ (r )(1F ) = α∈S kα α(r )(1F ) = 0. Altogether Ar (1F ) is a basis for V(r ).  
Let f : V → W be a persistence morphism and consider two bases A and B
for V and W respectively. Given γ ∈ PVect(V) such that γ ∼ [aγ , bγ ), we
define f (γ ) as the persistence vector f (γ ) : F[aγ ,b f (γ ) ) → W where b f (γ ) =
sup {r ∈ [aγ , bγ ) : f (r ) ◦ γ (r ) = 0} and f (γ )(r ) := f (r )◦γ (r ) for all r ∈ [aγ , b f (γ ) ).
Now, for each α ∈ A with α ∼ [aα , bα ), as B is a barcode base, there exist some
subset S ⊆ B together with coefficients kβ,α ∈ F \ {0} for all β ∈ S such that
 
f (α) = 1aα k
β∈S
β,α β .

It follows that S ⊂ Baα since adding elements from B \ Baα would have no effect or
would cut the startpoint to a value greater than aα . Also, notice that if β(bα ) = 0 then
β ∈ / S, since otherwise f would not be natural as a persistence morphism. Thus, S
must be a subset of

B(α) := β : β ∈ B, β(aα ) = 0, β(bα ) = 0 ⊆ Baα ⊆ B.

By pointwise-linearity and naturality of f , for any S ⊆ A the equality


 
f k α
α∈S
α = k
α∈S
α f (α)

123
Discrete & Computational Geometry (2023) 70:580–619 591

holds, where kα ∈ F for all α ∈ S.

Corollary 3.12 Let V and W be a pair of persistence modules together with their
respective barcode bases A and B. Given a persistence morphism f : V → W, there
is a unique associated matrix F = (kβ,α )β∈B,α∈A which is well defined in the sense
that whenever kβα = 0 then β ∈ B(α). Conversely, assume that F is well defined,
then there exists a unique persistence morphism f : V → W whose associated matrix
is F.

Proof By the reasoning above, we only need to prove the converse


 statement.
 First, for
each α ∈ A such that α ∼ [aα , bα ), we define f (α) := 1aα β∈B(α) kβ,α β . By linear
independence of B, f (α) is associated to the interval [aα , B) for B = maxβ∈S (bβ )
 : kβ,α = 0}. We can extend the definition of f by the
with S being the set {β ∈ B(α)
linear formula f α∈A cα α = α∈A cα f (α) for any coefficients cα ∈ F for all
α ∈ A. This implies the claim as f is then natural and pointwise linear. 


We end this section by introducing different orders of barcode bases. These orders are
important to introduce Gaussian eliminations in the barcode basis context.

Definition 3.13 Let V be a persistence module with barcode base A. Also, let αi , α j ∈
A with αi ∼ [ai , bi ) and α j ∼ [a j , b j ). We consider two orders in A:
The standard order: αi < α j if either ai < a j or ai = a j and bi > b j .
The endpoint order: αi < α j if either bi < b j or bi = b j and ai < a j .
If V is tame, it is straightforward to extend these orders to total orders for A.

3.2 Computing Kernels and Images

Consider two finite barcode bases A = {αi }i=1 n and B = {β j }mj=1 for V and W,
respectively. Additionally, suppose that A is ordered according to the standard order
while B is ordered using the endpoint order. We assume such orders are total; e.g.,
even if αr , αs ∼ [a, b) for r = s, either αr < αs or αs < αr holds. Then, we consider
f (A)B = ( f (α1 ), . . . , f (αn )), the matrix of f in the bases A and B. In Sect. 3.3, we
transform f (A)B performing left to right column additions until obtaining a reduced
matrix, i.e., with unique column pivots,
    
   n−1
  
IB = f (α1 )  f (α2 )  k2,1 f (α1 )  . . .
 
 f (αn ) 
 
j=1
kn, j f (α j ) (4)

for suitable ki, j ∈ F and 1 ≤ j < i ≤ n. This IB has the property that its non-zero
columns form a basis  I for Im( f ).

Definition 3.14 Given S ⊆ B, consider a vector V = (kβ )β∈B such that kβ = 0 iff
β ∈ S. The pivot of V is the greatest element from S in the endpoint order. We also
refer to the pivot of S or the pivot of β∈S kβ β.

123
592 Discrete & Computational Geometry (2023) 70:580–619

α1 α3 β3
5
α2 β2
4
β1
3
2
1
0
Ker(f ) V Im(f ) W

Fig. 1 Decomposition of barcodes in image, kernel, domain and codomain of f : V → W. The colors
correspond to the different generators associated to  
I and K

Consider again the matrix IB from (4). By linearity, the jth column from I is
 
f α j  i=1 k j,i αi ; thus, its preimage is p j = α j  i=1 k j,i αi and we define the
j−1 j−1

set of preimages of I by PI = { p j } j=1 . Given p j ∈ PI, notice that p j ∼ [a j , b j )


n

while f ( p j ) ∼ [a j , c j ) for filtration values c j ≤ b j . Thus, we must have that


1c j ( p j ) ∈ ker( f ) for all 1 ≤ j ≤ n. We consider GK = {1c j ( p j )}1≤ j≤n which
generates ker( f ), as it is shown later in Proposition 3.17. Then, we order GK
by choosing a permutation σ : {1, 2, . . . , n} → {1, 2, . . . , n} such that it is con-
sistent with  the standard order. Using the  order from GK we consider the matrix
GKA = 1cσ (1) ( pσ (1) ), . . . , 1cσ (n) ( pσ (n) ) where the rows correspond to generators
from A with the endpoint order. Reducing columns we find some coefficients qi, j ∈ F
so that the resulting matrix has unique pivots:
     
       n−1
KA = 1cσ (1)
   
pσ (1)  1cσ (2) pσ (1)  q2,2 pσ (2)  . . .  1cσ (n) pσ (n) 
i=1
q j,i pσ (i) .

 for Ker( f ). In the following


Taking the non-zero columns from KA leads to a basis K
we present an algorithm obtaining such bases for Im( f ) and ker( f ). First we go
through an illustrative example:

Example 3.15 Consider two persistence modules (see Fig. 1)

V F[1,5) ⊕ F[1,4) ⊕ F[2,5) , W F[0,3) ⊕ F[1,4) ⊕F[0,5) ,

with canonical barcode bases (α1 , α2 , α3 ) and (β1 , β2 , β3 ) ordered respectively using
the standard and endpoint orders. Let f : V → W be given by the |B| × |A| matrix
f (A)B :
⎛ ⎞ ⎛ ⎞
α1 α2 α3 α1 α2  (−α1 ) α3
⎜ β1 1 0 0 ⎟ ⎜ β1 1 −1 0 ⎟
f (A)B = ⎜
⎝ β2 1 1 1 ⎠
⎟ IB = ⎜ ⎝ β2 1
⎟ .
0 1⎠
β3 0 0 1 β3 0 0 1

Then notice that the first two columns from f (A)B share the same pivot β2 , while
the third’s column pivot is β3 . We subtract the first column to the second, leading to

123
Discrete & Computational Geometry (2023) 70:580–619 593

the matrix IB above which has unique pivots for each column. From IB we obtain

f (α1 ) = 11 (β1  β2 ), f (α2  (−α1 )) = −11 (β1 ), f (α3 ) = 12 (β2  β3 ).

In particular, we obtain a basis for the image


I = 11 (β1  β2 ), −11 (β1 ), 12 (β2  β3 ) ,

which leads to the barcode decomposition Im( f ) F[1,4) ⊕ F[1,3) ⊕ F[2,5) . At the
same time, we obtain a corresponding set of preimages PI = {α1 , α2  (−α1 ), α3 }
From this we deduce the set of kernel generators GK and order it by the standard
barcode order GK = {13 (α2  (−α1 )), 14 (α1 ), 15 (α3 )}. Thus, we consider the matrix
GKA for the kernels,where the rows correspond to the endpoint order on A, and reduce
it:
⎛ ⎞
13 (α2  (−α1 )) 14 (α1 ) 15 (α3 )
⎜ α2 1 0 0⎟
GKA = ⎜ ⎝ α1

−1 1 0⎠
α3 0 0 1
⎛ ⎞
13 (α2  (−α1 )) 14 (α2 ) 15 (α3 )
⎜ α2 1 1 0⎟
KA = ⎜ ⎝ α1

−1 0 0⎠
α3 0 0 1

Notice that the second and third columns from K are trivial since 14 (α2 ) = Z 4 and
15 (α3 ) = Z 5 . Since K contains a single nontrivial element we have obtained a basis
for the kernel K  = {13 (α2  (−α1 ))}. Thus, ker( f ) F[3,5) .

Finally, we explain why uniqueness of pivots leads to linear independence.

Proposition 3.16 Let V be a persistence module with a barcode basis A ordered


using the endpoint order. Consider a subset of persistence vectors M ⊂ PVect(V)
and suppose that their pivots on A are all different. Then M is linearly independent.
 
Proof Let m ∈ M, m ∼ [am , bm ), and write it in terms of A as m = 1am α∈Sm kα α ,
where Sm ⊆ A and kα ∈ F\{0} for all α ∈ Sm . Let αm ∈ Sm be the pivot of m. In
particular, by the definition of pivot we must have αm ∼ [cm , bm ) for some filtration
value cm ≤ am . Now, consider a nonempty subset R ⊆ M together with coefficients
qm ∈ F\{0} for all m ∈ R and take the sum V = m∈R qm m. We claim that
V ∼ [maxm∈R (am ), maxm∈R (bm )). Consider the element P ∈ R whose pivot α P
is the highest according to the endpoint order. Consequently, we have that b P =
maxm∈R (bm ). Notice that P is unique, since otherwise there would be two elements
from M with the same pivots, contradicting our assumption. This implies that V is
written in terms of A with a nonzero coefficient for α P . By linear independence of A
the claim follows. 


123
594 Discrete & Computational Geometry (2023) 70:580–619

3.3 The box_gauss_reduce Algorithm

Suppose that f : V → W is a morphism between two tame persistence modules. Let A,


with the standard order, and B, with the endpoint order, be barcode bases for V and W
respectively. Suppose also that we know f (A)B , the matrix associated to f with respect
to A and B. Sending this information to Algorithm 1, called box_gauss_reduce2 ,
we obtain a set of persistence vectors I ⊆ PVect ( f (V)) together with the coordinates,
in A, of their respective preimages stored in PIA . Then, using PIA , we obtain a set
of generators GK for ker( f ), order it according to the standard order and define a
matrix GKA which expresses the elements from GK in terms of A. Reducing GKA
by using Algorithm 1 again, we end up with K. Taking the nonzero elements from I
and K leads to the barcode bases   An outline of this procedure is shown in
I and K.
Algorithm 2, which we call image_kernel.

Algorithm 1 box_gauss_reduce
Input: A, B, f (A)B , where B follows the endpoint order
Output: I, PIA .
 
Id|A|
1: Set M = and store pivots of columns in a list called lpivots.
f (A)B
2: With the same order, store birth values of elements from A into lbirths.
3: With the same order, store death values of elements from B into ldeaths.
4: for p = |A| + |B| ≥ . . . ≥ |A| + 1, with p decreasing, do
5: Find all 1 ≤ i ≤ |A| such that lpivots[i] == p; store into a new ordered list called lpiv_idx.
6: while L = length(lpiv_idx) > 1 do
7: Find 1 ≤ m < L such that lbirths[lpiv_idx[m]] is minimal. For multiple choices, take the
smallest m.
8: for j = L ≥ . . . ≥ m + 1 do
9: idx ← lpiv_idx.pop()
10: lbirths[idx] ← max(lbirths[idx],
 lbirths[lpiv_idx[m]])

M[ p, idx]
11: M[:, idx] ← M[:, idx] − M[:, lpiv_idx[m]]
M[ p, lpiv_idx[m]]
12: for k = |A| + 1 ≤ . . . ≤ p do
13: If ldeaths[k] <= lbirths[idx], set M[k, idx] ← 0.
14: end for
15: lpivots[idx] ← pivot from M[:, idx].
16: end for
17: end while
18: end for  
PIA
19: Denote by PIA and IB the matrices such that M = .
IB
20: Compute S i = {β j ∈ B : IB [ j, i] = 0} for all 1 ≤ i ≤ |A|.
 
21: Set I = 1ai β ∈S i M[ j, i]β j 1≤i≤|A| , where ai = lbirths[i].
j
22: return I, PIA

 and 
Proposition 3.17 Algorithm 2 computes K I bases for the kernel and image of f .
2 Here we use the Numpy notation for matrices, where for a matrix M, the (i, j)-entry is denoted by
M[i,j] and the jth column is denoted by M[:,j]. Also, we use indexing starting at 1 instead of 0 to be
consistent with the rest of the article. In addition, we make use of the pop() function, which simultaneously
returns and deletes the last element from a list.

123
Discrete & Computational Geometry (2023) 70:580–619 595

Algorithm 2 image_kernel
Input: A, B, f (A)B ; A and B follow the startpoint and endpoint orders resp.
 
Output: K, I, P 
I
1: I, PIA ← box_gauss_reduce (A, B, f (A)B ).
2: Compute T i = α j ∈ A : PIA [ j, i] = 0 for all 1 ≤ i ≤ |A|.
3: Set PI = α ∈T i PIA [ j, i]α j 1≤i≤|A| .
j
4: Set GK to contain all 1ci ( pi ) for all pi ∈ PI with f ( pi ) ∼ [ai , ci ).
5: Order GK with standard order, while A with the endpoint order.
6: Write the matrix of coordinates, GKA , of GK in terms of A.
7: K, _ ← box_gauss_reduce (GK, A, GKA )
8: We get rid of zero elements to obtain K  and 
I from K and I.
9:  
return K, I and P I (where P  I are the preimages of I taken from PI).

Proof First of all, by Proposition 3.16, we know that  I and K are linearly independent,
as these are both sets of persistence vectors with different pivots. Thus, all we need to
show is that both sets generate PVect (Im( f )) and PVect (Ker( f )), respectively.
Let us prove that 
I generates PVect (Im( f )). First, wewill show that PI generates
PVect(V). Consider γ ∈ PVect(V) and write γ = 1a i∈I ki αi for coefficients
ki ∈ F\{0} with i ∈ I for some subset I ⊆ {1, 2, . . . , |A|}. Then, consider the
maximum index m from I and compute  γ = γ  (−km pm ). Here it is key to recall
that the preimage pm ∈ PI iswritten as αm  i=1 m−1
km,i αi for coefficients km,i ∈ F
for 1 ≤ i < m. Now,  γ = 1a i∈J ki αi for coefficients ki ∈ F\{0} with i ∈ J for
some subset J ⊆ {1, 2, . . . , m − 1}. Repeating this argument, eventually, we write
γ in terms of GI. This implies that f (γ ) can be expressed in terms of  I. Thus, I
generates PVect (Im( f )).
Now, let us show that K generates PVect (Ker( f )). In fact, it will be enough to show
that GK generates PVect (ker( f )). This is because K is obtained from reducing GKA in
a similar manner as  I was obtained by reducing f (A)B . Consequently, by replicating
the argument which proved that PI generates PVect(V), it follows that K  generates
PVect (ker( f )). So let us prove our claim. Suppose that γ : F[a,b) →V lies in the kernel;

i.e., f (γ ) = Z a . As PI generates PVect(V), we have that γ = 1a i∈I ki pi for
coefficients ki ∈ F\{0} with i ∈ I for some subset I ⊆ {1, 2, . . . , |A|}. Applying f ,
we obtain the equality f (γ ) = Z a = i∈I ki f ( pi ) and notice that f ( pi )(a) = 0 for
all i ∈ I ; otherwise linear independence of  I, and in particular that of  I a (1F ), would
be contradicted. However, if f ( pi )(a) = 0 for all i ∈ I , then 1ci ( pi ) ∈ GK for some
ci ≤ a and all i ∈ I . Altogether we obtain that γ must be generated by GK. 

Notice that Algorithm 1 for the cases from Proposition 3.17 is simply a Gaussian
elimination where the procedure differs from the standard method. In the former case,
the input A in Algorithm 1 is given ordered in the startpoint order. However, this
hypotheses is not assumed in Sect. 3.4. Next, we give a computational bound.
Proposition 3.18 Algorithm 2 takes at most O(N 2 |A|) time, where N = max(|A|, |B|).
Proof Let us start by measuring the computational complexity of Algorithm 1,
box_gauss_reduce. First, the for loop from line 4 iterates |B| times. Next, for
each iteration of line 4 the pop() function from line 9 cannot be executed more
than L times, where L ≤ |A|. Then, lines 9 to 15 have a computational cost of

123
596 Discrete & Computational Geometry (2023) 70:580–619

about |B|. Altogether, the computational complexity of box_gauss_reduce is


about O(|A||B|2 ). Now, let us compute the complexity of Algorithm 2. Executing
line 1 should take O(|A||B|2 ) time, while executing line 7 should take about O(|A|3 )
time. This leads to the expected result. 


3.4 Computing Quotients

Suppose that we have inclusions H ⊆ G ⊆ V of finite dimensional persistence


modules, together with barcode bases H, G, and A respectively. Suppose that H and
G are ordered using the standard order, while A is ordered using the endpoint order.
Consider the inclusions ιH : H → V and ιG : G → V together with their respective
associated matrices ιH (H)A ∈ M|A|×|H| (F) and ιG (G)A ∈ M|A|×|G | (F). Without
loss of generality, we assume that ιH (H)A is already reduced and ιH (H) is a barcode
base for ιH (H). Given all this data, the aim is to find a barcode base for G/H.
Let H ⊕ G together with a barcode base given by the pair (H | G); here we
extend the orders from H and G with the rule h < g for any pair of generators
h ∈ H and g ∈ G; of course, this might break the standard persistence vector order.
Then, we consider ι = ιH + ιG : H ⊕ G → V which will have the associated block
matrix (ιH (H)A | ιG (G)A ). We send the triple ((H|G), A, (ιH (H)A | ιG (G)A )) to the
box_gauss_reduce algorithm and obtain the output I, PIA . We focus on the
subset I[G] containing the last |G| elements from I.
Recall that the box_gauss_reduce algorithm adds columns from ιH (H)A and
G
ι (G)A to eventually obtain a matrix I[G]A from which we deduce the elements in
I[G]. Further, using PIA , we are able to know exactly which combinations of columns
were performed in the reduction procedure. Thus, each Γ ∈ I[G] can be written
as Γ = Γ H  Γ G where Γ H and Γ G denote the respective linear combinations of
elements from ιH (H) and ιG (G). Given Γ ∈ I[G], we use the notation Γ G ∼ [aΓ , dΓ )
and Γ ∼ [bΓ , cΓ ) for the corresponding associated intervals; in particular, notice that
aΓ ≤ bΓ . Then, we define the persistence vector

V
Γ G : F[aΓ ,cΓ ) → ,
H

which is defined by Γ G (r ) = pH (r ) ◦ Γ G (r ) for all r ∈ [aΓ , cΓ ), where we use


the projection pH : V  V/H. We claim that Γ G is well defined, i.e., Γ G (r ) = 0
iff r ∈ [aΓ , cΓ ). First, notice that Γ G (cΓ ) = 0 since by definition Γ (cΓ ) = 0,
which implies Γ G (cΓ ) = −Γ H (cΓ ). Next, we need to show that Γ G (r ) = 0 for all
r ∈ [aΓ , cΓ ). In fact, we prove the stronger statement that

 = {Γ G : Γ ∈ I[G] such that aΓ < cΓ }


Q

is linearly independent. Take a subset S ⊆ I[G] such that {Γ G }Γ ∈S is a nonempty


subset of Q. Also, take some coefficients kΓ ∈ F\{0} for all Γ ∈ S. We want to
show that V G := Γ ∈S kΓ Γ G is associated to the interval [A, C), where we use the
notation A = maxΓ ∈S (aΓ ) and C = maxΓ ∈S (cΓ ). By contradiction, suppose that

123
Discrete & Computational Geometry (2023) 70:580–619 597

V G is associated to [A, r ) for some value r ∈ [A, C). This implies that 1r (V G ) is in
PVect(H), where we define V G := Γ ∈S kΓ Γ G .
Next, take the greatest Γ ∈ S whose endpoint is C. Thus, there must exist R ∈ R
such that r ≤ R < C and 1 R (Υ ) = Z R for all Υ ∈ S such that Υ > Γ . Equivalently,
1 R (Υ G ) ∈ PVect(H) for all Υ ∈ S such that Υ > Γ . Since 1 R (V G ) is in PVect(H),
we conclude that 1 R (Γ G ) can be written in terms of generators from ιH (H) R and
elements ΓG with Γ < Γ ; where, from the third bullet point after Definition 3.1,
ιH (H) R denotes the set of generators γ ∈ ιH (H) such that γ (R) = 0. Now, consider
the matrix I[G]G G
A whose columns correspond to the coordinates of Υ in terms of A
for all Υ ∈ I[G]. Then, consider the block matrix M = (ι (H)A | I[G]G
H R
A ). In matrix
G
terms, our hypotheses on 1 R (Γ ) means that its corresponding column from M can
be reduced by left to right column additions on M up to a pivot whose associated
interval endpoint is smaller or equal to R.
Now, denote N = (ιH (H)A R | ιG (G) ). It is not difficult to notice that M is the
A
result of applying left-to-right column additions to N . Consequently, denoting by CΓ
the column from N that corresponds to Γ G , the column CΓ can be reduced by left
columns in N up to a pivot whose associated interval endpoint is smaller or equal to R.
There are two options:

• Assume Γ ∼ [bΓ , C), with bΓ < C. By hypotheses, CΓ is a combination of pre-


vious columns up to a pivot with death value ≤ R. Thus, following the instructions
from Algorithm 1, we should reduce CΓ (at least) to a pivot with death value ≤ R.
However, in such a case, Γ would be associated to an interval with endpoint ≤ R.
But R < C, reaching a contradiction.
• Assume Γ = Z C . There exists a pivot of index p in the reduction process where
we first add to CΓ a column C  with start value strictly bigger than R; as Γ = Z C
and aΓ ≤ r ≤ R < C. By step 7 in Algorithm 1 and by our hypotheses on CΓ ,
such a pivot must have an endpoint smaller or equal to R. However, the column C 
has a startpoint strictly bigger than R. This contradicts Corollary 3.12 as well as
step 13 in Algorithm 1.

It can be shown that Q generates G/H by a similar reasoning as used in Proposi-


 is a barcode base for the quotient.
tion 3.17. Consequently, Q

3.5 Homology of Persistence Modules

Consider a chain of tame persistence modules:

d1 d2 dn
0 V0 V1 V2 ... Vn ,

where each term has basis B j for 0 ≤ j ≤ n. Then applying image_kernel we


obtain bases I j−1 and K j for the image and kernel
 of d j for all 0 ≤ j ≤ n. Proceed- 
ing as on the previous section, we send triples (I j | K j ), B j , ((I j )B j | (K j )B j ) to
box_gauss_reduce for all 0 ≤ j ≤ n. This leads to bases Q j for the quotients
Ker(d j )/Im(d j+1 ) for all 0 ≤ j ≤ n.

123
598 Discrete & Computational Geometry (2023) 70:580–619

4 A Review on the Mayer–Vietoris Spectral Sequence

In this section, we give an introduction to the Mayer–Vietoris spectral sequence. These


ideas come mainly from [3, 20]. Here, we outline a minimal, self-contained explanation
of the procedure. Also, this is used in Sect. 5. For simplicity we focus on ordinary
homology over a field F. Later, in Sect. 5, we go back to the case of persistent homology
over a field.

4.1 The Mayer–Vietoris Long Exact Sequence

Consider a torus T2 covered by two cylinders U and V , as illustrated in Fig. 2. Naively,


one could think that Hn (T2 ) ∼
= Hn (U ) ⊕ Hn (V ) for all n ≥ 0. However, this does not
hold in dimensions 0 and 2:

H0 (T2 ) = F  F ⊕ F = H0 (U ) ⊕ H0 (V ), H2 (T2 ) = F  0 = H2 (U ) ⊕ H2 (V ).

To amend this, one has to look at the information given by the intersection U ∩ V .
This information comes as identifications and new loops. For example, U and V are
connected through U ∩ V . Also, the loop going around each cylinder U and V is
identified in U ∩ V . These identifications are performed by taking the quotient
 
In := coker Hn (U ∩ V ) → Hn (U ) ⊕ Hn (V )

for all n ≥ 0. Where the previous morphism is the Čech differential δ1n : Sn (U ∩ V ) →
Sn (U ) ⊕ Sn (V ). Additionally, the 1-loops in the intersection merge to the same loop
when included in each cylinder U or V . This situation creates a 2-loop or “void”, see
Fig. 2. Thus we have the n-loops detected by the kernel
 
L n := Ker Hn−1 (U ∩ V ) → Hn−1 (U ) ⊕ Hn−1 (V )

for all n ≥ 0. Notice that n-loops are found by n − 1 information on the intersection.
Putting all together, we have that

H0 (T2 ) ∼
= I0 ∼
= F, H1 (T2 ) ∼
= I1 ⊕ L 1 ∼
= F⊕ F, H2 (T2 ) ∼
= L2 ∼
= F.

On a more theoretical level, what we have presented here is commonly known as


the Mayer–Vietoris Theorem. That is, Hn (U ∪ V ) is a filtered object,

{0} = F −1 (Hn (U ∪ V )) ⊂ F 0 (Hn (U ∪ V )) ⊂ F 1 (Hn (U ∪ V )) = Hn (U ∪ V ),

and there are expressions for the different ratios between consecutive filtrations,

F 1 (Hn (U ∪ V ))
F 0 (Hn (U ∪ V )) = In , = Ln.
F 0 (Hn (U ∪ V ))

123
Discrete & Computational Geometry (2023) 70:580–619 599

U U ∩V V U U ∩V V

Identifications: ∼ ∼

H0 H1

Loops: ∼
= S
1 ⇒ ‘Inside’

Fig. 2 Torus covered by a pair of cylinders U and V

In particular, as we are working with vector spaces, Hn (U ∪ V ) ∼


= In ⊕ L n for all
n ≥ 0.
The above discussion gives rise to the total chain complex,

Totn (S∗ ) = Sn (V ) ⊕ Sn (U ) ⊕ Sn−1 (U ∩ V ),

with morphism dnTot = (d, d, d − δ1 ) for all n ≥ 0. Notice that the first two morphisms
do not change components, whereas the third encodes the “merging” of information.
This last morphism is represented by red arrows on the diagram:

Totn+1 (S∗ ) ∼
= Sn+1 (U ) ⊕ Sn+1 (V ) ⊕ Sn (U ∩ V )
Tot
dn+1 dn+1 dn
δ1
Totn (S∗ ) ∼
= Sn (U ) ⊕ Sn (V ) ⊕ Sn−1 (U ∩ V )
dnTot dn dn−1
δ1
Totn−1 (S∗ ) ∼
= Sn−1 (U ) ⊕ Sn−1 (V ) ⊕ Sn−2 (U ∩ V )

where the rectangle of red arrows is commutative. In particular, this implies that
dnTot ◦ dn+1
Tot = 0 for all n ≥ 0. Computing the homology with respect to the total

differentials and using the previous characterization of In and L n , one obtains

Hn (Tot∗ (S∗ )) ∼
= In ⊕ L n ∼
= Hn (K ).

This result is generalized in Proposition 4.1.

123
600 Discrete & Computational Geometry (2023) 70:580–619

4.2 The Mayer–Vietoris Spectral Sequence

Consider a simplicial complex K with a covering U = {Ui }i=0 m by subcomplexes. We

can extend the intuition from the previous subsection, by recalling the definition of the
(n, U)-Čech chain complex given on the preliminaries. Stacking all these sequences
on top of each other, and also multiplying differentials in odd rows by −1, we obtain
a diagram:

δ0 δ1 δ2
0 S2 (K ) S2 (Uσ ) S2 (Uσ ) S2 (Uσ ) ...
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2
d d d d

−δ0 −δ1 −δ2


0 S1 (K ) S1 (Uσ ) S1 (Uσ ) S1 (Uσ ) ...
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2
d d d d

δ0 δ1 δ2
0 S0 (K ) S0 (Uσ ) S0 (Uσ ) S0 (Uσ ) ...
σ ∈Δm
0 σ ∈Δm
1 σ ∈Δm
2

0 0 0 0

This leads to a double complex (S∗,∗ , δ̄, d) defined as



S p,q := Sq (Uσ )
σ ∈Δmp

for all p, q ≥ 0, and also S p,q := 0 otherwise. We denote δ̄ = (−1)q δ, the Čech
differential multiplied by a −1 on odd rows. The reason for this change of sign is
because we want S∗,∗ to be a double complex, in the sense that the following equalities
hold:

δ̄ ◦ δ̄ = 0, d ◦ d = 0, δ̄ ◦ d + d ◦ δ̄ = 0. (5)

Since S∗,∗ is a double complex, we can study the associated chain complex S∗Tot ,
commonly known as the total complex. This is formed by taking the sums of anti-
diagonals

SnTot := S p,q
p+q=n

for n ≥ 0. The differentials on the total complex are defined by d Tot = d + δ̄,
which satisfy d Tot ◦ d Tot = 0 from (5), see Fig. 3 for a depiction of this. Later, in
Proposition 4.1, we prove that Hn (K ) ∼= Hn (S∗Tot ) for all n ≥ 0. The problem still

123
Discrete & Computational Geometry (2023) 70:580–619 601

S∗,∗ S∗,∗
β0 β0
d d
α0 β1 0 β1
δ̄ δ̄
d S4Tot d Ker(d Tot )4
α1 β2 0 β2
δ̄ δ̄
S3Tot d d
α2 β3 0 β3
δ̄ δ̄
d d
α3 β4 0 β4
δ̄ δ̄
Fig. 3 S∗,∗ represented as a lattice for convenience. On the left, the total complex S Tot associated to S∗,∗ .
Here (β0 , . . . , β4 ) ∈ S4Tot maps to (α0 , . . . , α3 ) ∈ S3Tot , where d(βi ) + δ̄(βi+1 ) = αi for all 0 ≤ i ≤ 3.
On the right, the kernel Ker(d Tot )4 , where d(βi ) + δ(βi+1 ) = 0 for all 0 ≤ i ≤ 3

remains difficult, since computing Hn (S∗Tot ) directly might be even harder than com-
puting Hn (K ). The key is that the Mayer–Vietoris spectral sequence allows us to break
apart the calculation of Hn (S∗Tot ) into small, computable steps.
Let us start by computing the kernel Ker(dnTot ), which is depicted in Fig. 3.
Recall that in this section we are working with vector spaces and linear maps.
Let s = (sk,n−k )0≤k≤n ∈ SnTot be in Ker(dnTot ). Then, the equations d(sk,n−k ) =
−δ̄(sk+1,n−k−1 ) hold for all 0 ≤ k < n. This leads to subspaces GK p,q ⊆ S p,q com-
posed of elements s p,q ∈ S p,q such that d(s p,q ) = 0 and such that there exists
a sequence s p−r ,q+r ∈ S p−r ,q+r with d(s p−r ,q+r ) = −δ̄(s p−r +1,q+r −1 ) for all
0 < r ≤ p. Notice that GK p,q is a subspace of S p,q since both d and δ̄ are lin-
ear. This is depicted in Fig. 4. There are (non-canonical) isomorphisms,

  
Ker dnTot ∼= GK p,q . (6)
p+q=n

It turns out that (6) only holds when we are working with vector spaces. Later, we
work with a more general case where we have to solve nontrivial extension problems.
By (6), recovering all GK p,q leads to the kernel of d∗Tot . However, computing
GK p,q still requires a large set of equations to be checked. A step-by-step way of
computing these is by adding one equation at a time. For this, we define the subspaces
GZrp,q ⊆ S p,q where we add the first r equations progressively. That is, we start
setting GZ0p,q = S p,q . Then we define GZ1p,q to be elements s p,q ∈ S p,q such that
d(s p,q ) = 0, or equivalently GZ1p,q = Ker(d) p,q . For r ≥ 2, we define GZrp,q to be
formed by elements s p,q ∈ Ker(d) p,q such that there exists a sequence s p−k,q+k ∈
S p−k,q+k with d(s p−k,q+k ) = −δ̄(s p−k+1,q+k−1 ) for all 1 ≤ k < r . Altogether,

p+1 p
GK p,q = GZ p,q ⊆ GZ p,q ⊆ . . . ⊆ GZ0p,q = S p,q

123
602 Discrete & Computational Geometry (2023) 70:580–619

GK0,3
β0 β0 GZ02,1 GZ12,1
d d GK1,2 = S2,1 = Ker(d)2,1
0 0 0 β1 α1 β2 α1 β2
δ̄ δ̄
d d
0 0 0 0 α2 0
δ̄ δ̄
d d
0 0 0 0 α0 β1
δ̄ δ̄ GZ22,1

0 β2

β0 β0 β0 0
d d
GZ32,1
0 β1 0 β1 0 β1
δ̄ δ̄ = GK2,1
d GK2,1 d
0 β2 0 β2 0 β2
δ̄ δ̄
d d GK3,0
0 0 0 β3 0
δ̄ δ̄

Fig. 4 On the left, in cyan the four direct summands of Ker(d Tot )4 . The corresponding GKr ,3−r are framed
to indicate that these are subspaces of Sr ,3−r for all 0 ≤ r ≤ 3. On the right, in orange the subspaces GZr2,1 ,
eventually shrinking to GK 2,1 . For convenience, we denote α2 = d(β2 ), α1 = δ̄(β2 ) and α0 = δ̄(β1 )

for all p, q ≥ 0. For intuition see Fig. 4, and also Fig. 6 for a depiction of GZ23,1
on a lattice. We also write GZrp,q = Ker(d) ∩ (δ̄ −1 ◦ d)r −1 (S p−r +1,q+r −1 ) for all
r ≥ 1, where by (δ̄ −1 ◦ d)r we denote composing r times δ̄ −1 ◦ d. In particular, since
GZrp,q = GZ p,q for all r ≥ p + 1, we use the convention GZ∞
p+1 p+1
p,q := GZ p,q = GK p,q .
Now, we explain the notation GK p,q and the isomorphism (6). We start defining a
vertical filtration F ∗ on S∗,∗ by the following subcomplexes for all r ≥ 0:

S p,q whenever p ≤ r ,
F r (S∗,∗ ) p,q :=
0 otherwise.

This is an increasing filtration, i.e., F r (S∗,∗ ) ⊆ F r +1 (S∗,∗ ) for all r ≥ 0. Additionally,


there are isomorphisms F r (S∗,∗ )/F r −1 (S∗,∗ ) ∼ = Sr ,∗ for all r ≥ 0. See Fig. 5 for
a depiction of F ∗ . Also, F ∗ respects the differentials from S∗,∗ in the sense that
d(F r (S∗,∗ )) ⊆ F r (S∗,∗ ) and δ̄(F r (S∗,∗ )) ⊆ F r (S∗,∗ ) for all r ≥ 0. Consequently,
F ∗ filters the total complex S∗Tot , respecting its differential d Tot . That is, SnTot is filtered
p≤r
by subcomplexes, F r SnTot := p+q=n S p,q , for all r ≥ 0.
In particular, Ker(d ) also inherits the filtration F ∗ , where F r Ker(d Tot )n =
Tot

F r SnTot ∩ Ker(d Tot )n for all r ≥ 0. We define the associated modules of Ker(d Tot )n
to be the quotients G p Ker(d Tot )n = F p Ker(d Tot )n /F p−1 Ker(d Tot )n , which can be
checked to be isomorphic with GK p,q for all p + q = n. This follows by considering

123
Discrete & Computational Geometry (2023) 70:580–619 603

F 2 (S∗,∗ ) Sp, ∗

δ̄
S∗,∗
d
F 3 (S ∗,∗ )

Fig. 5 Note that F 3 (S∗,∗ )/F 2 (S∗,∗ ) ∼


= S3,∗ . Also notice that δ̄ and d respect the vertical filtration F ∗

morphisms

 
G p Ker d Tot n GK p,q ,
(7)
[(s0,n , s1,n−1 , . . . , s p,q , 0, . . . , 0)] s p,q ,

which are well defined since s p,q does not change for representatives of the same class.
In fact, this morphism is injective since two classes with the same image will be equal
by definition of G p Ker(d Tot )n . On the other hand, the definition of GK p,q ensures
surjectivity. In particular, as we are working with vector spaces, we have that
     
Ker dnTot ∼= G p Ker d Tot n ∼
= GK p,q ,
p+q=n p+q=n

which justifies isomorphism (6).


Next, we explain the notation GZrp,q . We introduce the objects

Z rp,q := z ∈ F p S Tot
p+q : d
Tot
(z) ∈ F p−r S Tot
p+q−1

for all r ≥ 0. We can think of these as kernels of d Tot up to some previous fil-
tration. Then, by definition, we have that Z 0p,q = F p S Tot
p+1 ∞
p+q and Z p,q = Z p,q =
F Ker(d p+q ). Using a morphism analogous to (7), one can check that the quotients
p Tot
+1 /Z r r +1
Z rp,q p−1,q+1 are isomorphic to GZ p,q for all p + q = n. This is depicted in Fig. 6.
Thus, computing these quotients increasing r ≥ 0 leads to Ker(d Tot ). With a little
more work, we can do the same for computing the homology.
The Mayer–Vietoris spectral sequence leads to Hn (S∗Tot ) after a series of small,
computable steps. This is done similarly as we did before for computing Ker(d Tot ).
However, in this case we need to take quotients by the images of d Tot . First, notice that
the vertical filtration F ∗ transfers to homology Hn (S∗Tot ) by the inclusions F p S∗Tot ⊆
S∗Tot for all p ≥ 0. That is, we have filtered sets

      
F p Hn S∗Tot := Im Hn F p S∗Tot −→ Hn S∗Tot ,

123
604 Discrete & Computational Geometry (2023) 70:580–619

1
Z2,2
S∗,∗ S∗,∗
2
Z3,1

F 2 S3Tot d

δ̄
d

δ̄
d GZ23,1 ∼ 2 /Z 1
= Z3,1 2,2

δ̄
d

2 and Z 1 . On the right their respective quotient GZ2


Fig. 6 On the left the sets Z 3,1 2,2 3,1

which induce a filtration on Hn (S∗Tot ). For this filtration, the associated modules are
defined by the quotients G r Hn (S∗Tot ) = F r Hn (S∗Tot )/F r −1 Hn (S∗Tot ) for all r ≥ 0. As
we are working over a field, we recover the homology by taking direct sums:

  
n
 
Hn S∗Tot ∼= G r Hn S∗Tot . (8)
r =0

Previously, we defined the sets Z rp,q which are kernels “up to filtration”. In an analo-
gous way, we define boundaries “up to filtration” by setting3

B rp,q := d Tot (c) ∈ F p S Tot


p+q : c ∈ F S p+q+1
p+r Tot

for all r ≥ 0 and p, q ≥ 0. These are images of d Tot coming from a previous filtration.
In particular, the equalities d Tot (Z rp,q ) = B rp−r ,q+r −1 and d Tot (B rp,q ) = 0 hold.
Additionally, for all p, q ≥ 0, there is a sequence of inclusions,

B 0p,q ⊂ B 1p,q ⊂ . . . ⊂ B p,q = B ∞ ∞


q+1 p
p,q ⊂ Z p,q = Z p,q ⊂ . . . ⊂ Z p,q ⊂ Z p,q .
1 0

Hence, we define the first page of the spectral sequence as the quotient

Z 1p,q GZ1p,q
E 1p,q := ∼
= ,
Z 0p−1,q+1 + B 0p,q Im(B 0p,q → GZ1p,q )

for all p, q ≥ 0. Recall that Ker(d) p,q = GZ1p,q = Z 1p,q /Z 0p−1,q+1 and also one
can see that Im (B 0p,q → GZ1p,q ) is isomorphic to Im(d) p,q . Then we deduce that
E 1p,q ∼
= Hq (S p,∗ , d). On this page d Tot induces differentials d 1 : E 1p,q → E 1p−1,q .

3 Here we have adopted the definition of Z r r


p,q and B p,q that one can find in [20]. Other sources such as
[3] and [19] use the same notation for other terms.

123
Discrete & Computational Geometry (2023) 70:580–619 605

GZ32,1 IB02,1

GZ22,1 IB12,1 2
E1,2

GZ12,1 IB22,1 2
E3,1

IB02,1 ⊂ IB12,1 ⊂ IB22,1 ⊂ GZ32,1 ⊂ GZ22,1 ⊂ GZ12,1

r → GZr +1 ), for all 0 ≤ r ≤ 2.


Fig. 7 On the left, the different subspaces on S2,1 . Here IBr2,1 = Im (B2,1 2,1
The framed region represents S2,1 . Brighter colours represent bigger regions than darker colours. Note that
∗ ∗
blue and orange colours have been assigned to GZ2,1 and IB2,1 respectively. On the right, the morphism
2 → E 2 on the second page. The two framed regions represent the codomain and domain of d 2 ,
d 2 : E 3,1 1,2
these have been assigned brighter and darker colours, respectively

I.e., noticing that d Tot (Z 1p,q ) = B 1p−1,q ⊂ Z 1p−1,q and also d Tot (Z 0p−1,q+1 + B 0p,q )
= d Tot (Z 0p−1,q+1 ) + 0 = B 0p−1,q we have that d 1 : E 1p,q → E 1p−1,q is well defined.
Next, we compute the second page. First, notice that

Z 2p,q Z 2p,q B 1p,q


Ker(d 1 ) = = , Im(d 1 ) = .
Z 2p,q ∩ (Z 0p−1,q+1 + B 0p,q ) Z 1p−1,q+1 + B 0p,q B 0p,q

Since d Tot ◦ d Tot = 0 we also have d 1 ◦ d 1 = 0 and, in particular, Im(d 1 ) ⊆ Ker(d 1 ).


Then, the second page terms are given by

Ker(d 1 ) Z 2p,q
E 2p,q := H p,q (E ∗,∗
1
, d 1) = = .
Im(d 1 ) Z 1p−1,q+1 + B 1p,q

The second page has differential d 2 induced by the total complex differential d Tot .
Figure 7 illustrates this principle. Doing the same for all pages we obtain the definition
of the r -page terms

r −1 r −1
Z rp,q
E rp,q := H p,q (E ∗,∗ ,d ) = −1 −1
,
Z rp−1,q+1 + B rp,q

for all r ≥ 2. Of course, we can express alternatively the r -page terms as

GZrp,q
E rp,q := −1
.
Im (B rp,q → GZrp,q )

123
606 Discrete & Computational Geometry (2023) 70:580–619

Thus, the ∞-page is

Z∞ GK p,q
E∞
p,q ∼
p,q = = .
Z∞
p−1,q+1 + B∞
p,q Im (B ∞
p,q→ GK p,q )

Then, for n = p + q, one has that G p Hn (S∗Tot ) is isomorphic to

F p Hn (S∗Tot ) Im (Hn (F p S∗Tot ) → Hn (S∗Tot ))


=
F p−1 Hn (S∗Tot ) Im (Hn (F p−1 S∗Tot ) → Hn (S∗Tot ))
Z∞ ∞
p,q /B p,q
∼ ∼
= E∞
= ∞ p,q ,
Z p−1,q+1 /B ∞p−1,q+1

since B ∞ ∞
p−1,q+1 ⊆ B p,q . Therefore, computing the spectral sequence is a way of
approximating the associated module G p Hn (S∗Tot ). By (8), adding up these leads to
Hn (S∗Tot ). Also, since E ∞ ∼ p ∗
p,q = G Hn (S∗ ), we say that E p,q converges to Hn (S )
Tot Tot

and we denote this as E ∗p,q ⇒ Hn (S Tot ). As the rows from S p,q are exact, the following
result is standard; see for example [3, Prop. 8.8] for a similar proof.

Proposition 4.1 Hn (S∗Tot ) ∼


= Hn (K ).
q≤r
Proof Consider the horizontal filtration FH by FHr SnTot := p+q=n S p,q . This whole
section can be adapted to this filtration and one obtains a corresponding spectral
sequence H E ∗p,q ⇒ Hn (S Tot ). As the Čech sequence (3) is exact, H E ∗p,q collapses on
1
the first page, where the only nontrivial terms are H E 0,n Hn (K ) for all n ≥ 0. This
implies the claim. See [20, Thm. 2.15] for a general result on the spectral sequence of
a double complex. 


Therefore, using Proposition 4.1, we have that E ∗p,q ⇒ Hn (K ). In particular, since we


are working with vector spaces, Hn (K ) ∼= ∞
p+q=n E p,q for all n ≥ 0. Throughout
the following section, we adapt these results to the category of persistence modules.

5 Persistent Mayer–Vietoris

One can translate the method from Sect. 4 to PMod. The reason for this is that PMod is
an abelian category, since Vect is an abelian category and R is a small category. The
theory of spectral sequences can be developed for arbitrary abelian categories. For an
introduction to this, see [30, Chap. 5].
Consider a filtered simplicial complex, K , together with a cover of K by filtered
subcomplexes, U = {Ui }i∈I , so that K = i∈I Ui . Then, the persistence Mayer–
Vietoris spectral sequence is given by

E 1p,q = PHq (Uσ ) ⇒ PHn (K ),
σ ∈Δmp

123
Discrete & Computational Geometry (2023) 70:580–619 607

U V U V U V

r =0 r = 0.5 r = 0.6
Fig. 8 As the radius increases, more edges are added. At radius r = 0.5 a circle will be across the two
covers U and V . Later on, at radius r = 0.6 this circle will be split into two


E1,0

E0,1

PH1 (K)

0.5 0.6 1.0 r

Fig. 9 Barcode on associated module

where p + q = n. However, unlike the case of vector spaces, we might have that
∞ ∞ ∼
p+q=n E p,q  PHn (K ). All that we know is that E p,q = G PH p+q (K ) for all
p

p, q ≥ 0. In the literature, recovering PHn (K ) from the terms G PH p+q (K ) is known


p

as the extension problem, which we solve in Sect. 5.1. Furthermore, we even obtain
more information; as pointed out in [31], the knowledge of which subset J ⊂ I
detects a feature from PHn (K ) can potentially add insight into the information given
by ordinary persistent homology. The following example illustrates this.
Example 5.1 Consider the case of a point cloud X covered by two open sets as in
∞ )r associated
Fig. 8. From Sects. 3 and 4, we know how to compute the ∞-page (E ∗,∗
to any value r ∈ R. In particular, when we take r = 0.5, then the combination of
U and V detects a 1-cycle. On the other hand, when r = 0.6 this cycle splits into
two smaller cycles which are detected by U and V individually. Notice that if we
want to come up with a persistent Mayer–Vietoris method then we need to be able to
track this behaviour. That is, we need to know how cycles develop as r increases. In
∞ ∼ F
particular, the barcode from PH1 (X ) is broken down into E 1,0 = [0.5,0.6) and also
∞ ∼
E 0,1 = F[0.6,1.0) ⊕ F[0.6,1.0) , see Fig. 9.
Figure 10 illustrates a filtered complex covered by three regions where, as in Exam-
ple 5.1, there is a nontrivial extension problem.

5.1 The Extension Problem

Recall the definition of the total complex, vertical filtrations and associated mod-
ules from Sect. 4. Through this section we study the extension problem, that is, we
recover Hn (S∗Tot ) from the associated modules G p (Hn (S∗Tot )). Also, we assume that

123
608 Discrete & Computational Geometry (2023) 70:580–619

r =0 r ∼ 0.208 r = 0.5

Fig. 10 A one loop is detected at value r ∼ 0.208 which goes through three covers. Later, at radius r = 0.5,
this loop splits into three loops, each included in one of the three covers

the spectral sequence collapses after a finite number of pages. Consider the persistence
module

V = V(n) := Hn (S∗Tot ),

together with the corresponding filtration

0 = F −1 V ⊂ F 0 V ⊂ . . . ⊂ F n V = V.

We define the associated modules of (V, F ∗ ) as the quotients Gk = F k V/F k−1 V for
all 0 ≤ k ≤ n. This gives rise to short exact sequences,

ι pk
0 F k−1 V FkV Gk 0, (9)

for all 0 ≤ k ≤ n. On the sequences (9) we consider successive extension problems


where we know the first and last term and want to know the middle term for 0 ≤
k ≤ n. This leads to V; however, in this work we obtain directly a persistence module
isomorphic to V (see Proposition 5.2). Adding up all associated modules we obtain
n
a persistence module G := i=0 Gi with an additional filtration given by F k G =
k k ∼ ∞
i=0 G for all 0 ≤ k ≤ n. Since G = E k,n−k for all 0 ≤ k ≤ n, a spectral sequence
i

algorithm leads to a barcode basis for G. We formulate the extension problem as


computing a basis for V from the obtained basis G of G.
To start, notice that for each r ∈ R the sequence (9) splits, leading to morphisms

F k (r ) : Gk (r ) → F k V(r ), (10)

such that p k (r ) ◦ F k (r ) = IdGk (r ) for all 0 ≤ k ≤ n. In particular, F k (r ) is injective


for all 0 ≤ k ≤ n. On the other hand, for any class [βk ]∞ ∞
k,n−k of E k,n−k (r ) with
representative βk ∈ E k,n−k (r ), since βk ∈ GK k,n−k (r ), we have that d(βk ) = 0
0

and there exists a sequence of βi ∈ Si,n−i (r ) such that d(βi ) = −δ̄(βi+1 ) for all

123
Discrete & Computational Geometry (2023) 70:580–619 609

0 ≤ i < k. The choice of this sequence determines F k (r ), so that


 
F k (r ) [βk (r )]∞
k,n−k = [(β0 (r ), β1 (r ), . . . , βk (r ), 0, . . . , 0)]n ,
Tot

where [ · ]Tot
n denotes the n-homology class of the total complex. Notice that if we
already computed G from the Mayer–Vietoris spectral sequence, then there is no need
to do any extra computations to obtain these morphisms F k (r ). All we need to do is
to store our previous results. Adding over all 0 ≤ k ≤ n we obtain the isomorphism
F(r ) = nk=0 F k (r ) : n
k=0 G (r ) → V(r ). This last morphism is an isomorphism
k

since all its summands are injective, their images have mutual trivial intersection, and
the dimensions of the domain and codomain coincide.
Recall that G has induced morphisms G(r ≤ s) from V(r ≤ s) for all values r ≤ s
in R. Given a basis G for G, we would like to compute a basis B for V from this
information. Notice that this is not a straightforward problem since (10) does not imply
that one has an isomorphism F : G → V. A point to start is to define the image along
each generator in G. That is, for each barcode generator gi ∼ [ai , bi ) in G, we choose an
image at the start F(ai )(gi (ai )). Then, we set F(r )(gi (r )) := V(ai < r )◦F(ai )(gi (ai ))
for all ai < r < bi . This leads to commutativity of F along each generator gi .
Nevertheless this is still far from even defining a morphism F : G → V.
The solution to the problem above is to define a new persistence module G.  We
 |G |
define G(s) := G(s) for all s ∈ R. Then, if G = {gi }i=1 is a barcode basis for G, by
 for all s ∈ R. Now, given gi ∼ [ai , bi )
Proposition 3.2, G s (1F ) will be a basis of G(s)
a generator in G, we define the morphism G(r  ≤ s) by the recursive formula
⎧ |G |

⎪

⎪  i ≤ s)(g j (bi )(1F )) if r ∈ [ai , bi ), bi ≤ s,
ci, j G(b

 ≤ s)(gi (r )(1F )) :=
G(r j=1

⎪ gi (s)(1F ) if r , s ∈ [ai , bi ),



0 otherwise,

 is
where ci, j ∈ F for all 1 ≤ i, j ≤ |G|. We want to define ci, j in such a way that G
isomorphic to V. For this we impose the commutativity condition

 i ≤ bi )(gi (ai )(1F )) = F(bi )−1 ◦ V(ai ≤ bi ) ◦ F(ai )(gi (ai )(1F )),
G(a

which leads to the equation

|G |

ci, j g j (bi )(1F ) = F(bi )−1 ◦ V(ai ≤ bi ) ◦ F(ai )(gi (ai )(1F )). (11)
j=1

This determines uniquely the coefficients ci, j for all 1 ≤ i, j ≤ |G|. Notice that
 respects the filtration on V, since the right hand side in (11) is a composition of
G
 then ci, j = 0 for
filtration preserving morphisms. In particular, if gi ∈ PVect (F k G),
all 1 ≤ j ≤ |G| such that g j ∈
/ PVect (F k G).

123
610 Discrete & Computational Geometry (2023) 70:580–619

Fix a generator gi ∈ G such that gi ∈ PVect(Gk ) and such that gi ∼ [ai , bi ).


Let us calculate the coefficients ci, j . Suppose that we have a representative  gj =
j j j j ∞
(β0 , β1 , . . . , βk , 0, . . . , 0) ∈ Sn for each generator g j ∈ G, with g j = [βk ]k,n−k .
Tot

Also, for all 0 ≤ q ≤ n we define the subset I q ⊆ {1, . . . , |G|} of indices 1 ≤ j ≤ |G|
such that g j ∈ PVect(Gq ). Then, the coefficients ci, j for j ∈ I k \ {i} are determined
by the following equality in Gk (bi ) (where we use p k from the sequence (9))

  
p k (bi ) [
gi (bi )(1F )]Tot
n = ci, j g j (bi )(1F ).
j∈I k \{i}

Thus, we have
⎛⎡ ⎤Tot ⎞
⎜  ⎟
p k (bi ) ⎝⎣
gi (bi )(1F ) − g j (bi )(1F )⎦ ⎠ = 0.
ci, j 
j∈I k \{i} n

Hence, by (9), there must exist some chain Γ ∈ Sn+1


Tot (b ) such that
i



gi (bi )(1F ) − ci, j 
g j (bi )(1F ) − d Tot Γ (12)
j∈I k \{i}

is contained in F k−1 SnTot (bi ). How do we compute Γ ? We start by searching for the
first page r ≥ 2 such that
⎡ ⎤r

⎣βki (bi )(1F ) − ci, j βk (bi )(1F )⎦
j
= 0, (13)
j∈I k \{i} k,n−k

where [ · ]rk,n−k denotes the class in the r -page in position (k, n − k). Notice that this
r must exist since we assumed that (13) vanishes at the ∞-page. Consequently, there
r −1
exists Γk+r −1 ∈ E k+r −1,n−k−r +2 (bi ) such that

⎡ ⎤r −1

⎣βki (bi )(1F ) − ci, j βk (bi )(1F )⎦
j
− d r −1 (Γk+r −1 ) = 0
j∈I k \{i} k,n−k

r −1
on E k,n−k (bi ). Repeating for all pages leads to Γk+t ∈ E k+t,n−k−t+1
t (bi ) for all
0 ≤ t ≤ r − 1, such that

 r −1

d t
j
βki (bi )(1F ) − ci, j βk (bi )(1F ) − (Γk+t ) = 0, (14)
j∈I k \{i} t=0

123
Discrete & Computational Geometry (2023) 70:580–619 611

where d t(Γk+t ) ∈ Sk,n−k (bi ) is a representative for the class d t (Γk+t ) ∈ E k,n−k
t (bi ).
Notice that (14) holds independently of the representatives, since if we changed some
term, then the other representatives would adjust to the change. In particular, we have
that the k component of (12) vanishes, whereas the k − 1 component will be equal to
 j
βk−1
i
(bi )(1F ) − ci, j βk−1 (bi )(1F ) − δ̄(Γk ).
j∈I k \{i}

Next we proceed to find coefficients ci, j ∈ F so that in Gk−1 (bi ) we get the equality
⎡ ⎤∞

⎣βk−1
i
(bi )(1F ) − ci, j βk−1 (bi )(1F ) − δ̄(Γk )⎦
j

j∈I k \{i} k−1,n−k+1



= ci, j g j (bi )(1F ).
j∈I k−1

Then we proceed as we did on Gk . Doing this for all parameters 0 ≤ r ≤ k, there are
coefficients ci, j ∈ F, and an element Γ ∈ Sn+1
Tot (b ) so that
i

 
gi (bi )(1F ) =
 g j (bi )(1F ) + d Tot Γ.
ci, j 
0≤r ≤k j∈I r

Thus, recalling that ci, j = 0 for all g j ∈


/ PVect (F k G), we have

 i ≤ bi )(gi (ai )(1F )) =
G(a ci, j g j (bi )(1F ).
0≤ j≤|G |

∼
Proposition 5.2 G = V.

Proof Since each F(s) is an isomorphism, and also we have commutative squares:

 ≤s)
G(r
 )
G(r 
G(s)
F (r ) F (s)

V(r ) V(s)
V(r ≤s)

for all r ≤ s, then F must be an isomorphism of persistence modules. 




This gives G= ∼ V, but we still need to compute a barcode basis. In fact, this can be
done by considering a quotient. Define A gi ∈G F[ai ,∞) where gi ∼ [ai , bi ) for
all gi ∈ G; here the A = {αi }1≤i≤|G | denotes the canonical base for A. Consider the
coefficients ci, j for 1 ≤ i, j ≤ |G| from the construction of G and define the sets of

123
612 Discrete & Computational Geometry (2023) 70:580–619

indices Si = {1 ≤ j ≤ |G| : ci, j = 0} for all 1 ≤ i ≤ |G|. We consider a submodule


B ⊆ A such that PVect(B) is generated by
 #
1bi (−αi )  S c
j∈
i, j α j .
i 1≤i≤|G |

Also, notice that B 


1≤i≤|G | F[bi ,∞) and that, by construction, G A/B. Now,
we pick up the canonical base for B and consider the inclusion ι : B→A; this will
lead to an associated matrix (ι(B))A . Thus, we send ((B | A), A, ((ι(B))A | Id|A| )) to
box_gauss_reduce and obtain a basis for the quotient A/B; i.e., a basis for G. 

5.2 P ER M AV ISS

Here we outline a procedure for implementing the persistence Mayer–Vietoris spectral


sequence. In Sect. 5.1 we worked with GZrp,q and IBrp,q , which is very intuitive from
a mathematical perspective. However, it is more efficient to work directly with the sets
Z rp,q and B rp,q . By storing representatives in Z rp,q , we avoid repeating computations
on each page and in the extension problem. Furthermore, this approach allows to
easily track the complexity of the algorithm. For compactness, we work with barcode
bases through this section. Before we study the algorithm, we make some notational
remarks. Given a spectral sequence term E rp,q , we denote by E rp,q its barcode basis.
Additionally, for a generator α ∈ E rp,q , we denote by  α its representative in S Tot
p+q and,
r −1
if r > 0, by α its representative in E p,q . Also, we denote by [ · ] p,q a class in E rp,q .
r

Thus, given α ∈ E rp,q we have that α = [α]rp,q = [ α ]rp,q .


0-Page We start by defining the 0-page as the quotient

F p S Tot 
E 0p,q =
p+q ∼
= S p,q = Sq (Uσ )
F p−1 S Tot
p+q σ ∈N pU

for all pairs of integers p, q ≥ 0. The 0 differential, d 0 , is isomorphic to the standard


chain differential

d 0p,q ∼
= dq : S p,q → S p,q−1 .

In particular, for each simplex σ ∈ NqU , the morphism d 0p,q restricts to a local differ-
ential

dqσ : Sq (Uσ ) → Sq−1 (Uσ ).

Thus, we can compute a basis, Eσ,q1 , for the persistent homology PH (U ) as well as a
q σ
basis for the image Im(dq+1σ ). Putting all of these together, we get a basis for E 1 as
p,q

1 =
the union E p,q U E 1 . Further, for each generator α ∈ E 1 ⊆ PVect(E 1 ),
σ ∈N p σ,q p,q p,q

123
Discrete & Computational Geometry (2023) 70:580–619 613

we store a chain α p ∈ S p,q so that α = [(0, . . . , 0, α p , 0, . . . , 0)]1p,q . We define Ep,q


1

to be the set of representatives in S Tot


p+q given by (0, . . . , 0, α0 , 0, . . . , 0).
1-Page Recall that the first page elements are given as classes in the quotient

Z 1p,q
E 1p,q = .
Z 0p−1,q+1 + B 0p,q

Therefore, for each generator α ∈ E p,q 1 , with α ∼ [a , b ), there is a chain α ∈ S


α α p p,q ,
so that α = [(0, . . . , 0, α p , 0, . . . , 0)] p,q . Then we compute
0

$ %1
d 1p,q (α) = d Tot (0, . . . , 0, α p , 0, . . . , 0) p−1,q = [(0, . . . , 0, δ̄ p (α p ), 0, . . . , 0)]1p−1,q .

U , we consider the local chain (δ̄ (α )) ∈ S (U )


Now, for each simplex τ ∈ N p−1 p p τ q τ
1 ⊆ (E 1 )aα together with c1 ∈ F for all β ∈ J 1
and proceed to compute a subset Jτ,q τ,q β τ,q
and an element Γτ ∈ PVect (Sq+1 (Uτ )) such that
⎛ ⎞
 
δ̄ p (α p ) τ = dq+1 (Γτ )  1aα ⎝ 
J
β∈
c β ⎠,
1
1
β
τ,q

where recall that β denotes a representative in Sq (Uτ ) ⊆ E 0p−1,q . This computation


can be performed by using Algorithm 1. That is, we consider (Eτ,q 1 )aα and a basis

for Im((dq+1 )τ ) and write (δ̄ p (α p ))τ in terms of these; the preimages lead to Γτ .
U , we get a subset J 1
Repeating this for all τ ∈ N p−1 p−1,q ⊆ (E p−1,q ) together with
1 aα

coefficients cβ ∈ F\{0} for all β ∈ J p−1,q and an element Γ p−1 ∈ PVect(E 0p−1,q+1 )
1 1

so that
⎛ ⎞
⎜ ⎟
δ̄ p (α p ) = d 0p−1,q+1 (Γ p−1 )  1aα ⎝
β∈

J 1
cβ1 β ⎠.
p−1,q

 
This leads to d 1p,q (α) = 1aα β∈J p−1,q
1 cβ1 β . Repeating this procedure for all gener-
ators α ∈ E p,q
1 leads to an associated matrix D 1 for d 1 . Using image_kernel,
p,q p,q
we compute bases for the kernel and image, together with the corresponding preim-
ages. Next, we compute a base E p,q 2 for the second page term E 2p,q by applying
box_gauss_reduce to compute the quotient Ker(d 1p,q )/Im(d 1p+1,q ). This also
leads to first page representatives α ∈ PVect(E 1p,q ) for all α ∈ E p,q 2 . Finally,

for each generator α ∈ E p,q , we proceed to find a good representative by using


2

Ep,q
1 together with the computed coordinates of α in terms of E p,q 1 . This leads to

α = (0, . . . , 0, α p , 0, . . . , 0) ∈ S p+q , so that α = [


 Tot α ] p,q . Since by hypotheses
2

[α ]1p,q ∈ Ker(d 1p,q ), we might find α p−1 ∈ E 0p−1,q+1 such that d 0p−1,q (α p−1 ) =

123
614 Discrete & Computational Geometry (2023) 70:580–619

−δ(α p ). Altogether, we set  α ← (0, . . . , 0, α p−1 , α p , 0, . . . , 0) and store it in Ep,q


2 ;

notice that d Tot (


α ) ∈ F p−2 S Tot .
k-Page Suppose that we have computed generators E p,q k ⊆ PVect(E k−1 p,q ), together
with total complex representatives Ep,q for some k ≥ 3. Assume also that d Tot (
k α) ∈
F p−k (S ) for all 
Tot 
α ∈ E p,q . Notice that if k ≥ p + 1, then d p,q = 0. Thus, we
k k

focus on the case that k < p + 1. Let α ∈ E p,q k with α ∼ [a , b ) together with a
α α
α ∈ Ep,q
representative  k with  α = (0, . . . , 0, α p−k+1 , . . . , α p , 0, . . . , 0) so that

$ %k
d k (α) = d Tot (
α ) p−k,q+k−1 = [(0, . . . , 0, δ̄ p−k+1 (α p−k+1 ), 0, . . . , 0)]kp−k,q+k−1 .

We proceed by ‘lifting’ d Tot (


α ) to the k-page. As before, using Algorithm 1 in parallel,
we obtain a subset J p−k,q+k−1
1 ⊆ E p−k,q+k−1
1 together with coefficients cβ1 ∈ F\{0}
for all β ∈ T p−k,q+k−1 ⊆ S p+q−1 giving us an expression in terms of the first
1 Tot
 
α )]1p−k,q+k−1 = 1aα β∈T 1
page basis [d Tot ( cβ1 β . Next, using Algorithm 1,
p−k,q+k−1
we compute a subset J p−k,q+k−1
r ⊆ (E rp−k,q+k−1 )aα together with coefficients cβr ∈
−1
F\{0} for all β ∈ J p−k,q+k−1
r , and Γ p−k+r −1 ∈ PVect (E rp−k+r −1,q+k−r +1 ) such that

⎛ ⎞ ⎛ ⎞
⎜ ⎟
1aα ⎝
J
β∈
 r −1
cβr −1 β ⎠ = d rp−k+r
−1
−1,q+k−r +1 (Γ p−k+r −1 )  1aα

β∈J
r
cβr β ⎠.
p−k,q+k−1 p−k,q+k−1

Thus, we deduce an expression of [d Tot α ]rp−k,q+k−1 in terms of E rp−k,q+k−1 . In par-


ticular, this holds for r = k, which leads to the associated matrix of d kp,q , and then we
can compute image_kernel to obtain bases for images and kernels of d kp,q . Also,
using box_gauss_reduce we obtain basis, E p,q k+1 , for the terms E k+1 . As a byprod-
p,q
uct of this computation, for each α ∈ E p,q we obtain a previous page representative
k+1

α ∈ PVect(E kp,q ), so that [α]k+1


p,q = α. Now, consider α ∈ E p,q with α ∼ [aκ , bκ ) and
k+1

proceed to obtain a ‘good’ total complex representative. There exists T (α) ⊆ E p,q k
$  %
together with coefficients cβα for all β ∈ T (α) such that α = 1aα β∈T (α) cβα β p,q .
k+1
 
Then we define  α = 1aα β∈T (α) cα β  , and notice that α = [ α ]k+1
p,q as well as
β
d Tot (
α ) ∈ F p−k (S Tot ). We denote 
α = (0, . . . , 0, α p−k+1 , . . . , α p , 0, . . . , 0). Now,
by hypotheses
$ %k
d kp,q (α) = d Tot (
α ) p−k,q+k−1
= [(0, . . . , 0, δ̄ p−k+1 (α p−k+1 ), 0, . . . , 0)]kp−k,q+k−1 = 0.

Then, there exist γ p−1 ∈ PVect(E k−1


p−1,q+1 ) with

p−1 $ %k−1
d p−1,q+1 (γ p−1 ) = d Tot (
α ) p−k,q+k−1 .

123
Discrete & Computational Geometry (2023) 70:580–619 615

k−1
By writing γ p−1 in terms of E p−1,q+1 and using their stored representatives, we may
get 
γ p−1 ∈ S p+q such that γ p−1 = [
Tot γ p−1 ]k−1 and also d Tot (
γ p−1 ) ∈ F p−k S Tot
p+q−1 .
In particular,
$ Tot %k−1
d ( α  (−
γk−1 )) =0

and we set  α ←  α  (− γk−1 ). Hence, by induction, we can repeat this proce-
dure for all 1 ≤ r ≤ k. Eventually, we should obtain a representative  α =
(0, . . . , 0, α p−k , . . . , α p , 0, . . . , 0) such that d Tot (
α ) ∈ F p−k−1 S Tot
p+q−1 . We denote
the new set of representatives as Ep,q k+1 .

5.3 Extension Problem

After computing all pages of the spectral sequence, we still have to solve the exten-
sion problem. Recall that a solution was given in Sect. 5.1; here we only give some
algorithmic guidelines. We start from a basis E p,q ∞ , with total complex representa-

 ∞
tives E p,q . Since we assume that the spectral sequence is bounded, it collapses at an
L > 0 page. Then, for each generator α ∈ E p,q L , with α ∼ [a , b ), we have a corre-
α α
sponding representative  
α ∈ E p,q . Consider 
L γ ← 1bα (
α ); we perform changes to  γ
similarly as in Sect. 5.1. We start by computing the classes [ γ ]rp,q for all 1 ≤ r ≤ L.
We do this by using Algorithm 1 in parallel, as done on the 1-page. This leads to a
1 ⊆ (E 1 )bα together with coefficients c1 ∈ F\{0} for all β ∈ J 1 and
subset J p,q β
p,q
  p,q
γ ]0p,q = dq+1 (Γ p )  1bα β∈J p,q
Γ p ∈ S p,q+1 , so that [ 1 cβ β . The same happens for
1

r ⊆ (E r )bα together with coefficients cr ∈ F\{0}


all pages 2 ≤ r ≤ L; we find J p,q p,q β
r −1
for all β ∈ J p,q
r and Γ
p+r −1 ∈ PVect (E p+r −1,q−r +2 ), so that

⎛ ⎞
−1
γ ]r −1 ) = d rp+r
1bα ([ −1,q−r +2 (Γ p+r −1 )  1bα


J
β∈
c β ⎠.
r
p,q
r
β

Now, we change the total complex representative by an assignment


⎛ ⎞

γ  ⎝−
γ ←
 
E
β∈
c
L
L ⎠
ββ .
p,q

In particular, notice that [


γ ] Lp,q = 0. We might repeat this procedure for all integers
L − 1 ≥ r ≥ 0 so that one has that [ γ ]rp,q = 0. As a consequence the p-component
of 
γ vanishes, so γ ∈ F p−1 S Tot p+q . Then, as in Sect. 5.1, one can repeat this process
for all p − 1 ≥ r ≥ 0. This leads to all coefficients (cβL )β∈E L for all 0 ≤ r ≤ p.
p−r ,q+r
These can be used to define a basis for the submodule B from end of Sect. 5.1; this
solves the extension problem by computing the corresponding quotient.

123
616 Discrete & Computational Geometry (2023) 70:580–619

5.4 Complexity Analysis

Let Ds be the maximum simplex dimension in K , and let L be the number of pages.
Denote by dim(N U ) the dimension of the nerve. Let

X = max {# q − simplices in Uσ }.
q≥0
σ ∈N U

On the other hand, we define

H = max {|E p,q


1
| : E p,q
1
is a base for E 1p,q }.
p,q≥0

Assume P is the number of processors.


0-Page When computing the first page, all we need to do is calculate persistent
homology in parallel. Then, the complexity is
& '
|N U |
O(X 3 ).
P

This leads to generators for the first page.


1-Page For the first page, recall that we start from a generator α ∈ E p,q
1 with α ∼
U | linear equations. Notice that this can be done
[aα , bα ) and proceed to solve |N p−1
for all generators from E p,q simultaneously. This is because as the value aα changes,
1

we might select the columns associated to (E p,q1 )aα from the columns of E 1 which
p,q
does not affect the complexity of Algorithm 1. On the other hand, we need to execute
image_kernel on at most dim(N U ) · Ds elements on the first page. Notice that
for each of these, we first compute a basis for the images and kernels. Afterwards, we
perform the quotients using box_gauss_reduce which takes a complexity of at
most O(H 3 ). Also, we need to add the complexity of the Čech differential. An option
for computing this, is to compare simplices in different covers by their vertices; two
simplices are the same iff they share the same vertex set. This would take less than
O(|N U | Ds X 2 H ) operations. Thus the overall complexity becomes
& ' & '
|N U | dim(N U ) · Ds  
O(X H ) +
2
O(|N U |Ds X 2 H ) + O(H 3 ) .
P P

k-Page Now, we proceed for the complexity of the page k ≥ 2. This is the same as
for the 1 page, with the addition of Gaussian eliminations of higher pages. These take
at most O(H 3 ) time. Denoting by L the infinity page, we have the new term
& '
dim(N U ) · Ds
O(L H 3 )
P

123
Discrete & Computational Geometry (2023) 70:580–619 617

which added to the complexity of the 1-page gives


& ' & '
|N U | dim(N U ) · Ds  
O(X H ) +
2
O(|N U | Ds X 2 H ) + O(H 3 ) + O(L H 3 )
P P
& '
dim(N U ) · Ds  
= O(|N U | Ds X 2 H ) + O(L H 3 ) .
P

Extension problem If the spectral sequence collapses at L > 0, then the complexity
of extending all generators in E p,q
L is bounded by that of computing the L page about

Ds times.
Overall complexity Altogether, we have a complexity bounded by that of computing
the first page plus that of computing the L page L + Ds times. Here the L comes from
computing the L page L times and Ds from the extension problem. Thus, the overall
complexity is bounded by
& ' & '
|N U | dim(N U ) · Ds  
O(X 3 ) + (L + Ds ) O(|N U |Ds X 2 H ) + O(H 3 ) .
P P

Notice that in general Ds , L and dim(N U ) are much smaller than H and X . Thus, for
covers such that |N U |  X , and assuming we have enough processors, the complexity
can be simplified to the two dominating terms O(X 3 ) + O(H 3 ). Notice that this last
case is satisfied for those covers whose mutual intersections are generally smaller than
each cover. Also, in this case H is approximately of the order of nontrivial barcodes
over all the input complex. This shows that PerMaViss isolates simplicial data, while
only merging homological information. It is worth to notice that in general H , being
the number of nontrivial bars, is much smaller than the size of the whole simplicial
complex.

6 Conclusion

We started by developing linear algebra for persistence modules. In doing so, we intro-
duced bases of persistence modules, as well as associated matrices to morphisms. Also,
we presented Algorithm 2, which computes bases for the image and the kernel of a
persistence morphism between any pair of tame persistence modules. Then a general-
ization of traditional persistent homology was introduced in Sect. 3.5. This theory has
helped us to define and understand the Persistent Mayer–Vietoris spectral sequence.
Furthermore, we have provided specific guidelines for a distributed algorithm, with
a solution to the extension problem presented in Sect. 5.1. The PerMaViss method
presented in Sect. 5.2 isolates simplicial information to local matrices, while merging
only homological information between different covers. Thus, the complexity of this
method is dominated by the size of a local complex plus the order of barcodes over
all the data. A first implementation of these results can be found in [29]. Coding an
efficient implementation from the pseudo-code given in this paper, and benchmark-
ing its performance compared to other methods, will be a matter of future research.

123
618 Discrete & Computational Geometry (2023) 70:580–619

Another interesting direction of research is how to merge this method with existing
algorithms, such as those from [7, 8, 21, 26]. Especially it would be interesting to
explore the possible interactions of discrete Morse theory and this approach, see [11].
Additionally, it will be worth exploring, both theoretically and practically, which are
the most suitable covers for different applications. Finally, we would also like to study
the additional information given by the covering. This will add locality information
from persistent homology.
Acknowledgements The author would like to thank his supervisor Dr. Ulrich Pennig who suggested this
topic and has been very helpful and supportive in the development of these ideas. Also, the author would
like to express his gratitude to EPSRC for the Grant EP/N509449/1 support with Project Number 1941653,
without which the author would not have been able to write this work.

Data availability There is no data associated to this article.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included
in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your intended use is not permitted
by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
1. Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Han-
son, E., Motta, F., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent
homology. J. Mach. Learn. Res. 18, # 8 (2017)
2. Bauer, U., Kerber, M., Reininghaus, J.: Clear and compress: computing persistent homology in chunks.
In: Topological Methods in Data Analysis and Visualization III (Davis 2013), pp. 103–117. Springer,
Berlin (2014)
3. Bott, R., Tu, L.W.: Differential Forms in Algebraic Topology. Graduate Texts in Mathematics, vol. 82.
Springer, New York (1982)
4. Bredon, G.E.: Sheaf Theory. Graduate Texts in Mathematics, vol. 170. Springer, New York (1997)
5. Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46(2), 255–308 (2009)
6. Chazal, F., de Silva, V., Glisse, M., Oudot, S.: The Structure and Stability of Persistence Modules.
SpringerBriefs in Mathematics. Springer, Cham (2016)
7. Chen, C., Kerber, M.: Persistent homology computation with a twist. In: 27th European Workshop on
Computational Geometry (Morschach 2011), pp. 197–200 (2011). https://eurocg11.inf.ethz.ch/docs/
Booklet.pdf
8. Chen, C., Kerber, M.: An output-sensitive algorithm for persistent homology. Comput. Geom. 46(4),
435–447 (2013)
9. Chow, T.Y.: You could have invented spectral sequences. Not. Am. Math. Soc. 53(1), 15–19 (2006)
10. Cohen-Steiner, D., Edelsbrunner, H., Harer, J., Morozov, D.: Persistent homology for kernels, images,
and cokernels. In: 20th Annual ACM-SIAM Symposium on Discrete Algorithms (New York 2009),
pp. 1011–1020. SIAM, Philadelphia (2009)
11. Curry, J., Ghrist, R., Nanda, V.: Discrete Morse theory for computing cellular sheaf cohomology.
Found. Comput. Math. 16(4), 875–897 (2016)
12. Delfinado, C.J.A., Edelsbrunner, H.: An incremental algorithm for Betti numbers of simplicial com-
plexes on the 3-sphere. Comput. Aided Geom. Des. 12(7), 771–784 (1995)
13. Di Fabio, B., Landi, C.: Persistent homology and partial similarity of shapes. Pattern Recognit. Lett.
33(11), 1445–1450 (2012)

123
Discrete & Computational Geometry (2023) 70:580–619 619

14. Edelsbrunner, H., Harer, J.L.: Computational Topology: An Introduction. American Mathematical
Society, Providence (2010)
15. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discrete
Comput. Geom. 28(4), 511–533 (2002)
16. Ghrist, R.: Elementary Applied Topology. Createspace (2014)
17. Govc, D., Skraba, P.: An approximate nerve theorem. Found. Comput. Math. 18(5), 1245–1297 (2018)
18. Lewis, R., Morozov, D.: Parallel computation of persistent homology using the blowup complex. In:
27th ACM Symposium on Parallelism in Algorithms and Architectures (Portland 2015), pp. 323–331.
ACM, New York (2015)
19. Lipsky, D., Skraba, P., Vejdemo-Johansson, M.: A spectral sequence for parallelized persistence (2011).
arXiv:1112.1245
20. McCleary, J.: A User’s Guide to Spectral Sequences. Cambridge Studies in Advanced Mathematics,
vol. 58. Cambridge University Press, Cambridge (2001)
21. Milosavljević, N., Morozov, D., Škraba, P.: Zigzag persistent homology in matrix multiplication time.
In: 27th Annual Symposium on Computational Geometry (Paris 2011), pp. 216–225. ACM, New York
(2011)
22. Munkres, J.R.: Elements of Algebraic Topology. Addison-Wesley, San Francisco (2018)
23. Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case
studies of spatial point patterns, sphere packing and colloids. Physica D 334, 99–117 (2016)
24. Robinson, M.: Topological Signal Processing. Mathematical Engineering. Springer, Heidelberg (2014)
25. de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebr. Geom. Topol.
7(1), 339–358 (2007)
26. de Silva, V., Morozov, D., Vejdemo-Johansson, M.: Dualities in persistent (co)homology. Inverse Probl.
27(12), # 124003 (2011)
27. Singh, G., Mémoli, F., Carlsson, G.: Topological methods for the analysis of high dimensional data
sets and 3D object recognition. In: Eurographics Symposium on Point-Based Graphics (Prague 2007),
pp. 91–100. Eurographics Association (2007)
28. Skraba, P., Vejdemo-Johansson, M.: Persistence modules: algebra and algorithms (2013).
arXiv:1302.2015
29. Torras Casas, Á.: PerMaViss: persistence Mayer Vietoris spectral sequence (2020). https://doi.org/10.
5281/zenodo.3613870
30. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge Studies in Advanced Mathematics,
vol. 38. Cambridge University Press, Cambridge (1994)
31. Yoon, H.R.: Cellular Sheaves and Cosheaves for Distributed Topological Data Analysis. PhD thesis,
University of Pennsylvania (2018). https://repository.upenn.edu/edissertations/2936
32. Yoon, H.R., Ghrist, R.: Persistence by parts: multiscale feature detection via distributed persistent
homology (2020). arXiv:2001.01623
33. Zomorodian, A., Carlsson, G.: Localized homology. Comput. Geom. 41(3), 126–148 (2008)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

123

You might also like