Quantum Algorithm For The Fixed-Radius Neighbor Search
Quantum Algorithm For The Fixed-Radius Neighbor Search
Luca Cappelli∗
Dipartimento di Fisica dell’Università di Trieste, via Tiepolo 11, I-34131 Trieste, Italy;
INAF - Osservatorio Astronomico di Trieste, Trieste, Italy;
Giuseppe Murante
INAF - Osservatorio Astronomico di Trieste, via Tiepolo 11, I-34131 Trieste, Italy;
(Dated: July 8, 2025)
The neighbor search is a computationally demanding problem, usually both time- and memory-
consuming. The main problem of this kind of algorithms is the long execution time due to cache
misses. In this work, we propose a quantum algorithm for the Fixed RAdius Neighbor Search
problem (FRANS) based on the fixed-point version of Grover’s algorithm. We derive an efficient
circuit for solving the FRANS with linear query complexity with the number of particles N . The
quantum circuit returns the list of all the neighbors’ pairs within the fixed radius, together with their
distance, avoiding the slow down given by cache miss. We explicitly write the Grover’s operator and
1
analyze its gate complexity. The whole algorithm has complexity of O(M 2 N 2 ) in the worst-case
scenario, where M is the number of neighboring pairs, and uses O(log N ) number of qubits. By
employing extra ancilla qubits the depth of the circuit can be brought down to O(N log N ) at the
cost of O(N ) qubits for unstructured dataset, or O(poly(log N )) qubits for structured datasets.
Finally we assess the resilience of the model to the readout error, suggesting an error correction-free
strategy to check the accuracy of the results.
I. INTRODUCTION
In computational science, a seemingly straightforward task has emerged as a fundamental limitation: finding all
particles within a fixed distance between each other. This Fixed-RAdius Neighbor Search (FRANS) [1] is strictly
related to N -body methods and forms the computational backbone of numerous simulation methods across physics,
chemistry, and astronomy [2–6], yet consistently dominates runtime despite decades of algorithmic advancements.
The naive approach to finding all particles within a cutoff distance h requires examining every possible particle pair,
resulting in O(N 2 ) scaling that quickly becomes unfavorable as the system size grows. Modern simulations employ
specialized spatial data structures such as uniform grids, Verlet lists [2, 5, 7], and hierarchical trees [4, 8], which can
reduce the complexity to O(N log N ) by considering only a portion of all the possible pairs.
Fast algorithms used to explicitly find all the particle pairs within the fixed-radius scale as O(M + N ), where M
is the number of neighboring pairs, with a preprocessing of O(N log N ) and a O(N ) space complexity [9]. Yet, in
spite of the improved computational cost, this optimized neighbor search still constitutes the primary computational
expense across diverse scientific applications.
In the context of Molecular Dynamics (MD), Verlet lists or cell indices [2] are used to evaluate interactions between
atoms for the study of a wide range of phenomena such as drug binding [10, 11], membrane dynamics [12, 13]
and material science [14]. These data structure keep track of the atoms that are spatially close, in order to avoid
considering atoms far beyond the interaction radius. The FRANS procedure and maintenance of these lists typically
consumes from 30% to 60% of total runtime in large-scale simulations [15, 16]. Another interesting field of MD
concerns protein folding. Research in this area focuses on how proteins evolves towards an equilibrium state (folding)
∗ [Link]@[Link]
† [Link]@[Link]
‡ [Link]@[Link]
2
and interact with one another. A noticeable application is found in neuroscience, where abnormal folding patterns
often signal neurological diseases [17]. While earlier studies relied on neighbor search [18], the field has evolved
considerably with new approaches. AlphaFold [19] stands out as a particularly valuable tool, serving as a detailed
simulator and providing a database of 200 million proteins’ structure. In this case, neighbor search algorithms
remain useful for homology detection [20], helping researchers identify structural similarities between proteins to
trace evolutionary origins and primitive forms. These methods also assess how well different proteins might interact
based on their geometric properties [21]. Non-equilibrium protein folding [22–24] is an open research field, also related
to neurological disorder [25, 26] and represents another area of interest for neighbor search. Here environmental
influences cause proteins to adopt conformations outside thermal equilibrium. Since ground-state assumptions do not
apply in these systems, different methods are used to model the folding mechanism such as coarse-grained molecular
dynamics simulations combined with FRANS.
In both cosmology and fluid dynamics, Smoothed Particle Hydrodynamics (SPH) [3] can be used to simulate the
evolution of fluids and gases. This approach represents continuous media as discrete particles, each carrying funda-
mental properties such as mass and velocity. Particle interactions emerge through a weighting scheme that employs
localized kernel functions to determine the influence of neighboring particles. The method achieves conservation of
field quantities (e.g, pressure and density) by spatially averaging contributions from all particles within the kernel’s
smoothing length radius. The evaluation of neighboring pair and interaction between particles takes up most of the
execution time [27]. The challenge becomes particularly acute in simulations with free surfaces or multiphase flows,
where particle distributions become highly non-uniform. In astrophysical N-Body simulations, SPH is used to simulate
dust dynamics using tree-based methods for the neighbor search part. These data structures are then used also to
evaluate long-range gravitational interactions efficiently,using a fast multiple approach. In both cases, the traversal
of these spatial data structures to identify nearby particles still dominates computational expense, consuming most
of simulation’s time in cosmological models with billions of bodies [4, 28]. Highly clustered mass distributions –
characteristic of galaxy formation – further amplify these costs by necessitating frequent tree rebuilds.
What makes this problem particularly resistant to optimization is not merely its algorithmic complexity, but rather
the challenging computational patterns it creates. Specifically, the combination of a large data structure combined
with the frequent and irregular memory access patterns.
To be more specific, modern computing architectures are built around a memory hierarchy (CPU registers → L1
cache (32 − 64 KiB per core) → L2 cache ( 512KiB −8 MiB per core) → L3 cache (120 − 256 MiB per core) → main
memory; data collected from [29, 30]) that performs best when data access follows predictable patterns. However,
neighbor search fundamentally involves unpredictable memory access and particles that are close in space configuration
might be stored far apart in memory, particularly after many simulation time steps have caused the particles to move
from their initial positions. This creates numerous cache misses, as the processor constantly needs to fetch data from
slower memory tiers. For reference, in both SPH and MD the number of neighbors M depends on the application,
as high resolution simulations might require higher values. In the common scenario however, an accurate 3D SPH
simulation requires approximately N = 1010 , with approximately 400 target neighbors per particle [31], while an MD
simulation might require N = 106 with the number of neighbors in the order of hundreds [5, 32, 33].
In Fig. 1 is presented a toy model representing the cache miss problem. When searching for neighbors of particle B,
the algorithm needs to access data for particles E, G, and H that happen to be spatially nearby. However, due to limited
cache memory capacity, only particles within B’s sub-quadrant are currently stored in cache. To retrieve information
for the remaining particles, the algorithm must fetch data from main memory—a significantly slower operation. The
frequent repetition of this process creates substantial computational overhead and dramatically increases execution
time. Thus, the mismatch between physical proximity and memory proximity causes the FRANS to be the bottleneck
in many different scenarios.
An alternative approach to solving the bottleneck problem can be offered by the framework of quantum compu-
tation [34] which in the last two decades has seen a surge of theoretical advances and potential applications [35] in
quantum chemistry [36], material science [37], cosmology [38] and computational fluid dynamics [39].
At the dawn of quantum computing, Grover’s search algorithm [40] showed one of the first example of a quantum
algorithm that can run faster than any classical algorithm designed for the same task. While √ the computational
complexity of a classical search is O(N ), Grover’s algorithm can do the search with only O( N ) steps, thus offering
a quadratic advantage. After its appearance in the literature, Grover’s algorithm has been analyzed and modified
several times to match more specific tasks and to tackle the inherited issues of the original algorithm. Among those,
we mention the Fixed-point search (FPS) algorithms [41–43], thought to tackle the periodic increase and decrease in
the success probability of the search, and the oblivious amplitude amplification algorithms(OAA) [44–46] intended to
overcome the issue of not knowing the target state in advance. We will discuss both approaches later in the main
text. In more recent years, it has been shown how Grover’s search algorithm and all of its many versions belong to
the class of Quantum Singular-value Transform (QSVT) algorithms [47, 48], which reveals the principles behind the
most widely used (and cited) quantum algorithms.
3
In this work, we aim to find a solution to the FRANS problem by employing our own version of the oblivious-
fixed-point amplification algorithm [42, 45], which is very adaptable to different initial states while being agnostic
to the target state. The manuscript is structured as follows. In Section II we introduce the problem and describe
its difficulties when treated by a classical computer. In Section III we describe the Grover’s algorithm and its FPS
and OAA versions. In Section IV we describe our quantum FRANS algorithm and we highlight the similarities and
differences with the previous algorithms. Hence, we explicitly show the quantum circuit and the gate decomposition
of the various operations, analyzing the computational complexity and the gate complexity. In Section V we introduce
the bit-flip and the measurement errors in our simulation, calculating the threshold for the amount of noise required
to preserve the quantum advantage of the algorithm, without introducing particular error correction schemes. Finally,
we draw the outlooks and the conclusions of our work in Section VI.
At the heart of many computational models in physics lies a conceptually simple yet computationally intensive
operation: determining which objects in space are close to one another. This procedure, known as neighbor search,
forms the foundation for virtually all particle-based simulation methods across the physical sciences.
In its most fundamental form, the fixed-radius variant, neighbor search answers a straightforward question: given
a collection of particles distributed throughout space, which particles lie within a fixed distance of each other? This
critical calculation enables the modeling of interactions that occur only between objects in close proximity—forces
that diminish rapidly with distance, collisions between bodies, or influence that spreads within a limited radius.
The mathematical formulation is deceptively simple. For each particle i in a system of N particles, we must
identify all other particles j such that the distance between them d(ri , rj ) is less than some threshold distance ξ. This
distance threshold might represent the cutoff radius of a potential energy function in MD, the smoothing length in
fluid simulations.
Despite its conceptual simplicity, this operation presents an extraordinary computational challenge. The most direct
approach—checking every possible pair of particles—would require examining N 2 distance calculations, a prohibitive
scaling for systems of scientific interest that may contain millions or billions of particles. Moreover, as particles move
throughout a simulation, these neighborhood relationships continually change, requiring repeated recalculation.
To address this challenge, researchers have developed sophisticated spatial data structures that organize particles
based on their positions in space. Cell lists divide the simulation domain into a grid, allowing the search to focus
only on particles in adjacent cells. Verlet lists maintain a precomputed set of neighbors for each particle, including a
buffer zone to reduce update frequency. Tree-based methods hierarchically partition space, enabling rapid elimination
of distant regions from consideration.
These techniques reduce the theoretical complexity from O(N 2 ) to O(N log N ), representing an enormous com-
putational saving. Nevertheless, neighbor search still typically accounts for 30 − 70% of total computation time in
production simulations across disciplines [15, 16, 27]. The persistent challenge stems from a fundamental disconnect
between two different types of proximity: spatial proximity in the physical simulation and memory proximity in the
computing hardware.
When particles are close to each other in the simulation space, their data should ideally be close to each other
in the computer’s memory for efficient processing. However, this alignment rarely occurs, especially as particles
move throughout the simulation. A particle’s neighbors in physical space are often scattered across distant memory
locations, forcing the processor to constantly fetch data from widely separated memory addresses.
Compounding this problem is the repetitive nature of the search operation. For each of the N particles in the
system, the algorithm must perform a separate neighbor search, repeatedly accessing memory locations across the
entire data structure.
The combined effect of these two factors – spatial-memory misalignment and repeated broad memory access –
frequently results in cache misses, where the processor must wait to retrieve data from slower memory tiers rather
than finding it in fast cache memory. These waiting periods significantly increase actual execution time, often by an
order of magnitude or more compared to theoretical predictions.
Thus, the primary challenge in modern neighbor search implementations has shifted from algorithmic complexity
to hardware efficiency. Even algorithms with optimal theoretical scaling (O(N )) can perform poorly in practice due
to their incompatibility with contemporary computing architectures and the true complexity now lies not in the
mathematical operation count but in navigating the complex memory hierarchy of modern processors to minimize
data movement costs.
4
A
A G
B H
B G H
D E
C E
C
D F
F
FIG. 1: Toy model illustrating the cache miss problem in neighbor search algorithms. An evenly spaced grid
contains eight particles distributed across four memory blocks (indicated by red boundaries). Each particle’s data is
stored in memory locations denoted by matching colors. Note that spatially adjacent particles might be stored in
different memory locations.
In this section, we describe a few versions of the Grover’s algorithm as they are relevant to this paper. In the
original work, the N elements of a database D are encoded into a quantum state |i⟩ and we want to select a specific
target state |ψT ⟩. The initial state is set as the uniform superposition of all possible states
1 X
|ψ0 ⟩ = √ |i⟩. (1)
N i∈D
The algorithm consists in applying to the state |ψ0 ⟩ first an oracle operator Ô, which is capable of recognizing the
solution and whose action is to flip the sign of the target state |ψT ⟩, defined as
n
|ψ0 ⟩ Ô R̂0 ...
FIG. 2: Circuit used for the amplitude amplification process. Based on Mizel search algorithm. The parameter α is
adjusted accordingly to the iteration. The measurement on the ancilla qubit serves as control, if |0⟩ is the outcome,
it means that the quantum register is ready for measurement, otherwise repeat the process, varying the angle α.
A. FPS algorithm
The FPS algorithm [42] introduces an ancilla qubit to reproduce a non-unitary dynamics that damps out the
oscillations of the results of Grover’s algorithm between the target and non-target states. The algorithm uses a series
of parametric rotations
where the optimal value of the αi angles depends on the specifics of the problem. The algorithm is presented in Fig. 2
and reads as follows
ΠK
i=1 |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ R̂0 Ry (−αi ) ⊗ I) |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ Ô Ry (αi ) ⊗ I) |1⟩|ψ0 ⟩, (6)
where the circuit is run a total number of K times depending on the result of a measurement performed on the ancilla
qubit as we see next.
The first step of the algorithm rotates the ancilla qubit to − sin α2i |0⟩ + cos α2i |1⟩. The second term performs a
controlled operation which yields the state
αi αi
|Ψ1 ⟩ = − sin |0⟩|ψ0 ⟩ + cos |1⟩(cos θ|ψ ⊥ ⟩ − sin θ|ψT ⟩). (7)
2 2
α α
|Ψ2 ⟩ = − sin αi sin θ|0⟩|ψT ⟩ + sin2 |1⟩|ψ0 ⟩ + cos2 |1⟩ cos θ|ψ ⊥ ⟩ − sin θ|ψT
2 2 (8)
= − sin αi sin θ|0⟩|ψT ⟩ − cos αi sin θ|1⟩|ψT ⟩ + cos θ|1⟩|ψ ⊥ ⟩.
Now the action of the last controlled operation |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ R̂0 on |Ψ2 ⟩ is given by
√
|Ψ3 ⟩ = − pi+1 |0⟩(|ψT ⟩) + 1 − pi+1 |1⟩ (ci cos 2θ − si cos αi sin 2θ)|ψ ⊥ ⟩ + (ci sin 2θ + si cos αi cos 2θ)|ψT ⟩ ,
p
(9)
where we have used pi+1 = sin2 αi sin2 θ, ci = cos θ and si = sin θ. Finally, depending on the outcome of the
measurement performed on the ancilla qubit, we decide to stop or continue the algorithm. In fact, if we measure the
state |0⟩ on the ancilla qubit, the system collapses into the state |ψT ⟩. Conversely, if the outcome is |1⟩, the system
becomes a new superposition of the states |ψT ⟩, |ψ ⊥ ⟩.
At the i-th repetition, the superposition state associated with |1⟩ is
for i = 1, . . . , K and
i
Y
pcum (i) = 1 − 1 − pj+1 for i = 1, ..., K (13)
j=1
X i−1
Y
< ncalls >= i × pi+1 (1 − pj+1 ) , (14)
i j=1
where the term between squared brackets is the probabiltiy of having success exactly at the i-th iteration.
A sensible choice of the αi leads the cumulative probability of success to rapidly grow to 1. In the case of knowing
beforehand the value of M , and consequently of θ, the best choice is given by the constant critical value
1 − sin 2θ
αC = arccos , (15)
1 + sin 2θ
which has been found to be the optimal angle dampening the oscillations of Grover’s algorithm between target and
nontarget states[42]. When M is not known in advance but M ≪ N , then the critical angle computed for M = 1 is an
effective choice for a large set of values of M near 1. It is noteworthy to mention that the quantum counting algorithm
has been proposed as a method to efficiently count the number of solutions that attain specific requirements of a given
quantum search problem or simply counting the total number of solutions [49]. More recently, a new method has
been proposed to estimate M by exploiting the relationship between this value and the probability of measuring one
solution after a given number of iterations of the standard Grover’s algorithm [50].
We show next, for the case of M = 1 and N = 1000, the evolution of the coefficients ci , si together with the
instantaneous and cumulative probabilities pi , pcum (i) for different choices of αi . In the case of using a constant
critical angle, shown in Fig. 3c, the probability of success at each iteration pi tends to increase. However, when the
number of target states M is not known but it is supposedly very large, an effective choice is given by varying the
angle in the following decreasing way
1 − sin(π/2i)
α1 = π/2, αi = arccos for i > 1, (16)
1 + sin(π/2i)
for which the dynamics is plotted in Fig. 3d. In Fig. 3b, we compare the cumulative probabilities and the average
number of oracle calls < ncalls > before success (vertical lines) for different choices of α (critical and decreasing) with
the probability of classical random extraction (dashed line). Comparing the two different choices of α, the cost of
not knowing M weights at most by a factor of 1.5 in the average number of oracle calls. Summing everything up,
in Fig. 3a we compare the average number of oracle
p calls for the case M = 1 as a function of N for the critical and
decreasing cases with respect
p to the references N/M and N/M . As expected, we find that the fixed point search
algorithm has a scaling O( N/M ), an improvement with respect to the classical O(N/M )
7
1.0
c
103 v 0.8
N
N c
0.6 ncalls = 24.21
102
pcum
ncalls
v
0.4 ncalls = 35.11
classical
101
0.2
100 0.0
102 103 104 0 50 100 150 200
N i
(a) (b)
1.0
1.00
0.8 0.75
0.50
0.6 pcum
pi 0.25
si 0.00
0.4 ci 0.25 pcum
0.2 0.50 pi
si
0.75 ci
0.0 1.00
0 50 100 150 200 0 50 100 150 200
i i
(c) (d)
FIG. 3: In (a), the average number of oracle calls before success (using M = 1 and changing the database size N )
for the constant critical value of αC = αi and √ for the variable decreasing sequence, defined in (16) and referenced as
αV . This is compared with the references N and N using a logarithmic scale on both the horizontal and vertical
axes. The fixed point search algorithm provides a quadratic advantage with respect to classical search, for the
considered angle’s schedules. In (b), a comparison between the cumulative probabilities for the two choices with
respect to a classical algorithm. The vertical lines represent the average number of queries to the oracle for the
respective cases. In (c) and (d) the time evolution of the coefficients in Eq. (11), the instantaneous and cumulative
probabilities pi , pcum for the case M = 1, N = 1000 with αC and αV respectively.
B. OAA algorithm
In order to apply either the Grover or the FPS algorithm, we need to know how to prepare the oracle Ô as expressed
in Eq. (2), which assumes some knowledge of the target state |ψT ⟩. Most of the times, we do not have all the pieces of
information about the target state, but some of its values, which are stored in a subspace that labels if the quantum
state is (or it is not) the target. Namely, we can think of a unitary operation Û such that
where |ψT ⟩ = V |ψ⟩ has the label qubit set to |0⟩ and |Φ⊥ ⟩ is a state orthogonal to |0⟩|ψT ⟩ with different value of the
label qubit. The OAA algorithm allows the amplification of the state labeled by |0⟩ regardless of the quantum state
|ψT ⟩. This is a typical condition that we encounter in block-encoding algorithms [47] where the relevant dynamics is
labeled by the |0⟩ state.
8
We refer to the original paper [44] for the description of the algorithm. However, as a caveat, we remind that the
OAA algorithm is exact only if the V operator is unitary [46], otherwise introducing errors into the target state.
In this section we present our Quantum algorithm for the FRANS problem (QFRANS), which is obtained as an
efficient and ad-hoc modification of the FPS algorithm. Later in this section we present a full analysis of each
component of the quantum circuit.
We consider a dataset X = {i, xi }N i=1 which collects the label i given to N particles and the respective position xi .
Our goal is to find all the pairs (i, j) such that the distance d(i, j) is lower than the chosen fixed radius h.
The first part of the algorithm encodes two copies of the dataset in four quantum registers, by using bit-encoding.
In particular, two quantum registers with q0 = ⌈log2 N ⌉ qubits are used to encode the label of the particles, with
the state |i⟩ being the binary representation of the integer i, while q1 qubits are used to encode their position by
discretizing the whole space into a 2q1 -points lattice, and letting the state |xi ⟩ to represent the coordinates.
The preparation of the state is accomplished by employing the PREP operator, which acts on the position and
label registers as
1 X
|ϕ⟩ = PREP|0⟩q0 |0⟩q1 = √ |i⟩q0 |xi ⟩q1 , (18)
N i
where the suffix given to the ket details the number of qubits. QFRANS applies PREP to both the particles register,
yielding
1 X
|ψ0 ⟩ = PREP ⊗ PREP|0⟩q0 |0⟩q1 |0⟩q0 |0⟩q1 = |i⟩ |xi ⟩q1 |j⟩q0 |xj ⟩q1 , (19)
N i,j q0
Once the dataset is encoded we evaluate the distance between all the points in the dataset dij = d(xi , xj ). This can
be done by introducing a set a1 of ancillary qubits initialized in |0⟩ and by applying a difference operator D̂ on the
two q1 -qubits registers, which returns the value dij = k |xki − xkj |, where k runs over the dimensions of our system,
P
as
1 X
|ψ1 ⟩ = (I2q0 ⊗ D̂)|ψ0 ⟩|0⟩a1 = |i⟩ |xi ⟩q1 |j⟩q0 |dij ⟩q1 |·⟩a1 . (20)
N i,j q0
The value |·⟩a1 of the ancilla qubits is not relevant at this stage of the calculation, but we cannot discard it as it
will be useful later.
Once the state is prepared, according to Grover’s algorithm and its FPS counterpart, now we construct the oracle
operator that applies a negative phase to the states where dij ≤ h.
Here resides the novelty of this work: we do not know what is our target state |ψT ⟩, so we do not aim to build the
oracle operator as described in Eq. (2), nevertheless we build the oracle by constructing a diagonal operator which is
able to invert the sign of all the possible solutions.
Ô(h) = diag −1, . . . , −1, 1, . . . , 1 . (21)
| {z } | {z }
h times 2(q1 +1) −h times
|0⟩q0
PREP Z̄
|0⟩q0
Ô
|0⟩a1
Û1
FIG. 4: The circuit for the QFRANS algorithm, which has to be repeated until the state measured in the register
a0 is |0⟩.
where M is the number of neighboring pairs and in the last line of Eq. (22) we restored the convention that uses
|ψ ⊥ ⟩, |ψT ⟩ and the relative angle θ.
Finally we build the reflection over the initial state |ψ1 ⟩, R̂1 , similar to Eq. (3) as
† †
R̂1 = 2|ψ1 ⟩⟨ψ1 | − I = Û1 (2|0⟩⟨0| − I)Uˆ1 = Uˆ1 Z̄ Uˆ1 , (23)
where we defined Û1 = D̂Û , and Z̄ as the expanded Ẑ operation which inverts the sign of the qubits different from
|0⟩. The application of R̂1 to the quantum state leads to |ψ3 ⟩ = R̂1 |ψ2 ⟩. Upon this premises, we are able to build the
QFRANS algorithm as depicted in Fig. 4, which is analogue to the FPS algorithm, with an explicit state preparation
for the FRANS problem, and a modified oracle operator. Although those differences, the result can be written as in
Eq. (9), with
√
|Ψ3 ⟩ = − pi+1 |0⟩|ψT ⟩ + 1 − pi+1 |1⟩ (ci cos 2θ − si cos αi sin 2θ)|ψ ⊥ ⟩ + (ci sin 2θ + si cos αi cos 2θ)|ψT ⟩ ,
p
(24)
In the remaining of this section we analyze each of the components of the algorithm separately.
A. State preparation
Here we describe the Û1 operator needed to encode and process the pieces of information available in the database.
The fundamental premise of our algorithm relies on establishing a correspondence between dataset elements Xi
and their integer representations xi . When working with floating-point precision, this correspondence can always be
achieved by defining a discretization grid based on the numerical precision of the computing system. More sophisticated
grid selection schemes can be tailored to specific problem requirements and search radius ξ to optimize performance.
To identify and mark the target states that satisfy the proximity condition in Eq. (28), we must first determine
the integer representation of the chosen distance threshold ξ. This conversion is feasible precisely because we have
established a bit-encoding scheme that associates each dataset point with a unique integer value.
A critical requirement of our approach is that the input dataset must be evenly spaced, or equivalently, that the data
space can be accurately described using a regular grid structure. Under this constraint, we can interpret the integer
representation as the number of discrete distance units required to span the threshold distance ξ. For illustration,
consider the one-dimensional case where this relationship is expressed as:
10
√1
P
N i |i⟩q0 |0⟩q1
|0⟩q0 L̂
|0⟩q0 +q1 |ϕ⟩q0 +q1 = √1
P
PREP Ê N i |i⟩q0 |xi ⟩q1
|0⟩q1
FIG. 5: A possible implementation of the PREP operator, where L̂ creates a balanced superposition of the labels of
the particles and Ê assigns to each particle its position.
ξ
h= , (25)
∆x
where ∆x represents the grid spacing. This discretization procedure effectively transforms the original continuous
proximity problem into an equivalent integer formulation:
d(xi , xj ) ≤ h , xi , xj , h ∈ N . (26)
Hence, We assume that both the label and the position of each particle are described by integers. Note that the ceiling
function in Eq. (25) is suitable to a proximity condition that uses the inclusive inequality as in Eq. (26), whereas, for
applications requiring a strict inequality, the floor function should favorable.
In the following we treat PREP as a black-box operator that is able to prepare the state |ϕ⟩ as defined in Eq. (18).
The construction of PREP is often disregarded but it is actually one of the hardest element to deal with. The quantum
superposition of N states can be obtained by employing a circuit with depth O(N ), thus exponential with the number
of qubits q0 of the label register [51, 52]. Note that by using the optimized algorithm of Ref. [51], we can implement
PREP as two separate circuit, as shown in Fig. 5. The first operator, L̂, defines the balanced superposition of N
PN −1
elements in the register with q0 qubits L̂|0⟩q0 = √1N i=0 |i⟩. The second operator, Ê, assigns to each particle its
own position, Ê|i⟩q0 |0⟩q1 = |i⟩q0 |xi ⟩q1 .
This implementation of PREP allows us to use a reduced R̂1 operator, as we will see later.
One can drastically reduce the depth of the circuit by increasing its width, i.e. by employing ancilla qubits [53–55],
an approach that culminates with a circuit that has width∼ O(N ) while keeping the depth∼ O(log N ). In general,
different algorithms play with the tradeoff between depth and number of ancillas [54], while keeping the product
depth×width about constant. In order to build a functioning PREP operator, one should adapt the depth and the
width to the specifics of the quantum computer available.
Conversely, if the position of the particles follows some structure, that is it can be described by a mathematical
function, we can use matrix access oracles with depth∼ O(poly log N ) [55–57].
The distance operator D̂ is applied on a combination of two identical states superpositions |ϕ⟩ |ϕ⟩
X X
D̂ |xi ⟩q1 |xj ⟩q1 |0⟩a1 = |xi ⟩q1 |dij ⟩q1 |0⟩a1 , (27)
i,j i,j
where |dij ⟩ is the state encoding the distance between the particle xi and xj , and the ancilla qubit is reset to the
initial value |0⟩.
We consider a one-dimensional system where particles are positioned along a line segment. Here, the distance
between two particles corresponds to the absolute value of their coordinate difference. Notice that when the system
is two or three-dimensional, the same approach can be used to define the distance as the sum of the absolute value of
the differences along all the directions in the system. This corresponds to consider the neighbors of the i-th particle
11
as the particles lying in the square (or cube) of side 2h centered around xi , as opposed to considering the circle (or
the sphere) of radius h when the Euclidean distance is used.
Instead of using the absolute difference between particles, we compute the signed difference. In this way, we can
easily discard the negative distances, avoiding counting twice for the same pair and halving the effective number of
neighbors. For two binary values a and b, we calculate their difference using the inverse Ripple Carry Adder [58],
which requires O(q1 ) CNOT gates and only 2 ancilla qubits: one for the carry of the subtraction and the other as a
clean ancillary qubit. When a > b, the carry qubit remains in state |0⟩ and yields the correct result. When a < b, the
carry qubit flips to |1⟩, producing the two’s complement representation of a − b. We can thus consider the carry bit
as the sign qubit and effectively as part of the distance.
B. The oracle Ô
The oracle presented in Equation (21) represents the most naive implementation, as it incorporates elements with
zero distances. This becomes particularly valuable when different particles are mapped to identical positions due to
finite precision constraints. It is also the cheapest option in terms of ancilla qubits, number of CNOT and circuit
depth.
The most direct approach to implementing the oracle Ô employs the diagonal decomposition method described in
Reference [59]. However, this approach exhibits unfavorable scaling properties, with both the number of CNOT gates
and circuit depth scaling as O(2q1 +1 ), rendering it impractical for large-scale implementations.
An alternative approach would be to implement the oracle as a series of multi-controlled Z gates (MCZ). In this
scenario, we would require O(h) MCZ gates. Using the approach described in [60, 61] the global scaling of the oracle
would be O(hq1 ).
We can achieve a substantial improvement, reducing the complexity to O(q1 ), by exploiting the inherent structure
of our problem. In fact, we propose an oracle implementation that utilizes an integer comparator circuit (COMP),
which requires q1 − 1 clean ancilla qubits and one additional qubit to store the comparison result dij ≥ h. When this
condition
P is satisfied, thetarget qubit is found in state |1⟩. The comparator’s operation can be formally expressed as
P
COMP ij |dij ⟩q1 +1 |0⟩ = ij |dij ⟩q1 +1 |dij ≥ h⟩. Choosing the most significant qubit as the sign bit, we create a
natural filtering mechanism. By adopting this choice, when we have a negative distance dij , the sign qubit is set to
|1⟩, which automatically makes the entire numerical value at least 2q1 . Since all negative values are now greater than
2q1 − 1, the comparator naturally excludes them without any additional circuit. This elegant consequence of our sign
bit placement simplifies the overall implementation.
To optimize circuit complexity, we implement the inverse oracle −Ô rather than the oracle defined in Equation (21).
This is achieved through a phase kickback mechanism: we first apply the quantum comparator to the distance register,
then apply a Z gate to the comparator’s target qubit, and finally reset the target qubit to |0⟩ by applying COMP† .
This procedure introduces a negative phase to all elements whose distance satisfies dij ≥ h. Using this approach the
Oracle requires only q1 clean ancilla qubits.
However, when one is faced with large datasets, e.g, N = 109 particles, we suggest to employ a slightly different
oracle
Ô(h) = diag 1, −1, . . . , −1, 1, . . . , 1 . (28)
| {z } | {z }
h−1 times 2(q1 +1) −h times
where we introduced a modification to exclude zero-distance values |0⟩. This eliminates comparisons between identical
data points, reducing the number of solutions by N . In practical applications, e.g, cosmological simulations where
N ∼ 1010 , this optimization provides significant computational advantages.
Figure 6 illustrates the quantum circuit implementation of −Ô from Equation (28). Zero values are excluded from
the marked states by replacing the standard comparator COMP with COMP (−1), c where −1 c represents the inverse
quantum incrementer based on a logarithmic depth quantum adder [62]. To maintain linear scaling in both CNOT
count and circuit depth with respect to q1 , this modified approach requires 2q1 − log2 q1 clean ancilla qubits [63],
which allows us to use the same ancilla qubits for both the comparator and the incrementer.
In Fig. 7 we show a comparison of the depth and number of ancilla qubits required to implement both version
of the oracle. We have to consider also that the fewer the number of target √ states M , the harder is to find them.
Specifically the number of required repetition is expected to scale as O(N/ M ) (cf. Sec III, as N 2 is the number of
all the possible pairs of the dataset with N elements.
12
||dij |⟩
sign −1
c +1
c
COMP †
COMP
|0⟩a1
target Z
FIG. 6: Quantum Oracle. The circuit realizes the inverse oracle −Ô defined in Equation (28). It is important for
the circuit to work correctly that the sign qubit is the most significant qubit. The target register stores the
comparison result dij ≥ h. Removing the incrementer +1 c and decrementer −1 c operations recovers the expression in
Equation (21). The original oracle Ô can be recovered by replacing the Z gate with −Z.
No ZERO 50 No ZERO
With ZERO With ZERO
No ZERO: 48.66 q 1 + 24.99 No ZERO: 3.35q1 − 3.35 log2(q1) + 3.58
800
600
30
400
20
200
10
4 6 8 10 12 14 16 18 4 6 8 10 12 14 16 18
q1 q1
(a) (b)
FIG. 7: Depth (a) and total number of qubits (b) as a function of the number of qubits q1 . The green line refers to
the oracle built using the comparator for the case including Zero as in Eq. (28)); the blue line to the case without
zero as in Eq. (21).
C. The reflection
The final step of the QFRANS algorithm requires performing an inversion around the initial state |ψ1 ⟩. As demon-
strated in Eq. (23) and illustrated in Figs. 4 8, this can be achieved through the implementation of a single operator
Z̄.
To maintain consistency with our circuit optimization strategy introduced in Section IV B, where we implemented the
negative oracle −Ô rather than the positive version, we adopt the same approach here by constructing −R̂1 instead
of R̂1 . This design choice ensures that the accumulated negative phases from both the oracle and reflection operators
cancel out, ultimately producing results that are identical to the standard implementation presented in Section IV.
Under this modified framework, the required operator becomes −Z̄ = I − 2 |0⟩ ⟨0|, which can be efficiently realized as
a multi-controlled Z gate that activates when all control qubits are in the |0⟩ state.
Figure 8 presents an explicit representation of the R0 implementation. As established in Section IV A 1 and illus-
trated in Figure 5, PREP can be factorized as the product of label creation operator L̂ and element creation operator
Ê. Consequently, following application of Z̄ˆ , the complete Û operator need not be applied in all cases. When mea-
surements target only the labels of dataset elements, applying L̂ suffices to create the particle labels, whereas Û would
reconstruct a superposition of particle elements and distances. However, when the ancilla measurement yields |1⟩,
correct algorithmic behavior requires the full Û operator. This is achieved by applying the remaining components Ê
and D̂ to the appropriate registers.
The scaling of R̂0 is dominated by the PREP operator. In fact, [61, 64] shows that is possible to implement
13
|·⟩a0
|·⟩q0 L̂ L̂
Ê Z̄ Ê
|·⟩q0 L̂
|·⟩q1 L̂
Ê Ê
|·⟩q1 D̂ D̂
|·⟩a1
Û † NEXT ITERATION
FIG. 8: Explicit representation of the R̂0 operator. The part included by the red-dashed line is applied only if the
ancilla measurement is |1⟩.
a multi-controlled X gate using only 12n + O(1) qubits and one dirty ancilla, where in the QFRANS scenario
n = 2q0 . Consequentially, even with an advantageous decomposition of the M CZ, the implementation f R̂0 remains
the bottleneck of the algorithm, as the depth and complexity scales in the same manner as PREP.
D. The measurement
The protocols described in Sec. IV B and Sec. IV C have to be repeated until we measure the state |0⟩a0 on the
ancilla qubit. When this happens, supposing an error-free quantum hardware, the system has collapsed to the target
state, an uniform superposition of the M solutions.
We propose to use the Bayes rule to estimate the value M representing the number of solution states. We treat M
as a random variable and assume a Poisson prior probability distribution
M ⟨M ⟩ e−M
p(M ) = , (29)
⟨M ⟩!
d 2
for the parameter M . The mean value ⟨M ⟩ = M M p(M ) is initialized as ⟨M ⟩ = 2h
P
L N pairs, corresponding
to the number obtained for an uniform spread of the particles in a d-dimensional space.
The probability of the QFRANS Qm−1algorithm of measuring the target state at the m-th iteration is given by the
probability p0 (m, M ) = pm (M ) i=1 (1 − pi (M )), with pi given by Eq. (12), which is a function of M itself. The
prior distribution (29) is updated by the Bayes rule as
p0 (m, M ) × p(M )
p(M |m) = P , (30)
M p0 (m, M )
Today’s quantum hardware is affected by noise and implementing any search algorithm in a scalable system is a
significant challenge [65, 66]. Only recently, a novel noise tolerant method has been developed to reduce the error
threshold for Grover’s search by optimizing the number of iterations [67].
In this section we employ a simple model in order to analyze the effect of the noise on the result of the QFRANS
algorithm assuming to have a quantum computer that can be characterized only by a bit-flip read-out error. This
assumption, though quite restrictive, can be used to characterize the noise resilience of the proposed algorithm.
Therefore, we consider that the state, after successful measurement of the ancilla qubit onto |1⟩, has collapsed into
the correct target state |ψT ⟩ and that noise affects the algorithm only in the measurement part. We will see that this
assumption does not change our proposed strategy which does not rely on error correction codes.
The bit-flip channel noise flips the state of a qubit with a probability 1 − p = error rate, where p is the probability
of getting the correct result. This error can affect the two registers that are actually measured, both with q0 qubits,
which give the labels corresponding to the two particles. Starting from the states |i⟩q0 , |j⟩q0 , the overall probability
of successfully measuring the exact indexes i and j is given by p2q0 . As a result, if we set a tolerance value TOL for
this probability, i.e. p2q0 ≥ TOL, the error rate must satisfy the following inequality:
1
error rate ≤ 1 − TOL 2q0 . (31)
Using reasonable values for the tolerance TOL = 0.99 and the number of qubits q0 = 30 to encode a large database of
N = 230 particles, the required error rate is in the order of 10−4 . Within this simplified model, the proposed algorithm
provides sufficiently accurate results within currently available error rates. In order to improve the robusteness of the
proposed algorithm to noise, post-measurement classical techniques can be employed. For instance, if the full reflection
operator is applied at the last iteration, measuring the distance register q1 state gives a direct way of performing error
detection, while this check requires only M operations.
One of the main bottleneck in N-body methods is the FRANS subroutine. This is due to cache misses that arise
when data that are close in space resides in distant memory locations. In this work we looked for a possible solution
using quantum computers.
Building on fixed point amplitude amplification we developed QFRANS, a quantum algorithm that finds all the
elements whose distance is less than a fixed radius ξ with the use of an ancilla qubit that signals if the quantum state
has reached the target. We developed an explicit encoding of the data that, by leveraging quantum superposition,
evaluate all the N 2 distances between particles. This allows to recover with O(M ) measures and a single quantum
circuit all the pairs of close neighbors, as opposed to classical FRANS algorithms that repeat the search process for
each particle with the risk of incurring in cache miss. One of the novelty of this work resides in the oracle, which is
efficiently implemented on a quantum circuit by using two comparators and has depth that scales linearly with the
number of qubits. We proposed two different versions of such oracle, that can be used in different scenarios. Our
algorithm is robust against bit-flip errors in the measurements process thanks to the encoding of data labels and
distances that allows an easy-to-implement error detection.
State preparation represents the major contribution to the circuit depth of our algorithm: for unstructured datasets,
current methods require either O(N ) circuit depth with p O(log N ) ancilla qubits, or O(log N ) depth using O(N ) an-
cilla qubits. The fixed-point algorithm necessitates N 2 /M circuit repetitions to achieve the target state, which,
when combined with the expectation of recovering all target states through O(M ) measurements, yields overall algo-
rithm scaling of O(M 1/2 N 2 ) depth with O(log N ) ancilla qubits, or O(M 1/2 N log N ) depth with O(N ) ancilla qubits,
depending on the encoding protocol. The scaling properties appear unfavorable compared to classical algorithms
in terms of computational complexity. However, we remark that classical FRANS implementations suffer from un-
predictable memory access patterns that trigger frequent cache misses, whereas QFRANS returns all neighbor pairs
through O(M ) single circuit execution. The current limitation of QFRANS lies in state preparation complexity, which
represents the most significant obstacle to practical implementation.
In future research more efficient state preparation algorithms will be investigated. Domain decomposition strategies
may offer computational advantages, though at the cost of requiring multiple circuit implementations. Additionally,
detailed studies of specific applications—such as cosmological simulations or molecular dynamics may be used to
validate the algorithm’s practical utility. Understanding error propagation and developing robust mitigation strategies
will be considered in future studies. QFRANS represents a promising foundation for quantum algorithms addressing
neighbor search problems. This work suggests that quantum computing may offer genuine advantages in computational
15
domains where classical methods face fundamental memory hierarchy limitations rather than purely algorithmic
constraints.
ACKNOWLEDGMENTS
We thank Simona Perotto, Filippo Marchetti, Stefano Borgani and Luigi Iapichino for valuable discussions. LC
and GM have benefited scientifically from the collaboration with the Spoke 10 INAF group. LC, GM, CS and
SS acknowledge financial support form the Italian National Center for HPC, Big Data and Quantum Computing
(CN00000013).
[1] V. Turau, Fixed-radius near neighbors search, Information Processing Letters 39, 201 (1991).
[2] L. Verlet, Computer ”Experiments” on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules,
Physical Review 159, 98 (1967).
[3] R. A. Gingold and J. J. Monaghan, Smoothed particle hydrodynamics: theory and application to non-spherical
stars, Monthly Notices of the Royal Astronomical Society 181, 375 (1977), [Link]
pdf/181/3/375/3104055/[Link].
[4] V. Springel, N. Yoshida, and S. D. White, Gadget: a code for collisionless and gasdynamical cosmological simulations,
New Astronomy 6, 79 (2001).
[5] M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, and E. Lindahl, Gromacs: High performance molecular
simulations through multi-level parallelism from laptops to supercomputers, SoftwareX 1-2, 19 (2015).
[6] G. Di Ilio, D. Chiappini, S. Ubertini, G. Bella, and S. Succi, Fluid flow around naca 0012 airfoil at low-reynolds numbers
with hybrid lattice boltzmann method, Computers & Fluids 166, 200 (2018).
[7] Z. Yao, J.-S. Wang, G.-R. Liu, and M. Cheng, Improved neighbor list algorithm in molecular simulations using cell
decomposition and data sorting method, Computer Physics Communications 161, 27 (2004).
[8] J. Barnes and P. Hut, A hierarchical O(N log N) force-calculation algorithm, Nature 10.1038/324446a0 (1986).
[9] X. Chen and S. Güttel, Fast and exact fixed-radius neighbor search based on sorting (2024), arXiv:2212.07679 [cs].
[10] L. Wang, Y. Wu, Y. Deng, B. Kim, L. Pierce, G. Krilov, D. Lupyan, S. Robinson, M. K. Dahlgren, J. Greenwood, D. L.
Romero, C. Masse, J. L. Knight, T. Steinbrecher, T. Beuming, W. Damm, E. Harder, W. Sherman, M. Brewer, R. Wester,
M. Murcko, L. Frye, R. Farid, T. Lin, D. L. Mobley, W. L. Jorgensen, B. J. Berne, R. A. Friesner, and R. Abel, Accurate
and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy
calculation protocol and force field, Journal of the American Chemical Society 137, 2695 (2015).
[11] Y. Shan, E. T. Kim, M. P. Eastwood, R. O. Dror, M. A. Seeliger, and D. E. Shaw, How does a drug molecule find its
target binding site?, Journal of the American Chemical Society 133, 9181 (2011).
[12] P. J. Bond and M. S. Sansom, Insertion and assembly of membrane proteins via simulation, Journal of the American
Chemical Society 10.1021/ja0569104 (2006).
[13] C. Liu, L. Xue, and C. Song, Calcium binding and permeation in trpv channels: Insights from molecular dynamics
simulations, Journal of General Physiology 155, e202213261 (2023), epub 2023 Sep 20.
[14] H. Zhao, P. Duan, Z. Li, Q. Chen, T. Yue, L. Zhang, V. Ganesan, and J. Liu, Unveiling the multiscale dynamics of polymer
vitrimers via molecular dynamics simulations, Macromolecules 56, 9336 (2023).
[15] M. P. Howard, J. A. Anderson, A. Nikoubashman, S. C. Glotzer, and A. Z. Panagiotopoulos, Efficient neighbor list cal-
culation for molecular simulation of colloidal systems using graphics processing units, Computer Physics Communications
203, 45 (2016).
[16] A. J. Proctor, T. J. Lipscomb, A. Zou, J. A. Anderson, and S. S. Cho, Performance Analyses of a Parallel Verlet Neigh-
bor List Algorithm for GPU-Optimized MD Simulations, in 2012 ASE/IEEE International Conference on BioMedical
Computing (BioMedCom) (2012) pp. 14–19.
[17] F. Chiti and C. M. Dobson, Protein misfolding, amyloid formation, and human disease: A summary of progress over the
last decade, Annual Review of Biochemistry 86, 27 (2017).
[18] K. Lindorff-Larsen, S. Piana, R. O. Dror, and D. E. Shaw, How fast-folding proteins fold, Science 334, 517 (2011).
[19] J. e. a. Abramson, Accurate structure prediction of biomolecular interactions with alphafold 3, Nature 630, 493 (2024).
[20] K. Schütze, M. Heinzinger, M. Steinegger, and B. Rost, Nearest neighbor search on embeddings rapidly identifies distant
protein relations, Frontiers in Bioinformatics 2, 1033775 (2022).
[21] M. van Kempen, S. S. Kim, C. Tumescheit, M. Mirdita, J. Lee, C. L. M. Gilchrist, J. Söding, and M. Steinegger, Fast and
accurate protein structure search with foldseek, Nature Biotechnology 42, 243 (2024).
[22] P. Goloubinoff, M. Sulpizi, A. Cappellaro, J. Dutheil, H. A. Nguyen, and P. De Los Rios, Chaperones convert the energy
from atp into the nonequilibrium stabilization of native proteins, Nature Chemical Biology 14, 388 (2018).
[23] S. Assenza, A. S. Sassi, R. Kellner, P. De Los Rios, and A. Barducci, Efficient conversion of chemical energy into mechanical
work by hsp70 chaperones, Physical Review X 9, 041033 (2019).
16
[24] P. De Los Rios and A. Barducci, Non-equilibrium protein folding and activation by atp-driven chaperones, Nature Reviews
Molecular Cell Biology 15, 611 (2014).
[25] E. M. Sontag, R. S. Samant, and J. Frydman, Mechanisms and functions of spatial protein quality control, Annual Review
of Biochemistry 86, 97 (2017).
[26] C. L. Klaips, G. G. Jayaraj, and F. U. Hartl, Pathways of cellular proteostasis in aging and disease, Journal of Cell Biology
217, 51 (2018).
[27] A. C. Crespo, J. M. Dominguez, A. Barreiro, M. Gómez-Gesteira, and B. D. Rogers, Gpus, a new tool of acceleration in
cfd: Efficiency and reliability on smoothed particle hydrodynamics methods, PLOS ONE 6, 1 (2011).
[28] T. L. Cassell, T. Deakin, A. Alpay, V. Heuveline, and G. B. Gadeschi, Efficient tree-based parallel algorithms for n-body
simulations using c++ standard parallelism, in SC24-W: Workshops of the International Conference for High Performance
Computing, Networking, Storage and Analysis (2024) pp. 708–717.
[29] supercomputer fugaku - fujitsu.
[30] T. Papatheodore, Summit architecture overview.
[31] J. I. Read and T. Hayfield, Sphs: smoothed particle hydrodynamics with a higher order dissipation switch,
Monthly Notices of the Royal Astronomical Society 422, 3037 (2012), [Link]
pdf/422/4/3037/18598311/[Link].
[32] M. Castelli, F. Marchetti, S. Osuna, A. S. F. Oliveira, A. J. Mulholland, S. A. Serapian, and G. Colombo, Decrypting Al-
lostery in Membrane-Bound K-Ras4B Using Complementary In Silico Approaches Based on Unbiased Molecular Dynamics
Simulations, Journal of the American Chemical Society 146, 901 (2024), publisher: American Chemical Society.
[33] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd ed. (Academic Press,
2002) Chap. 3.
[34] E. F. Combarro, I. F. Rúa, F. Orts, G. Ortega, A. M. Puertas, and E. M. Garzón, Quantum algorithms to compute the
neighbour list of N-body simulations, Quantum Information Processing 23, 61 (2024).
[35] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition (Cambridge
University Press, 2010).
[36] Y. Cao, J. Romero, J. P. Olson, M. Degroote, P. D. Johnson, M. Kieferová, I. D. Kivlichan, T. Menke, B. Peropadre,
N. P. D. Sawaya, S. Sim, L. Veis, and A. Aspuru-Guzik, Quantum chemistry in the age of quantum computing, Chemical
Reviews 119, 10856 (2019).
[37] Y. e. a. Alexeev, Quantum-centric supercomputing for materials science: A perspective on challenges and future directions,
Future Generation Computer Systems 160, 666 (2024).
[38] L. Cappelli, F. Tacchino, G. Murante, S. Borgani, and I. Tavernelli, From vlasov-poisson to schrödinger-poisson: Dark
matter simulation with a quantum variational time evolution algorithm, Phys. Rev. Res. 6, 013282 (2024).
[39] S. Succi, W. Itani, K. Sreenivasan, and R. Steijl, Quantum computing for fluids: Where do we stand?, Europhysics Letters
144, 10001 (2023).
[40] L. K. Grover, A fast quantum mechanical algorithm for database search (1996), arXiv:quant-ph/9605043.
[41] L. K. Grover, Fixed-Point Quantum Search, Physical Review Letters 95, 150501 (2005).
[42] A. Mizel, Critically Damped Quantum Search, Physical Review Letters 102, 150501 (2009).
[43] T. J. Yoder, G. H. Low, and I. L. Chuang, Fixed-Point Quantum Search with an Optimal Number of Queries, Phys. Rev.
Lett. 113, 210501 (2014).
[44] D. W. Berry, A. M. Childs, R. Cleve, R. Kothari, and R. D. Somma, Exponential improvement in precision for simulating
sparse Hamiltonians, in Proceedings of the forty-sixth annual ACM symposium on Theory of computing (2014) pp. 283–292,
arXiv:1312.1414 [quant-ph].
[45] B. Yan, S. Wei, H. Jiang, H. Wang, Q. Duan, Z. Ma, and G.-L. Long, Fixed-point oblivious quantum amplitude-
amplification algorithm, Scientific Reports 12, 14339 (2022), publisher: Nature Publishing Group.
[46] A. A. Zecchi, C. Sanavio, S. Perotto, and S. Succi, Improved amplitude amplification strategies for the quantum simulation
of classical transport problems (2025), arXiv:2502.18283 [quant-ph].
[47] A. Gilyén, Y. Su, G. H. Low, and N. Wiebe, Quantum singular value transformation and beyond: exponential improvements
for quantum matrix arithmetics, in Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing
(ACM, Phoenix AZ USA, 2019) pp. 193–204.
[48] J. M. Martyn, Z. M. Rossi, A. K. Tan, and I. L. Chuang, Grand Unification of Quantum Algorithms, PRX Quantum 2,
040203 (2021).
[49] G. Brassard, P. Høyer, and A. Tapp, Quantum counting, in Automata, Languages and Programming: 25th International
Colloquium, ICALP’98 Aalborg, Denmark, July 13–17, 1998 Proceedings 25 (Springer, 1998) pp. 820–831.
[50] S. Lee and S. Y. Nam, Finding all solutions with grover’s algorithm by integrating estimation and discovery, Electronics
13, 10.3390/electronics13234830 (2024).
[51] E. Malvetti, R. Iten, and R. Colbeck, Quantum Circuits for Sparse Isometries, Quantum 5, 412 (2021), arXiv:2006.00016
[quant-ph].
[52] F. Mozafari, H. Riener, M. Soeken, and G. De Micheli, Efficient Boolean Methods for Preparing Uniform Quantum States,
IEEE Transactions on Quantum Engineering 2, 1 (2021).
[53] X.-M. Zhang, M.-H. Yung, and X. Yuan, Low-depth quantum state preparation, Physical Review Research 3, 043200
(2021), publisher: American Physical Society.
[54] X.-M. Zhang, T. Li, and X. Yuan, Quantum State Preparation with Optimal Circuit Depth: Implementations and Appli-
cations, Physical Review Letters 129, 230504 (2022), publisher: American Physical Society.
[55] X.-M. Zhang and X. Yuan, Circuit complexity of quantum access models for encoding classical data, npj Quantum Infor-
17