A Rigorous Analysis of Query Processing Algorithms on
Arrays and Trees
Part I: Foundational Principles of Query Problem Design
The effective solution of query-based problems, a cornerstone of competitive
programming and algorithm design, hinges on a deep understanding of the interplay
between the problem's constraints and the capabilities of various data structures.
The choice of an algorithm is rarely arbitrary; it is a direct consequence of the
mathematical properties of the query function, the required order of operations, and
the nature of data modifications. This report provides a systematic analysis of five
pivotal algorithms—Segment Tree, Fenwick Tree, Mo's Algorithm, Square Root
Decomposition, and Heavy-Light Decomposition—across a spectrum of 30 canonical
query problems. Before delving into problem-specific applications, it is imperative to
establish the foundational principles that govern these choices. This section formally
defines the key algebraic properties and operational paradigms that form the
theoretical bedrock for all subsequent analysis.
1. A Lexicon of Query Function Properties
At the heart of any range query problem is a binary operation, ⊕, used to combine
elements or sub-solutions. The algebraic properties of this operation—or lack
thereof—are the most significant factor in determining algorithmic feasibility.
1.1. Associativity: The Grouping Property
Definition: A binary operation ⊕ on a set S is associative if, for all a,b,c∈S, the
following holds:
(a⊕b)⊕c=a⊕(b⊕c)
Significance: Associativity is the single most important prerequisite for the validity of
Segment Trees. A Segment Tree is a data structure built upon the
divide-and-conquer paradigm. A node representing an interval [l,r] stores the
aggregated result for that range, which is computed by combining the results from its
two children, representing [l,m] and [m+1,r] respectively, where m is the midpoint.
The computation result(parent) = result(left_child) ⊕ result(right_child)
implicitly groups the elements. Associativity guarantees that this grouping, and any
other grouping that the Segment Tree's query algorithm might produce by combining
different nodes, yields the same, correct result.
Without associativity, the entire hierarchical structure would be invalid, as the result
would depend on the specific way the range is partitioned. Most common range
query functions are associative, including:
● Sum: (a+b)+c=a+(b+c)
● Product: (a×b)×c=a×(b×c)
● Minimum/Maximum: min(min(a,b),c)=min(a,min(b,c))
● Greatest Common Divisor (GCD): gcd(gcd(a,b),c)=gcd(a,gcd(b,c))
● Bitwise AND, OR, XOR: (a∧b)∧c=a∧(b∧c)
● Matrix Multiplication: (A×B)×C=A×(B×C)
● String Concatenation: ("a"+"b")+"c"="a"+("b"+"c")
An example of a non-associative operation is the arithmetic mean: avg(avg(6, 10),
2) = avg(8, 2) = 5, whereas avg(6, avg(10, 2)) = avg(6, 6) = 6. A Segment
Tree cannot be used to directly compute range averages.
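A common workaround for a non-associative target function is to aggregate a richer summary that *is* associative and derive the final answer at the end. A minimal C++ sketch (the names `SumCount`, `merge`, and `range_average` are illustrative, not from any library): the pair (sum, count) merges associatively, so a Segment Tree could store such pairs and report sum/count at query time.

```cpp
#include <cassert>
#include <vector>

// The average itself is not associative, but the pair (sum, count) is:
// merging two pairs component-wise groups freely, and the average of a
// range is recovered at the very end as sum / count.
struct SumCount {
    long long sum = 0;
    long long count = 0;
};

SumCount merge(const SumCount& a, const SumCount& b) {
    return {a.sum + b.sum, a.count + b.count};
}

double range_average(const std::vector<int>& a, int l, int r) {  // inclusive
    SumCount acc;
    for (int i = l; i <= r; ++i) acc = merge(acc, {a[i], 1});
    return static_cast<double>(acc.sum) / acc.count;
}
```

Because `merge` is associative, the same pair type works unchanged as a Segment Tree node value.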
1.2. Commutativity: The Ordering Property
Definition: A binary operation ⊕ on a set S is commutative if, for all a,b∈S, the
following holds:
a⊕b=b⊕a
Significance: Commutativity dictates whether the order of operands affects the
result. Most of the associative operations listed above (sum, product, min, max,
GCD, bitwise operations) are also commutative. However, commutativity is not a
strict requirement for a standard Segment Tree. The structure naturally combines the
left child's result with the right child's result in a fixed order.
The property becomes more relevant in advanced scenarios or with different data
structures. For example, string concatenation and matrix multiplication are
associative but famously non-commutative. A Segment Tree can handle range
concatenation perfectly well, as it always processes sub-ranges in a consistent
left-to-right order.
1.3. Idempotency: The Overlap Property
Definition: A binary operation ⊕ on a set S is idempotent if, for all a∈S, the
following holds:
a⊕a=a
Significance: Idempotency is the key property that enables the remarkable O(1)
query time of a Sparse Table data structure. The operations min, max, GCD, bitwise
AND, and bitwise OR are idempotent. Operations like sum, product, and XOR are not.
A Sparse Table works by pre-calculating the answers for all ranges whose lengths
are powers of two. To answer a query for an arbitrary range [l,r], it finds the largest
power of two, 2ᵏ, such that 2ᵏ ≤ (r−l+1). It then combines the precomputed answers
for two ranges: [l, l+2ᵏ−1] and [r−2ᵏ+1, r]. These two ranges are guaranteed to
cover the entire query range [l,r], but they may overlap in the middle.
If the operation is idempotent, this overlap is harmless. For instance, with min,
calculating the minimum over an element twice does not change the result:
min(min(S₁), min(S₂)) = min(S₁ ∪ S₂). However, for a non-idempotent operation like
sum, this would lead to
double-counting the elements in the overlapping region, producing an incorrect
answer. This distinction perfectly illustrates why a Sparse Table is the optimal choice
for static Range Minimum Query (RMQ) but cannot be used for static Range Sum
Query with the same O(1) trick.
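The overlapping two-interval query that idempotency permits can be sketched as follows (an illustrative C++ implementation; the `SparseTable` name and layout are not from any particular library):

```cpp
#include <cassert>
#include <vector>
#include <algorithm>

// Sparse table for static RMQ: O(N log N) preprocessing, O(1) per query.
// Correctness of the query relies on min being idempotent.
struct SparseTable {
    std::vector<std::vector<int>> table;  // table[k][i] = min of a[i .. i+2^k-1]
    std::vector<int> log2_floor;

    explicit SparseTable(const std::vector<int>& a) {
        int n = a.size();
        log2_floor.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) log2_floor[i] = log2_floor[i / 2] + 1;
        int levels = log2_floor[n] + 1;
        table.assign(levels, std::vector<int>(n));
        table[0] = a;
        for (int k = 1; k < levels; ++k)
            for (int i = 0; i + (1 << k) <= n; ++i)
                table[k][i] = std::min(table[k - 1][i],
                                       table[k - 1][i + (1 << (k - 1))]);
    }

    // Minimum of a[l..r], inclusive. The two length-2^k blocks may
    // overlap; idempotency of min makes the overlap harmless.
    int query(int l, int r) const {
        int k = log2_floor[r - l + 1];
        return std::min(table[k][l], table[k][r - (1 << k) + 1]);
    }
};
```

Replacing `std::min` with `std::max` or `std::gcd` gives the other idempotent variants; replacing it with `+` would double-count the overlap and is exactly what this structure cannot do.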
1.4. Invertibility: The Reversal Property
Definition: For a binary operation ⊕ with an identity element id, an element a has an
inverse a⁻¹ if a ⊕ a⁻¹ = a⁻¹ ⊕ a = id. In the context of range queries, a more
practical definition of an invertible operation is one that allows the result of a range
[l,r] to be computed from prefix results. That is, if P(i) is the result of
A[1] ⊕ A[2] ⊕ ... ⊕ A[i], we can find query(l, r)
using only P(r) and P(l-1).
Significance: Invertibility is the defining requirement for the standard Fenwick Tree
(or Binary Indexed Tree, BIT). A Fenwick Tree is fundamentally optimized to perform
two operations in O(logN) time: updating a point and querying a prefix aggregate.
To compute an aggregate over a general range [l,r], it relies on the existence of an
inverse operation ⊖ such that:
query(l,r)=prefix_aggregate(r)⊖prefix_aggregate(l−1)
● For Range Sum, the operation is addition, and its inverse is subtraction.
sum(l, r) = prefix_sum(r) - prefix_sum(l-1).
● For Range XOR Sum, the operation is XOR, which is its own inverse.
xor_sum(l, r) = prefix_xor(r) ^ prefix_xor(l-1).
● For Range Product, the operation is multiplication, and its inverse is division
(if working in a field where division is well-defined, e.g., modulo a prime).
This requirement is the primary limitation of Fenwick Trees compared to Segment
Trees. Operations like min, max, and GCD do not have a general inverse in this sense.
Knowing min(A[0..r]) and min(A[0..l−1]) does not allow one to compute
min(A[l..r]). This is why a standard Fenwick Tree cannot solve RMQ or Range
GCD problems, making Segment Trees the more versatile choice for non-invertible
associative functions.
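The prefix/inverse mechanism can be made concrete with a minimal Fenwick Tree sketch (illustrative C++; 1-indexed to match A[1..N] above):

```cpp
#include <cassert>
#include <vector>

// Fenwick tree (BIT) over an invertible operation (here: addition).
// Point update and prefix query run in O(log N); a range sum is recovered
// as prefix(r) - prefix(l-1), which is exactly where the inverse is used.
struct Fenwick {
    std::vector<long long> tree;  // 1-indexed internally

    explicit Fenwick(int n) : tree(n + 1, 0) {}

    void update(int i, long long delta) {   // add delta to A[i], 1-indexed
        for (; i < (int)tree.size(); i += i & -i) tree[i] += delta;
    }

    long long prefix(int i) const {         // aggregate of A[1..i]
        long long s = 0;
        for (; i > 0; i -= i & -i) s += tree[i];
        return s;
    }

    long long query(int l, int r) const {   // aggregate of A[l..r]
        return prefix(r) - prefix(l - 1);   // subtraction = inverse of +
    }
};
```

Swapping `+=`/`-` for `^=`/`^` yields the XOR variant discussed later; no such swap exists for min, max, or gcd, which is precisely the limitation described above.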
2. The Operational Paradigm: Constraints and Capabilities
Beyond the mathematical nature of the query function, the structure of the
problem—how queries arrive and how data changes—imposes further constraints
that guide algorithm selection.
2.1. Online vs. Offline Algorithms
Definition: An online algorithm must process its input and answer queries in the
order they are received, without any knowledge of future inputs or queries. An offline
algorithm is permitted to read the entire sequence of queries in advance before it
begins processing any of them. This allows the algorithm to reorder or batch queries
to optimize computation.
Significance: This is the most critical distinction for determining the applicability of
Mo's Algorithm.
● Mo's Algorithm is fundamentally an offline technique. Its efficiency is
derived entirely from its clever query sorting strategy, which groups queries by
blocks and then by their right endpoint to minimize the total movement of its
two pointers. This reordering is only possible if all queries are known
beforehand.
● Segment Trees and Fenwick Trees are inherently online. They are built to
handle any valid query or update at any time, making them suitable for
interactive problems where the next query may depend on the answer to the
previous one.
● Certain problems that are difficult to solve online can be solved efficiently
offline using a sweep-line approach, often paired with a data structure like a
Fenwick Tree. This also involves reordering queries (and other "events")
based on a coordinate, such as their right endpoint.
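The offline reordering at the heart of Mo's Algorithm can be sketched as follows. This illustrative C++ (the names `Query` and `mo_distinct` are my own) maintains a distinct-value count under O(1) add/remove, sorts queries by (block of l, then r) as described above, and assumes non-negative array values so a plain frequency array suffices:

```cpp
#include <cassert>
#include <vector>
#include <algorithm>
#include <cmath>

struct Query { int l, r, idx; };  // inclusive bounds, 0-indexed

// Mo's algorithm for "count distinct values in [l, r]" queries.
// Total complexity O((N + Q) * sqrt(N)) pointer moves, each O(1).
std::vector<int> mo_distinct(const std::vector<int>& a,
                             std::vector<Query> queries) {
    int n = a.size();
    int block = std::max(1, (int)std::sqrt((double)n));
    // The characteristic sort: by block of l, ties broken by r.
    std::sort(queries.begin(), queries.end(),
              [&](const Query& x, const Query& y) {
                  if (x.l / block != y.l / block) return x.l / block < y.l / block;
                  return x.r < y.r;
              });

    int maxv = *std::max_element(a.begin(), a.end());
    std::vector<int> freq(maxv + 1, 0);   // assumes values in [0, maxv]
    std::vector<int> answers(queries.size());
    int distinct = 0;
    int curL = 0, curR = -1;              // current range starts empty

    auto add = [&](int i) { if (freq[a[i]]++ == 0) ++distinct; };
    auto del = [&](int i) { if (--freq[a[i]] == 0) --distinct; };

    for (const Query& q : queries) {
        while (curR < q.r) add(++curR);   // grow before shrinking
        while (curL > q.l) add(--curL);
        while (curR > q.r) del(curR--);
        while (curL < q.l) del(curL++);
        answers[q.idx] = distinct;
    }
    return answers;
}
```

Storing the original index in each query lets the answers be reported in input order despite the reordering, which is what makes the technique offline-only.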
2.2. Static vs. Dynamic Problems
Definition: A static problem is one where the underlying dataset (e.g., the array)
does not change throughout the execution of all queries. A dynamic problem
involves updates to the dataset that are interspersed with queries.
Significance: This distinction governs the trade-off between preprocessing and
flexibility.
● For static problems, algorithms with heavy preprocessing but extremely fast
queries are often optimal. The canonical example is using a Sparse Table for
static RMQ, which requires O(NlogN) preprocessing but then answers
queries in O(1).
● For dynamic problems, the data structure must be able to efficiently
incorporate changes. Segment Trees and Fenwick Trees excel here,
typically handling point updates in O(logN) time. A Sparse Table is unsuitable
for dynamic problems because a single point update would require rebuilding
a significant portion of the table, an operation that is far too slow. Mo's
algorithm is designed for static problems, though an extension ("Mo's with
updates") exists to handle a limited number of point updates by adding a
"time" dimension, albeit with increased complexity.
2.3. Point vs. Range Operations and Lazy Propagation
Definition: An update operation can modify a single element (point update) or a
contiguous block of elements (range update). A query can likewise request
information about a single point or an entire range.
Significance: The combination of range updates and range queries introduces a
significant challenge. A naive range update on a Segment Tree or Fenwick Tree
would involve updating every element in the range individually, leading to a
complexity of at least O(N) per update, which negates the efficiency of the data
structure.
Lazy Propagation is the standard and powerful technique used with Segment Trees
to handle range updates efficiently. The core principle is to defer work. When an
update is applied to a range [l,r], instead of propagating the change all the way
down to the leaf nodes, the update information is stored as a "lazy tag" at the
highest-level nodes in the tree that are fully contained within [l,r]. This pending
update is only "pushed down" to the children of a node when a subsequent query or
update needs to access information within that node's subtree.
This amortization strategy reduces the complexity of both range updates and range
queries to O(logN). This technique is crucial for a wide class of problems and
represents a key advantage of Segment Trees. While some tricks exist to handle
range updates with Fenwick Trees, they are generally less flexible and cannot
handle the same breadth of operations (e.g., range set to value, range min/max
update) as a Segment Tree with lazy propagation can.
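The deferral described above can be sketched concretely for range add with range max queries (an illustrative C++ implementation, not a library API; this is also the combination used in Problem 5 of Part II):

```cpp
#include <cassert>
#include <climits>
#include <vector>
#include <algorithm>

// Segment tree with lazy propagation: range add + range max, both O(log N).
// Each node stores the max of its range plus a pending addition ("lazy tag")
// already applied to the node itself but not yet to its children.
struct LazySegTree {
    int n;
    std::vector<long long> maxv, lazy;

    explicit LazySegTree(const std::vector<long long>& a)
        : n(a.size()), maxv(4 * a.size(), 0), lazy(4 * a.size(), 0) {
        build(1, 0, n - 1, a);
    }

    void build(int node, int lo, int hi, const std::vector<long long>& a) {
        if (lo == hi) { maxv[node] = a[lo]; return; }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, a);
        build(2 * node + 1, mid + 1, hi, a);
        maxv[node] = std::max(maxv[2 * node], maxv[2 * node + 1]);
    }

    void push_down(int node) {  // move the pending addition to the children
        for (int child : {2 * node, 2 * node + 1}) {
            maxv[child] += lazy[node];
            lazy[child] += lazy[node];
        }
        lazy[node] = 0;
    }

    void range_add(int l, int r, long long d) { range_add(1, 0, n - 1, l, r, d); }
    void range_add(int node, int lo, int hi, int l, int r, long long d) {
        if (r < lo || hi < l) return;
        if (l <= lo && hi <= r) { maxv[node] += d; lazy[node] += d; return; }
        push_down(node);
        int mid = (lo + hi) / 2;
        range_add(2 * node, lo, mid, l, r, d);
        range_add(2 * node + 1, mid + 1, hi, l, r, d);
        maxv[node] = std::max(maxv[2 * node], maxv[2 * node + 1]);
    }

    long long query_max(int l, int r) { return query_max(1, 0, n - 1, l, r); }
    long long query_max(int node, int lo, int hi, int l, int r) {
        if (r < lo || hi < l) return LLONG_MIN;
        if (l <= lo && hi <= r) return maxv[node];
        push_down(node);
        int mid = (lo + hi) / 2;
        return std::max(query_max(2 * node, lo, mid, l, r),
                        query_max(2 * node + 1, mid + 1, hi, l, r));
    }
};
```

Note that a fully covered node absorbs the update in O(1) and the recursion stops there; `push_down` is only paid later, when a query or update actually descends past that node.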
Part II: Comprehensive Analysis of Array-Based Query
Problems
This section provides a rigorous, problem-by-problem analysis of the applicability of
the five selected algorithms to a set of 15 array-based query challenges. Each
analysis is grounded in the foundational principles established in Part I.
1. Range Sum Query with Point Updates
Problem Definition and Properties:
● Statement: Given an array A[1..N], support two operations: update(i,
delta) which adds a value delta to A[i], and query(l, r) which returns the
sum of elements from index l to r, i.e., A[l] + A[l+1] + ... + A[r].
● Properties: This is a classic online, dynamic problem. The query function,
sum, is associative, commutative, and, most importantly, invertible (its inverse
is subtraction).
Algorithmic Applicability Analysis:
● Segment Tree: Applicable and Recommended.
● Applicability: The sum function is associative, which is the core
requirement for a Segment Tree. Each node in the tree stores the sum
of its corresponding range.
● Sketch: A point update update(i, delta) involves traversing from the
root to the leaf corresponding to index i, adding delta to the value of
each node along this path. This takes O(logN) time. A range query
query(l, r) performs a standard Segment Tree query, combining the
sums of O(logN) nodes that cover the range [l,r], also in O(logN)
time.
● Reasoning: The Segment Tree is a general-purpose tool for this
problem. While slightly more complex to implement and having a larger
constant factor than a Fenwick Tree, its generality makes it a robust
choice, especially if the problem might later be extended to include
non-invertible operations like range minimum.
● Fenwick Tree (BIT): Applicable and Highly Recommended.
● Applicability: The sum function is invertible, which is the core
requirement for a Fenwick Tree. A range sum can be calculated from
two prefix sums: sum(l,r)=prefix_sum(r)−prefix_sum(l−1).
● Sketch: The Fenwick Tree is maintained on the array values. An
update(i, delta) operation updates all prefix sums that include index
i in O(logN) time. A query(l, r) is answered by computing
prefix_sum(r) and prefix_sum(l-1), each in O(logN) time.
● Reasoning: For this specific problem, the Fenwick Tree is the ideal
solution. It is simpler to code, requires less memory (typically N vs 4N
for a Segment Tree), and is often faster in practice due to smaller
constant factors.
● Mo's Algorithm: Not Applicable.
● Applicability: The base form of Mo's algorithm is designed for static
arrays and offline queries.
● Reasoning: This problem is dynamic and online, with updates
interspersed with queries. While an extension called "Mo's with
updates" exists, it adds a time dimension and significant complexity,
making it far inferior to the logarithmic solutions offered by Segment
and Fenwick Trees for this fundamental problem.
● Square Root Decomposition: Applicable (but less efficient).
● Applicability: This technique can handle both point updates and range
queries.
● Sketch: The array is divided into roughly √N blocks of size √N. An
auxiliary array stores the sum of each block. A point update
update(i, delta) modifies the value in the original array and adds
delta to the sum of the corresponding block, taking O(1) time. A range
query query(l, r) sums elements naively in the partial blocks at the
boundaries of the range and adds the precomputed sums of the full
blocks in between. This takes O(√N) time.
● Reasoning: While functional, the O(√N) query time is significantly
slower than the O(logN) time of tree-based structures. It is generally
not preferred unless implementation simplicity is paramount or
constraints are small.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: HLD is a technique for decomposing trees into paths to
answer queries on trees. It is not relevant for problems on a
one-dimensional array.
Recommended Approach and Strategic Insights:
● The Fenwick Tree is the most practical and efficient solution for this problem.
● The Segment Tree is also an excellent choice and offers more flexibility if the
problem requirements were to change (e.g., to find the range minimum). This
problem serves as the introductory example for both data structures,
highlighting the trade-off between the Fenwick Tree's simplicity and
specialization versus the Segment Tree's generality.
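For reference, the Segment Tree alternative for this problem can be sketched compactly in the bottom-up (iterative) layout; this is one common implementation style under the assumptions of 0-indexed positions, not the only one:

```cpp
#include <cassert>
#include <vector>

// Iterative segment tree for point add + range sum.
// The merge is plain addition; associativity is the only property used,
// so swapping '+' for min, max, or gcd gives the other variants.
struct SegTree {
    int n;
    std::vector<long long> tree;  // leaves live at indices [n, 2n)

    explicit SegTree(const std::vector<long long>& a)
        : n(a.size()), tree(2 * a.size(), 0) {
        for (int i = 0; i < n; ++i) tree[n + i] = a[i];
        for (int i = n - 1; i >= 1; --i) tree[i] = tree[2 * i] + tree[2 * i + 1];
    }

    void update(int i, long long delta) {  // A[i] += delta, 0-indexed
        for (i += n; i >= 1; i /= 2) tree[i] += delta;  // leaf, then ancestors
    }

    long long query(int l, int r) const {  // sum of A[l..r], inclusive
        long long res = 0;
        for (l += n, r += n + 1; l < r; l /= 2, r /= 2) {
            if (l & 1) res += tree[l++];   // l is a right child: take it
            if (r & 1) res += tree[--r];   // r is past a left child: take it
        }
        return res;
    }
};
```

Compared with the Fenwick Tree above, the code is longer and uses 2N storage, but the merge operation is an arbitrary associative function rather than an invertible one.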
2. Range Minimum Query (Static and Dynamic variants)
Problem Definition and Properties:
● Statement: Given an array A, support query(l, r) to find the minimum
value in A[l..r]. A dynamic version also supports update(i, val).
● Properties: The min function is associative and idempotent. It is not
invertible. The problem exists in both static (no updates) and dynamic (online
point updates) forms.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (for both static and dynamic).
● Applicability: The min function is associative. Each node in the
Segment Tree can store the minimum of its range.
● Sketch: The structure is identical to that for Range Sum, but the merge
operation is min(left_child_value, right_child_value). Point
updates and range queries both run in O(logN) time.
● Reasoning: The Segment Tree is the standard and most efficient
solution for the dynamic version of RMQ.
● Fenwick Tree (BIT): Not Applicable.
● Reasoning: As established in Part I, the min function is not invertible.
A Fenwick Tree's prefix-based mechanism cannot be used to isolate
the minimum of an arbitrary range [l,r] from the prefix minimums of
[0,r] and [0,l−1].
● Mo's Algorithm: Poor Fit.
● Applicability: Mo's algorithm could theoretically be applied to the static
version, but it is highly inefficient.
● Reasoning: The algorithm's efficiency relies on fast (O(1) or O(logN))
addition and removal of elements from the active range. Maintaining
the minimum of a dynamic set of elements requires an auxiliary data
structure like a balanced binary search tree or a std::multiset in
C++, where each addition or removal takes O(logN) time. This
elevates the complexity of Mo's algorithm to a non-competitive
O((N+Q)√N⋅logN), making it far slower than other available methods.
● Square Root Decomposition: Applicable (but less efficient).
● Applicability: Can solve both static and dynamic versions.
● Sketch: The array is divided into blocks, and each block's minimum is
precomputed. A query takes O(√N) time by checking individual
elements in partial blocks and precomputed minimums in full blocks. A
point update involves updating the element and then re-calculating the
minimum for its block, which can take up to O(√N) time, or O(1) in the
favorable case where the new value does not exceed the block's
current minimum.
● Reasoning: Slower than a Segment Tree but simpler to conceptualize
and implement.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: This is an array-based problem.
Recommended Approach and Strategic Insights:
● Static RMQ: The optimal solution is a Sparse Table. Its O(NlogN)
preprocessing time is amortized over many queries, each of which takes only
O(1) time. This is possible precisely because the min function is idempotent,
allowing the use of two overlapping intervals to cover the query range without
error.
● Dynamic RMQ: The Segment Tree is the standard and most efficient
solution, providing O(logN) complexity for both queries and updates.
● This problem is the canonical example for illustrating the power of
idempotency. The existence of an O(1) query solution for the static case
(Sparse Table) which is unavailable for non-idempotent functions like sum is a
crucial lesson in algorithm selection.
3. Range GCD Query
Problem Definition and Properties:
● Statement: Given an array A, answer query(l, r) to find the greatest
common divisor (GCD) of all elements in A[l..r]. An optional dynamic version
includes point updates.
● Properties: The gcd function is associative and commutative. Like min, it is
also idempotent since gcd(x,x)=x. It is not invertible in the sense required by
a Fenwick Tree.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (for both static and dynamic).
● Applicability: gcd is associative.
● Sketch: Each node in the Segment Tree stores the GCD of its range.
The merge operation is node.value = gcd(left_child.value,
right_child.value). Point updates and range queries are handled in
the standard O(logN⋅logV) time, where V is the maximum value in
the array (due to the complexity of the gcd calculation itself, though
often treated as near-constant).
● Reasoning: This is the standard solution for the dynamic version of the
problem.
● Fenwick Tree (BIT): Not Applicable.
● Reasoning: The gcd function is not invertible. There is no "un-GCD"
operation that allows one to compute gcd(A[l..r]) from gcd(A[0..r])
and gcd(A[0..l−1]).
● Mo's Algorithm: Not Ideal.
● Reasoning: Similar to RMQ, maintaining the GCD of a dynamic set of
numbers as elements are added and removed is not an O(1)
operation. While the GCD of a multiset can be updated, it is not trivial
and would remove the efficiency advantage of Mo's algorithm.
● Square Root Decomposition: Possible (but less efficient).
● Applicability: Can be used for static or dynamic versions.
● Sketch: Precompute the GCD for each block. Queries take
O(√N⋅logV) time. Updates take O(√N⋅logV) time to rebuild the
block's GCD.
● Reasoning: A valid but slower alternative to the Segment Tree.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● Static Range GCD: A Sparse Table is the optimal solution. Since gcd is
idempotent, the same O(1) query mechanism used for RMQ applies perfectly.
Preprocessing takes O(NlogN⋅logV).
● Dynamic Range GCD: The Segment Tree is the standard and most efficient
solution, with O(logN⋅logV) complexity for operations.
● This problem reinforces the concept of idempotency. Recognizing that gcd
shares this property with min and max immediately suggests that a Sparse
Table is the best approach for the static case.
4. Range XOR Query with Point Updates
Problem Definition and Properties:
● Statement: Given an array A, support update(i, val) (either setting A[i] to
val or XORing A[i] with val) and query(l, r) which returns the bitwise XOR
sum of elements from l to r.
● Properties: The XOR operation (^) is associative and commutative. Crucially,
it is its own inverse, since a ⊕ a = 0, where 0 is the identity element for XOR. This makes
it perfectly suited for prefix-based calculations.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable.
● Applicability: XOR is associative.
● Sketch: Each node stores the XOR sum of its range. The merge
operation is node.value = left_child.value ^ right_child.value.
Both point updates and range queries take O(logN) time.
● Reasoning: A valid and general solution, but slightly overkill for this
specific problem.
● Fenwick Tree (BIT): Applicable and Highly Recommended.
● Applicability: XOR is an invertible operation. The XOR sum of a range
can be found using prefix XOR sums:
xor_sum(l,r) = prefix_xor(r) ⊕ prefix_xor(l−1).
● Sketch: A Fenwick Tree is used where the "addition" operation is
replaced with XOR. update(i, delta) performs XOR operations up
the tree. query(l, r) is computed as prefix_xor(r) ^
prefix_xor(l-1).
● Reasoning: This is the most natural, efficient, and simplest solution.
The properties of XOR align perfectly with the mechanics of a Fenwick
Tree.
● Mo's Algorithm: Possible (but unnecessary).
● Applicability: For a static array, Mo's algorithm can work. The current
XOR sum can be maintained in O(1) when adding or removing an
element (current_xor ^= element).
● Reasoning: While it works, it is an offline algorithm with a much higher
time complexity (O((N+Q)√N)) than the simple and fast online
solutions provided by Fenwick and
Segment Trees.
● Square Root Decomposition: Applicable (but less efficient).
● Sketch: Maintain the XOR sum for each block. Point updates are
O(1), and range queries are O(√N).
● Reasoning: A valid but slower approach.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● The Fenwick Tree is the canonical and best solution for this problem,
showcasing a perfect match between an operation's properties and a data
structure's design.
● This problem, alongside Range Sum, serves as a primary example of where a
Fenwick Tree should be the first choice due to its efficiency and simplicity. It
highlights that any group operation (associative, with an identity and inverses) can be
handled by a Fenwick Tree.
5. Range Maximum Query with Range Updates
Problem Definition and Properties:
● Statement: Given an array A, support two operations: range_add(l, r,
delta) which adds delta to every element in A[l..r], and query_max(l, r)
which returns the maximum value in that range.
● Properties: This is a classic online, dynamic problem involving both range
updates and range queries. The max function is associative. This problem is
the canonical use case for lazy propagation.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (with Lazy Propagation).
● Applicability: This is the intended solution.
● Sketch: Each node in the Segment Tree stores the maximum value of
its range and a lazy variable representing a pending addition. When
range_add is called, the tree is traversed. For any node whose range is
fully contained within the update range, we update its lazy tag and its
max value in O(1) and stop descending. For nodes that partially
overlap, we first "push down" their lazy tag to their children before
recursing. This ensures updates are propagated correctly. Both
range_add and query_max operations take O(logN) time.
● Reasoning: Lazy propagation on a Segment Tree is designed
precisely for this scenario, efficiently handling updates on large ranges
by deferring the work.
● Fenwick Tree (BIT): Not Applicable (Directly).
● Reasoning: A Fenwick Tree can be adapted to handle range updates
and point queries, or point updates and range queries. However, it
cannot handle both range updates and range queries for a non-linear
function like max. The effect of adding delta to a range does not
translate into a simple, invertible change on the prefix maximums: after
an update that covers only part of a prefix, the new prefix maximum
cannot be recovered from the old one, because the position of the
maximum element itself may change.
● Mo's Algorithm: Not Applicable.
● Reasoning: Mo's algorithm is for static arrays and cannot handle
range updates. The "Mo's with updates" variant is for point updates and
is already complex; adapting it for range updates would be
exceptionally difficult and inefficient.
● Square Root Decomposition: Applicable (with block-level lazy tags).
● Applicability: This provides a viable, if slower, alternative.
● Sketch: Each of the √N blocks maintains a lazy tag for pending
additions and the maximum value of the original elements within that
block. A range_add updates the lazy tags for full blocks and applies the
additions element-wise for partial blocks. A query_max finds the
maximum by considering block_max + block_lazy_tag for full blocks
and iterating through elements (plus the lazy tag) in partial blocks. Both
operations take O(√N) time.
● Reasoning: This is a good way to understand the concept of lazy
updates in a simpler, non-recursive context before tackling the
Segment Tree implementation.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● The Segment Tree with Lazy Propagation is the standard, most efficient,
and expected solution.
● This problem is fundamental for understanding why lazy propagation is a
necessary and powerful extension to the basic Segment Tree. It clearly
delineates the capabilities of Segment Trees from the more limited Fenwick
Trees when dealing with complex update-query combinations.
6. K-th Smallest in Range (Static Array)
Problem Definition and Properties:
● Statement: Given a static array A, answer offline queries of the form (l, r,
k): find the k-th smallest element in the subarray A[l..r].
● Properties: The array is static, and queries can be processed offline. The
query function is a non-trivial order statistic. It is not directly decomposable;
knowing the k-th smallest in two adjacent ranges does not easily yield the k-th
smallest in their union.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (as a Merge Sort Tree).
● Applicability: A standard Segment Tree is insufficient. However, a
variation called a Merge Sort Tree can solve this.
● Sketch: Each node in the Merge Sort Tree stores a sorted vector of all
elements in its corresponding range. The tree is built bottom-up,
merging the sorted vectors of children to form the parent's vector,
similar to the merge step in merge sort. A query for the k-th smallest in
[l,r] can be answered by binary searching on the answer. To check if a
value x is a potential answer, we count how many elements are less
than or equal to x in the range [l,r]. This count can be found in O(log²N)
time by querying the O(logN) relevant nodes in the tree and using
binary search (e.g., upper_bound) on each of their sorted vectors. The
overall query time is thus O(logV⋅log²N), where V is the range of
values, or more commonly, O(log²N) per query if binary searching on
the segment tree itself.
● Reasoning: This is a powerful, general online technique for such
queries, though it has a relatively high complexity and memory usage
(O(NlogN)).
● Fenwick Tree (BIT): Not Applicable (Directly).
● Reasoning: A standard Fenwick Tree cannot handle order statistic
queries. It would require significant augmentation, such as building a
BIT on a persistent segment tree or using it within another algorithm.
● Mo's Algorithm: Applicable.
● Applicability: The problem is static and offline, fitting Mo's
prerequisites.
● Sketch: We maintain the elements of the current range [l,r] in a data
structure that can efficiently find the k-th smallest element. A frequency
array (if values are small) or a hash map combined with a Fenwick Tree
or balanced BST over the values can work. As we add/remove
elements, we update this auxiliary structure. If we use a Fenwick Tree
on compressed values, each add/remove is O(logV), and finding the
k-th element involves a binary search on the BIT, also O(logV). The
total complexity is O((N+Q)√N⋅logV).
● Reasoning: This is a viable offline approach, especially if the range of
values is manageable.
● Square Root Decomposition: Possible (but complex).
● Sketch: Each block could store a sorted list of its elements. A query
would involve collecting all elements from the relevant blocks into a
temporary list and finding the k-th element. This is generally inefficient.
● Reasoning: Less practical than the Merge Sort Tree or Mo's algorithm.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● For online queries, a Merge Sort Tree or a Persistent Segment Tree (which
will be discussed later) are standard solutions.
● For offline queries, Mo's algorithm provides a different time-space trade-off
that can be effective. Another common offline approach involves binary
searching for the answer and using a BIT to answer "count elements ≤x"
queries, which takes O((N+Q)logNlogV).
● This problem marks a transition to more complex queries where simple
associative operations are not enough. It introduces the idea of augmenting
tree nodes with richer data structures (like sorted vectors) or using offline
processing to re-frame the problem.
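The Merge Sort Tree described above can be sketched as follows (illustrative C++; the class and method names are my own). Each count is O(log²N); this simple version binary-searches over the sorted array values rather than over the tree itself, giving roughly O(log³N) per k-th query, a constant-factor simplification of the O(log²N) in-tree search:

```cpp
#include <cassert>
#include <vector>
#include <algorithm>
#include <iterator>

// Merge sort tree: each node keeps a sorted copy of its range.
// count_leq visits O(log N) nodes and binary-searches each; an outer
// binary search finds the smallest value x with count_leq(l, r, x) >= k.
struct MergeSortTree {
    int n;
    std::vector<std::vector<int>> node;  // node[v]: sorted elements of node v
    std::vector<int> sorted_values;

    explicit MergeSortTree(const std::vector<int>& a)
        : n(a.size()), node(4 * a.size()), sorted_values(a) {
        std::sort(sorted_values.begin(), sorted_values.end());
        build(1, 0, n - 1, a);
    }

    void build(int v, int lo, int hi, const std::vector<int>& a) {
        if (lo == hi) { node[v] = {a[lo]}; return; }
        int mid = (lo + hi) / 2;
        build(2 * v, lo, mid, a);
        build(2 * v + 1, mid + 1, hi, a);
        std::merge(node[2 * v].begin(), node[2 * v].end(),       // merge-sort
                   node[2 * v + 1].begin(), node[2 * v + 1].end(),
                   std::back_inserter(node[v]));
    }

    // How many elements of A[l..r] are <= x.
    int count_leq(int v, int lo, int hi, int l, int r, int x) const {
        if (r < lo || hi < l) return 0;
        if (l <= lo && hi <= r)
            return std::upper_bound(node[v].begin(), node[v].end(), x) -
                   node[v].begin();
        int mid = (lo + hi) / 2;
        return count_leq(2 * v, lo, mid, l, r, x) +
               count_leq(2 * v + 1, mid + 1, hi, l, r, x);
    }

    // k-th smallest (1-indexed) in A[l..r], assuming 1 <= k <= r-l+1.
    int kth_smallest(int l, int r, int k) const {
        int lo = 0, hi = (int)sorted_values.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (count_leq(1, 0, n - 1, l, r, sorted_values[mid]) >= k) hi = mid;
            else lo = mid + 1;
        }
        return sorted_values[lo];
    }
};
```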
7. Range Frequency Query
Problem Definition and Properties:
● Statement: Given a static array A, answer many queries of the form (l, r,
T): count the number of occurrences of value T in the subarray A[l..r].
● Properties: The array is static. Queries can be online or offline.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (as a Merge Sort Tree).
● Sketch: Using the same Merge Sort Tree from the previous problem,
each node stores a sorted vector of elements. To count occurrences of
T in [l,r], we query the O(logN) nodes covering the range. In each
node's vector, we can count occurrences of T in O(logN) time using
binary search (upper_bound - lower_bound). The total query time is
O(log²N).
● Reasoning: A general but potentially slow solution if the number of
queries is very high. A simpler approach for static data is to
precompute for each value T a sorted list of indices where it appears. A
query then becomes a binary search on this list, taking O(logN) time.
● Fenwick Tree (BIT): Not Applicable (Directly).
● Reasoning: A BIT cannot isolate counts of an arbitrary value T without
significant pre-processing for each possible value, which is often
infeasible.
● Mo's Algorithm: Applicable.
● Applicability: The problem is static and queries can be processed
offline.
● Sketch: We maintain a frequency map or array for the elements in the
current range [l,r]. As the pointers move, we update the frequencies in
O(1). A query for value T is then a simple O(1) lookup in our
frequency map: freq[T].
● Reasoning: This is a very natural and efficient application of Mo's
algorithm, with a total complexity of O((N+Q)√N).
● Square Root Decomposition: Applicable.
● Sketch: Divide the array into blocks. For each block, precompute a
frequency map of the elements it contains. A query (l, r, T) is
answered by iterating naively through the partial blocks at the
boundaries and summing up the precomputed frequencies from the full
blocks in between. A query takes O(√N) time.
● Reasoning: A simple and effective solution, especially if the alphabet
of values is small.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● If queries can be processed offline, Mo's algorithm is an excellent choice.
● If the data is static and queries are online, the most efficient approach is to
precompute a sorted list of indices for each value. A query (l, r, T) is
then answered by binary searching on the list for T to find the number of
indices between l and r. This takes O(logN) per query after O(N)
preprocessing.
● The Merge Sort Tree provides a more general but slower online solution.
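The sorted-index-list approach recommended above can be made concrete with a short Python sketch (the helper names build_index_lists and range_frequency are ours, chosen for illustration):

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def build_index_lists(A):
    """For each distinct value, record the sorted list of positions where it occurs."""
    positions = defaultdict(list)
    for i, x in enumerate(A):
        positions[x].append(i)  # indices arrive in increasing order, so lists stay sorted
    return positions

def range_frequency(positions, l, r, T):
    """Count occurrences of T in A[l..r] (0-indexed, inclusive) via two binary searches."""
    idx = positions.get(T, [])
    return bisect_right(idx, r) - bisect_left(idx, l)

A = [3, 1, 3, 2, 3, 1]
pos = build_index_lists(A)
print(range_frequency(pos, 1, 4, 3))  # occurrences of 3 in [1, 3, 2, 3] -> 2
```

The preprocessing is a single O(N) pass, and each query is two binary searches on one value's index list, matching the O(logN) bound stated above.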
8. Distinct Count in Range
Problem Definition and Properties:
● Statement: Given a static array A, for each query (l, r), find the number of
distinct values in A[l..r].
● Properties: The array is static. Queries can be answered offline. The "distinct
count" function is not associative or easily decomposable.
Algorithmic Applicability Analysis:
● Segment Tree: Not Applicable (in its basic form).
● Reasoning: Merging distinct counts is not a simple operation. A node
cannot store a single value that represents the distinct count in a way
that can be combined with its sibling. It would require storing a
representation of the set of elements (e.g., a hash set or bitmask),
making the merge operation expensive.
● Fenwick Tree (BIT): Applicable (with an offline trick).
● Applicability: This is a classic offline sweep-line algorithm.
● Sketch: The queries are sorted by their right endpoint, r. We process
the array from left to right with an index i. We maintain a BIT and an
array last_pos[value] that stores the last seen index of each value.
When we are at index i processing element A[i]:
1. If A[i] has been seen before at last_pos[A[i]], we perform
bit.add(last_pos[A[i]], -1) to "deactivate" its previous
occurrence.
2. We perform bit.add(i, 1) to "activate" the current occurrence.
3. We update last_pos[A[i]] = i. After processing index i, we
answer all queries that have r = i. The answer for a query (l,
r) is bit.query(l, r). This works because for any range [l,
r], only the rightmost occurrence of each distinct value will have
a 1 contributing to the sum in the BIT.
● Reasoning: This is a highly efficient offline solution with a complexity
of O((N+Q)logN).
● Mo's Algorithm: Applicable (Canonical Application).
● Applicability: This is one of the most famous benchmark problems for
Mo's algorithm.
● Sketch: We maintain a frequency array freq for elements in the
current range and a variable distinct_count. When adding an
element x to the range, we increment freq[x]. If freq[x] was
previously 0 and is now 1, we increment distinct_count. When
removing an element x, we decrement freq[x]. If freq[x] becomes 0,
we decrement distinct_count. Each add/remove step is O(1).
● Reasoning: A very direct and intuitive application of Mo's algorithm,
with complexity O((N+Q)√N).
● Square Root Decomposition: Possible but complex.
● Reasoning: Requires heavy precomputation and is less common and
clean than the other two primary solutions.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● Mo's Algorithm is the most straightforward and commonly implemented
solution.
● The Offline BIT with sweep-line is asymptotically faster and is a crucial
technique to master for offline problems.
● This problem is an excellent case study for comparing the two major
paradigms of offline processing: query reordering (Mo's) versus event-based
sweep-line (BIT).
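The offline BIT sweep described in the sketch above can be written compactly in Python (class and function names are ours; queries are 0-indexed inclusive pairs):

```python
class BIT:
    """1-indexed Fenwick tree supporting point add and prefix sum."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)
    def add(self, i, d):
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def query(self, i):  # sum of positions 1..i
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def distinct_counts(A, queries):
    """Answer each (l, r) query offline: number of distinct values in A[l..r]."""
    bit = BIT(len(A))
    last_pos = {}
    by_r = sorted(range(len(queries)), key=lambda q: queries[q][1])
    ans = [0] * len(queries)
    qi = 0
    for i, x in enumerate(A):
        if x in last_pos:
            bit.add(last_pos[x] + 1, -1)  # deactivate the previous occurrence
        bit.add(i + 1, 1)                 # activate the current occurrence
        last_pos[x] = i
        while qi < len(by_r) and queries[by_r[qi]][1] == i:
            l, r = queries[by_r[qi]]
            ans[by_r[qi]] = bit.query(r + 1) - bit.query(l)
            qi += 1
    return ans

print(distinct_counts([1, 2, 1, 3, 2], [(0, 2), (1, 4), (0, 4)]))  # -> [2, 3, 3]
```

Because only the rightmost occurrence of each value seen so far carries a 1, a range sum over [l, r] counts every distinct value in that range exactly once.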
9. Path Sum on Tree (Static / Dynamic)
This problem is identical to Problem 17 and will be analyzed in detail in Part III.
10. Subtree Sum / Update (Tree)
This problem is identical to Problem 16 and will be analyzed in detail in Part III.
11. Distinct Count in Range
This problem is identical to Problem 8 and was analyzed above.
12. Mode in Range
Problem Definition and Properties:
● Statement: For each query (l, r), find the count of the most frequent
element (the mode) in A[l..r]. (If several elements tie for the highest
frequency, they share the same count, so the answer is well defined.)
● Properties: The array is static, and queries can be offline. The mode function
is a classic example of a non-decomposable query. The mode of [l,m] and
the mode of [m+1,r] give almost no information about the mode of [l,r].
Algorithmic Applicability Analysis:
● Segment Tree: Not Applicable.
● Reasoning: The non-decomposable nature of the mode makes it
impossible to use a simple merge operation. A node would need to
store the full frequency map of its range to be able to merge with its
sibling, which is prohibitively expensive in both time and memory.
● Fenwick Tree (BIT): Not Applicable.
● Reasoning: Lacks the ability to track frequencies and is based on
invertible operations, which mode is not.
● Mo's Algorithm: Applicable (and one of the only feasible approaches).
● Applicability: The problem is static and offline.
● Sketch: This problem showcases the power of Mo's algorithm for
complex, non-algebraic queries. We maintain a frequency array
freq[value] for the current range. To efficiently track the mode, we
can also maintain a count of frequencies, count_of_freq[f], which
stores how many numbers appear f times. When adding an element x:
1. count_of_freq[freq[x]]--
2. freq[x]++
3. count_of_freq[freq[x]]++
4. Update the current maximum frequency if freq[x] is greater
than the current max. Removing an element is the reverse
process. This allows the mode's frequency to be tracked with
O(1) updates.
● Reasoning: Mo's algorithm is perfectly suited for this, as the state (the
full frequency distribution) can be updated incrementally. The
complexity is O((N+Q)√N).
● Square Root Decomposition: Possible (with heuristics or for
approximations).
● Reasoning: An exact online solution with Sqrt Decomposition is
difficult. It's possible to design solutions that precompute modes for all
blocks and handle queries by combining this with naive iteration, but
merging the information is non-trivial. It is generally not a practical
approach for the exact mode problem.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● Mo's Algorithm is the standard and most practical solution for the static
range mode problem.
● This problem is a quintessential example of why Mo's algorithm is so valuable.
It can tackle queries that have no nice algebraic structure, which fail
completely with divide-and-conquer data structures like Segment Trees. The
ability to maintain the "full state" of the range (its frequency map) is the key.
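The count_of_freq bookkeeping described above can be sketched as a complete Mo's routine in Python (function names and query ordering details are ours; the routine returns only the mode's frequency, as the problem asks):

```python
from math import isqrt

def range_modes(A, queries):
    """For each (l, r) (0-indexed, inclusive), the frequency of the mode of A[l..r]."""
    n = len(A)
    block = max(1, isqrt(n))
    order = sorted(range(len(queries)),
                   key=lambda q: (queries[q][0] // block, queries[q][1]))
    freq = {}                      # freq[x]: occurrences of x in the current window
    count_of_freq = [0] * (n + 2)  # count_of_freq[f]: how many values occur f times
    cur_max = 0

    def add(x):
        nonlocal cur_max
        f = freq.get(x, 0)
        if f:
            count_of_freq[f] -= 1
        freq[x] = f + 1
        count_of_freq[f + 1] += 1
        cur_max = max(cur_max, f + 1)

    def remove(x):
        nonlocal cur_max
        f = freq[x]
        count_of_freq[f] -= 1
        freq[x] = f - 1
        if f - 1:
            count_of_freq[f - 1] += 1
        if f == cur_max and count_of_freq[f] == 0:
            cur_max -= 1  # the max frequency drops by at most one per removal

    ans = [0] * len(queries)
    l, r = 0, -1  # current window is empty
    for q in order:
        ql, qr = queries[q]
        while r < qr: r += 1; add(A[r])
        while l > ql: l -= 1; add(A[l])
        while r > qr: remove(A[r]); r -= 1
        while l < ql: remove(A[l]); l += 1
        ans[q] = cur_max
    return ans

print(range_modes([1, 2, 2, 3, 2, 1, 1], [(0, 3), (2, 6), (0, 6)]))  # -> [2, 2, 3]
```

The key invariant is that a single removal can lower the maximum frequency by at most one, so cur_max can be maintained in O(1) without rescanning.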
13. Range Majority Element
Problem Definition and Properties:
● Statement: For each query (l, r), find an element that appears more than
(r−l+1)/2 times in A[l..r], if one exists.
● Properties: The array is static. A key property is that if an element x is
the majority of the union of two adjacent ranges, it must be the majority
of at least one of the sub-ranges: otherwise its total count could not
exceed half of the combined length. This "semi-decomposable" nature can
be exploited.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (with a clever trick).
● Applicability: A standard Segment Tree cannot solve this, but one
augmented with Boyer-Moore Voting logic can.
● Sketch: Each node in the Segment Tree stores a pair: (candidate,
count). The candidate is a potential majority element for that range,
and count is its "margin" of victory. When merging two child nodes
(cand1, count1) and (cand2, count2):
● If cand1 == cand2, the new node is (cand1, count1 +
count2).
● If count1 > count2, the new node is (cand1, count1 -
count2).
● If count2 > count1, the new node is (cand2, count2 -
count1).
● If count1 == count2, there is no clear candidate (can be
represented as (null, 0)). This mimics the pairwise
cancellation of the Boyer-Moore algorithm. A query for range
[l,r] combines O(logN) nodes to find a single final candidate.
This candidate is the only possible majority element. The final
step is to verify if this candidate is truly a majority by counting its
actual occurrences in [l,r], which can be done efficiently (e.g.,
using precomputed index lists or another data structure). The
query takes O(logN) or O(log²N) depending on the verification step.
● Reasoning: This is an elegant and efficient online solution that exploits
the specific properties of the majority query.
● Fenwick Tree (BIT): Not Applicable.
● Reasoning: The logic is non-linear and not invertible.
● Mo's Algorithm: Applicable (but overkill).
● Sketch: We can maintain frequencies as in the Range Mode problem.
After finding the mode, we check if its frequency exceeds the threshold.
● Reasoning: While it works, it is an offline solution and likely slower
than the specialized Segment Tree. The Segment Tree approach is
more insightful for this particular problem.
● Square Root Decomposition: Possible.
● Sketch: A query can find a candidate from each full block and the
partial blocks, then verify each of these few candidates.
● Reasoning: A valid O(√N) approach, but again, less efficient than the Segment Tree.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● The Segment Tree with Boyer-Moore Voting logic is the most elegant and
efficient online solution.
● This problem demonstrates that even some non-decomposable queries can
sometimes be solved with divide-and-conquer structures if a special property
(like the cancellation property of majority elements) can be exploited in the
merge step.
14. Range Median Query
Problem Definition and Properties:
● Statement: Given a static array A, for each query (l, r), find the median of
the elements in the sorted subarray A[l..r]. The median is the element at rank
⌊(length+1)/2⌋.
● Properties: This is equivalent to the Range K-th Smallest problem (Problem
6), where k=⌊(r−l+2)/2⌋. The array is static.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (as a Merge Sort Tree or Persistent Segment
Tree).
● Applicability: As with K-th Smallest, a specialized Segment Tree is
required.
● Sketch (Persistent Segment Tree): This is a standard and powerful
approach. We first coordinate compress the values in the array. Then,
we build a Persistent Segment Tree. We build N+1 versions of the tree.
The i-th version (or root) represents the frequency map of elements in
the prefix A[0..i−1]. The tree for prefix i is created from the tree for
prefix i-1 by updating the count for element A[i-1], which creates
only O(logV) new nodes. A query for the median in [l,r] is a k-th
smallest query on the range. This can be answered in O(logV) time by
using the roots for version r+1 and version l. We traverse down both
trees simultaneously; the difference in counts at each node tells us how
many elements in the range [l,r] fall into the left or right child's value
range, allowing us to find the k-th element.
● Reasoning: This is a highly efficient online solution with O(logV)
query time after O(NlogV) preprocessing and memory.
● Fenwick Tree (BIT): Applicable (with offline binary search).
● Applicability: This is a clever offline alternative.
● Sketch: Since all queries are known, we can binary search for the
answer (the median value) for all queries simultaneously (Parallel
Binary Search) or for each query individually. For a single query (l,
r), we binary search for the median value x. To check if x is a valid
guess, we need to count how many elements in [l,r] are less than or
equal to x. This subproblem ("count elements ≤x in a range") can be
solved efficiently offline with a sweep-line and a BIT. The total
complexity is roughly O(QlogVlogN).
● Reasoning: A good example of problem transformation, reducing an
order statistic query to multiple counting queries.
● Mo's Algorithm: Poor Fit.
● Reasoning: Maintaining the median of a dynamic set under O(1)
additions/removals is hard. It would require a data structure like two
balanced BSTs or two priority queues, making each step O(logN) and
the overall algorithm slow.
● Square Root Decomposition: Possible (but complex).
● Sketch: Each block could maintain a frequency array or sorted list. A
query would involve complex merging logic.
● Reasoning: Not a practical or efficient approach.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● The Persistent Segment Tree is the standard online solution for range k-th
element queries, and thus for range median.
● The offline binary search with a BIT is an important alternative technique to
have in one's toolkit.
● This problem highlights the power of persistence in data structures, allowing
range queries to be transformed into differences of prefix structures.
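To make the version-difference idea concrete, here is a compact Python sketch of the Persistent Segment Tree solution (class and helper names are ours; values are coordinate compressed first):

```python
from bisect import bisect_left

class PersistentSegTree:
    """Persistent frequency tree over compressed values [0, m); one version per prefix."""
    def __init__(self, m):
        self.m = m
        self.left, self.right, self.cnt = [0], [0], [0]  # node 0 is the shared empty tree
        self.roots = [0]                                 # roots[i] = tree for prefix A[0..i-1]

    def _new(self, l, r, c):
        self.left.append(l); self.right.append(r); self.cnt.append(c)
        return len(self.cnt) - 1

    def append(self, value):
        """New version = previous version plus one occurrence of `value`.
        Only O(log m) fresh nodes are allocated; all others are shared."""
        def insert(node, lo, hi):
            if lo == hi:
                return self._new(0, 0, self.cnt[node] + 1)
            mid = (lo + hi) // 2
            if value <= mid:
                return self._new(insert(self.left[node], lo, mid),
                                 self.right[node], self.cnt[node] + 1)
            return self._new(self.left[node],
                             insert(self.right[node], mid + 1, hi),
                             self.cnt[node] + 1)
        self.roots.append(insert(self.roots[-1], 0, self.m - 1))

    def kth(self, l, r, k):
        """k-th smallest (1-based) compressed value in A[l..r], via version difference."""
        a, b = self.roots[l], self.roots[r + 1]
        lo, hi = 0, self.m - 1
        while lo < hi:
            mid = (lo + hi) // 2
            in_left = self.cnt[self.left[b]] - self.cnt[self.left[a]]
            if k <= in_left:
                a, b, hi = self.left[a], self.left[b], mid
            else:
                k -= in_left
                a, b, lo = self.right[a], self.right[b], mid + 1
        return lo

def range_median(tree, vals, l, r):
    """Median (rank floor((length+1)/2)) of A[l..r]."""
    return vals[tree.kth(l, r, (r - l + 2) // 2)]

A = [5, 1, 4, 2, 3]
vals = sorted(set(A))
tree = PersistentSegTree(len(vals))
for x in A:
    tree.append(bisect_left(vals, x))
print(range_median(tree, vals, 0, 4))  # median of [5, 1, 4, 2, 3] -> 3
```

The descent compares the left-subtree counts of the two versions; their difference is exactly the number of range elements falling in the left half of the value space.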
15. Range Inversion Count Query
Problem Definition and Properties:
● Statement: For each query (l, r), compute the number of pairs of indices
(i,j) such that l≤i<j≤r and A[i]>A[j].
● Properties: The array is static, and queries can be offline. The inversion
count is not decomposable.
Algorithmic Applicability Analysis:
● Segment Tree: Not Applicable (Directly).
● Reasoning: Merging inversion counts from two adjacent subarrays is
not enough. We also need to count inversions where one element is in
the left subarray and the other is in the right. This requires more
information than just the inversion counts and is similar to the merge
step of Merge Sort, making a simple Segment Tree merge inefficient.
● Fenwick Tree (BIT): Applicable (as part of Mo's or an offline sweep).
● Reasoning: A BIT by itself cannot solve this, but it is a crucial
component of the efficient solutions.
● Mo's Algorithm: Applicable (with a BIT).
● Applicability: This is a classic application combining Mo's algorithm
with another data structure. The problem is static and offline.
● Sketch: We use Mo's algorithm to manage the active range [l,r]. The
key is to efficiently calculate the change in the number of inversions
when adding or removing an element. We maintain a Fenwick Tree on
the (coordinate compressed) values of the elements currently in the
range.
● Adding element A[x] to the right: The number of new
inversions created is the number of elements currently in the
range that are greater than A[x]. This can be calculated with the
BIT as current_range_size - bit.query(A[x]).
● Removing element A[x] from the left: The number of
inversions lost is the number of elements currently in the range
(to its right) that are smaller than A[x]. This is bit.query(A[x]
- 1). Similar logic applies to adding to the left and removing
from the right. Each add/remove step takes O(logV) time due to
the BIT update.
● Reasoning: The total complexity is O((N+Q)√N logV). This is a standard
and powerful combination of techniques.
● Square Root Decomposition: Not Applicable (in a clean way).
● Reasoning: The logic for combining block results and partial block
results for inversion counts is extremely complex.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● Mo's Algorithm with a Fenwick Tree is the canonical offline solution for this
problem.
● This problem is an excellent example of how Mo's algorithm can be used as a
framework, with a secondary data structure (like a BIT) used to accelerate the
add/remove operations from O(N) to something more manageable like
O(logN).
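The add/remove delta computations described in the sketch can be assembled into a working Python routine (a minimal sketch; all names are ours, and values are coordinate compressed before entering the BIT):

```python
from math import isqrt

class BIT:
    """1-indexed Fenwick tree: point add, prefix sum."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)
    def add(self, i, d):
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def query(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def range_inversions(A, queries):
    """Mo's algorithm + BIT: inversion count of A[l..r] for each (l, r), offline."""
    n = len(A)
    rank = {v: i + 1 for i, v in enumerate(sorted(set(A)))}  # coordinate compression
    B = [rank[x] for x in A]
    bit = BIT(len(rank))
    size = 0   # elements currently in the window
    inv = 0    # inversions in the current window

    def add_right(x):
        nonlocal inv, size
        inv += size - bit.query(x)   # existing elements strictly greater than x
        bit.add(x, 1); size += 1

    def add_left(x):
        nonlocal inv, size
        inv += bit.query(x - 1)      # existing elements strictly smaller than x
        bit.add(x, 1); size += 1

    def remove_right(x):
        nonlocal inv, size
        bit.add(x, -1); size -= 1
        inv -= size - bit.query(x)   # pairs lost with the remaining greater elements

    def remove_left(x):
        nonlocal inv, size
        bit.add(x, -1); size -= 1
        inv -= bit.query(x - 1)      # pairs lost with the remaining smaller elements

    block = max(1, isqrt(n))
    order = sorted(range(len(queries)),
                   key=lambda q: (queries[q][0] // block, queries[q][1]))
    ans = [0] * len(queries)
    l, r = 0, -1
    for q in order:
        ql, qr = queries[q]
        while r < qr: r += 1; add_right(B[r])
        while l > ql: l -= 1; add_left(B[l])
        while r > qr: remove_right(B[r]); r -= 1
        while l < ql: remove_left(B[l]); l += 1
        ans[q] = inv
    return ans

print(range_inversions([3, 1, 2], [(0, 2), (0, 1), (1, 2)]))  # -> [2, 1, 0]
```

Note that the four pointer moves need four distinct deltas: which side the element enters or leaves determines whether we count greater or smaller elements among those that remain.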
Part III: Comprehensive Analysis of Tree-Based Query
Problems
This section transitions from linear arrays to tree structures. Here, the challenge lies
in handling queries on non-linear arrangements of nodes, such as paths and
subtrees. Techniques like Heavy-Light Decomposition and Euler Tour become
essential for mapping tree structures back to linear forms that can be managed by
array-based data structures.
16. Subtree Sum Query with Updates
Problem Definition and Properties:
● Statement: Given a rooted tree with values on its nodes, support two
operations: update(u, val) to change the value at node u, and
query_subtree(u) to find the sum of values for all nodes v in the subtree of u.
● Properties: This is a dynamic, online problem on a tree. The core challenge
is to represent subtrees in a way that allows for efficient range-like operations.
Algorithmic Applicability Analysis:
● Segment Tree / Fenwick Tree: Applicable (with Euler Tour).
● Applicability: These data structures are the engines for the solution,
but they require the tree to be "flattened" first.
● Sketch: We perform a Depth First Search (DFS) from the root to
compute an Euler Tour of the tree. During the DFS, we maintain a
global timer. When we first visit a node u, we record its entry time
tin[u]. After visiting all of u's children and returning, we record its exit
time tout[u]. This process has a crucial property: all nodes in the
subtree of u are visited between time tin[u] and tout[u]. Therefore,
the subtree of u corresponds to a contiguous range [tin[u],
tout[u]] in the Euler tour array. We create a new linear array where
the value of node u is placed at index tin[u].
● A query_subtree(u) becomes a range sum query on the range
[tin[u], tout[u]] in our new array.
● An update(u, val) becomes a point update at index tin[u].
This transformed problem is exactly Problem 1 (Range Sum with
Point Updates). We can use either a Segment Tree or a Fenwick
Tree to solve it in O(logN) time per operation after the initial
O(N) DFS traversal.
● Reasoning: This is the canonical and most efficient solution. It
beautifully reduces a tree problem to a solved array problem.
● Mo's Algorithm: Not Applicable.
● Reasoning: The problem is dynamic and online. Mo's is for static,
offline problems.
● Square Root Decomposition: Possible (on the flattened array).
● Reasoning: One could apply Sqrt Decomposition to the
Euler-tour-flattened array, resulting in O(√N) queries. This is valid but
suboptimal compared to the logarithmic solutions.
● Heavy-Light Decomposition (HLD): Applicable (but overkill).
● Applicability: HLD can solve this, as a subtree query is a special case
of a path query (paths from all subtree nodes to the subtree root).
● Reasoning: In HLD, the subtree of a node u also forms a contiguous
range in the linearized array. Therefore, the same logic applies.
However, the implementation of HLD is significantly more complex than
a simple Euler tour, making it an unnecessarily heavy tool for a
problem that only involves subtrees.
Recommended Approach and Strategic Insights:
● The Euler Tour + Fenwick Tree combination is the ideal solution due to its
efficiency (O(logN)) and relative implementation simplicity. A Segment Tree
is also perfectly acceptable.
● This problem is the primary illustration of the power of the Euler Tour
technique for converting subtree problems into range problems. This
transformation is a fundamental pattern in tree algorithms.
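The Euler Tour + Fenwick Tree reduction can be sketched end to end in Python (class and function names are ours; the tree is given as an undirected adjacency list rooted at node 0):

```python
class BIT:
    """1-indexed Fenwick tree: point add, prefix sum."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)
    def add(self, i, d):
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def query(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def euler_tour(adj, root=0):
    """Iterative DFS; the subtree of u occupies indices tin[u]..tout[u]."""
    n = len(adj)
    tin, tout = [0] * n, [0] * n
    timer = 0
    stack = [(root, -1, False)]
    while stack:
        u, parent, done = stack.pop()
        if done:
            tout[u] = timer - 1
            continue
        tin[u] = timer
        timer += 1
        stack.append((u, parent, True))
        for v in adj[u]:
            if v != parent:
                stack.append((v, u, False))
    return tin, tout

class SubtreeSum:
    def __init__(self, adj, values, root=0):
        self.tin, self.tout = euler_tour(adj, root)
        self.vals = list(values)
        self.bit = BIT(len(adj))
        for u, val in enumerate(values):
            self.bit.add(self.tin[u] + 1, val)
    def update(self, u, val):
        """Point update at position tin[u]: set node u's value to val."""
        self.bit.add(self.tin[u] + 1, val - self.vals[u])
        self.vals[u] = val
    def query_subtree(self, u):
        """Range sum over [tin[u], tout[u]] in the flattened array."""
        return self.bit.query(self.tout[u] + 1) - self.bit.query(self.tin[u])

adj = [[1, 2], [0, 3], [0], [1]]  # edges: 0-1, 0-2, 1-3
t = SubtreeSum(adj, [1, 2, 3, 4])
print(t.query_subtree(1))  # subtree of 1 = {1, 3} -> 2 + 4 = 6
```

After the one-time O(N) DFS, both operations are single O(logN) calls on the underlying BIT, exactly as in Problem 1.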
17. Path Sum Query with Updates
Problem Definition and Properties:
● Statement: Given a tree with values on its nodes, support update(u, val) (a
point update on a node) and query(u, v) (sum of values on the unique
simple path between nodes u and v).
● Properties: A dynamic, online problem on a tree. The sum function is
associative and invertible. The main difficulty is that a path between two
arbitrary nodes is not a contiguous segment in a simple traversal like a basic
Euler tour.
Algorithmic Applicability Analysis:
● Heavy-Light Decomposition (HLD): Applicable (Canonical Application).
● Applicability: This is the quintessential problem that HLD is designed
to solve.
● Sketch: HLD partitions the tree's edges into "heavy" and "light" edges.
A heavy edge connects a node to the child with the largest subtree
size. This process decomposes the entire tree into a set of disjoint
"heavy paths" or chains. A crucial property is that any path from the
root to a node crosses at most O(logN) light edges, and therefore
traverses at most O(logN) distinct heavy paths. The nodes are then
re-indexed in a new linear array such that each heavy path occupies a
contiguous block of indices. A Segment Tree or Fenwick Tree is built
on this array. To query the path sum between u and v:
● Find the Lowest Common Ancestor (LCA) of u and v.
● The path is broken into two parts: u to LCA and v to LCA.
● For each part (e.g., from u up to LCA), we ascend the tree.
While u is not in the same heavy path as the LCA, we query the
range from u to the head of its current chain, add it to the total,
and then jump u to the parent of the chain's head (crossing a
light edge).
● Once u and the LCA are on the same chain, we query the final
segment on that chain. This process decomposes the path u-v
into O(logN) separate range queries on the underlying data
structure.
● Reasoning: With a Segment Tree, each range query is O(logN), so
the total path query time is O(log²N). A point update is a single update
in the underlying array, taking O(logN). This is the standard online
solution.
● Segment Tree / Fenwick Tree: Applicable (as the underlying engine for
HLD).
● Reasoning: These structures are essential components of the HLD
solution. They operate on the linearized representation of the tree but
cannot solve the path problem on their own.
● Mo's Algorithm: Applicable (for static version only).
● Applicability: "Mo's on a tree" can solve the static version of this
problem offline.
● Sketch: The tree is flattened using an Euler tour that records both
entry (tin) and exit (tout) times for each node in an array. A path
query (u, v) (assuming tin[u] < tin[v]) is transformed into a range
query on the flattened array. Let lca be the LCA of u and v.
● If lca == u, the path is simply all nodes in the range [tin[u],
tin[v]] of the flattened tour array.
● If lca != u, the path consists of nodes in the range [tout[u],
tin[v]], plus the node lca. The nodes that appear twice in this
range are not on the simple path and must be handled. A
common technique is to maintain a frequency/active status for
each node; when moving Mo's pointers, a node is toggled. The
query result is the sum of values of nodes that are currently
active.
● Reasoning: This reduces the tree problem to a standard (though
slightly more complex) Mo's application on an array. It cannot handle
the dynamic updates required by the full problem.
● Square Root Decomposition: Not Applicable.
● Reasoning: Not well-suited for decomposing arbitrary tree paths.
Recommended Approach and Strategic Insights:
● For the dynamic online problem, HLD combined with a Segment Tree or
Fenwick Tree is the standard and most efficient solution.
● This problem highlights the power of HLD as a structural decomposition
technique. It transforms a complex path query into a small number of simple
range queries, bridging the gap between tree-based problems and
array-based data structures.
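A compact Python sketch of HLD with a Fenwick Tree as the underlying engine (sum is invertible, so a BIT suffices; all class and variable names are ours). The query walk mirrors the chain-hopping described above:

```python
class BIT:
    """1-indexed Fenwick tree: point add, prefix sum."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)
    def add(self, i, d):
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def query(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

class HLDPathSum:
    def __init__(self, adj, values, root=0):
        n = len(adj)
        self.parent = [-1] * n
        self.depth = [0] * n
        self.heavy = [-1] * n
        size = [1] * n
        order, stack, visited = [], [root], [False] * n
        visited[root] = True
        while stack:                      # DFS order: parents before children
            u = stack.pop()
            order.append(u)
            for v in adj[u]:
                if not visited[v]:
                    visited[v] = True
                    self.parent[v] = u
                    self.depth[v] = self.depth[u] + 1
                    stack.append(v)
        for u in reversed(order):         # subtree sizes, children first
            if self.parent[u] != -1:
                size[self.parent[u]] += size[u]
        for u in order:                   # heavy child = child with largest subtree
            best = -1
            for v in adj[u]:
                if v != self.parent[u] and (best == -1 or size[v] > size[best]):
                    best = v
            self.heavy[u] = best
        self.head, self.pos = [0] * n, [0] * n
        cur = 0
        for u in order:                   # each heavy chain gets contiguous positions
            if self.parent[u] == -1 or self.heavy[self.parent[u]] != u:
                v = u
                while v != -1:
                    self.head[v], self.pos[v] = u, cur
                    cur += 1
                    v = self.heavy[v]
        self.vals = list(values)
        self.bit = BIT(n)
        for u in range(n):
            self.bit.add(self.pos[u] + 1, values[u])

    def update(self, u, val):
        self.bit.add(self.pos[u] + 1, val - self.vals[u])
        self.vals[u] = val

    def query_path(self, u, v):
        """Sum of node values on the path from u to v."""
        total = 0
        while self.head[u] != self.head[v]:
            if self.depth[self.head[u]] < self.depth[self.head[v]]:
                u, v = v, u
            h = self.head[u]              # consume u's whole chain segment
            total += self.bit.query(self.pos[u] + 1) - self.bit.query(self.pos[h])
            u = self.parent[h]            # cross a light edge
        if self.depth[u] > self.depth[v]:
            u, v = v, u
        total += self.bit.query(self.pos[v] + 1) - self.bit.query(self.pos[u])
        return total
```

Each chain hop contributes one O(logN) range query, and there are O(logN) hops, giving the O(log²N) path query stated above.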
18. Subtree Minimum / Maximum Query
Problem Definition and Properties:
● Statement: Given a tree with values on its nodes, support update(u, val)
and query_subtree(u) to find the minimum or maximum value in the subtree
of u.
● Properties: This is a dynamic, online problem. The min and max functions are
associative and idempotent, but not invertible.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (with Euler Tour).
● Applicability: This is the canonical solution.
● Sketch: We use the same Euler Tour (tree flattening) technique as in
Problem 16. The subtree of u maps to a contiguous range [tin[u],
tout[u]]. The problem is transformed into Dynamic RMQ (Problem 2)
on the flattened array. A Segment Tree is built on this array to handle
point updates and range min/max queries in O(logN) time.
● Reasoning: A direct and efficient reduction of a tree problem to a
standard array problem.
● Fenwick Tree (BIT): Not Applicable.
● Reasoning: Even after flattening the tree, the underlying problem
becomes Range Minimum/Maximum Query. As established, a standard
Fenwick Tree cannot solve this because min and max are not invertible.
● Mo's Algorithm: Not Applicable.
● Reasoning: The problem is dynamic and online.
● Square Root Decomposition: Possible (on the flattened array).
● Reasoning: Sqrt Decomposition could be applied to the flattened array
for an O(√N) solution, but this is less efficient than the Segment Tree
approach.
● Heavy-Light Decomposition (HLD): Applicable (but overkill).
● Reasoning: As with subtree sum, HLD can solve this because a
node's subtree forms a contiguous block in its linearized array.
However, a simple Euler Tour is sufficient and much easier to
implement.
Recommended Approach and Strategic Insights:
● The Euler Tour + Segment Tree combination is the standard and most
efficient solution.
● This problem reinforces the utility of the Euler Tour for any associative query
function on subtrees. The only thing that changes from Problem 16 is the
operation stored in the Segment Tree nodes (from sum to min or max).
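As noted, only the stored operation changes relative to Problem 16. A minimal Python sketch (names are ours) swaps the BIT for a min Segment Tree over the same Euler-tour order:

```python
def euler_tour(adj, root=0):
    """Iterative DFS; the subtree of u occupies indices tin[u]..tout[u]."""
    n = len(adj)
    tin, tout = [0] * n, [0] * n
    timer = 0
    stack = [(root, -1, False)]
    while stack:
        u, parent, done = stack.pop()
        if done:
            tout[u] = timer - 1
            continue
        tin[u] = timer
        timer += 1
        stack.append((u, parent, True))
        for v in adj[u]:
            if v != parent:
                stack.append((v, u, False))
    return tin, tout

class MinSegTree:
    """Iterative segment tree: point update, range min over [l, r] inclusive."""
    def __init__(self, data):
        self.n = len(data)
        self.t = [float('inf')] * (2 * self.n)
        for i, x in enumerate(data):
            self.t[self.n + i] = x
        for i in range(self.n - 1, 0, -1):
            self.t[i] = min(self.t[2 * i], self.t[2 * i + 1])
    def update(self, i, val):
        i += self.n
        self.t[i] = val
        while i > 1:
            i //= 2
            self.t[i] = min(self.t[2 * i], self.t[2 * i + 1])
    def query(self, l, r):
        res = float('inf')
        l += self.n; r += self.n + 1
        while l < r:
            if l & 1: res = min(res, self.t[l]); l += 1
            if r & 1: r -= 1; res = min(res, self.t[r])
            l //= 2; r //= 2
        return res

class SubtreeMin:
    def __init__(self, adj, values, root=0):
        self.tin, self.tout = euler_tour(adj, root)
        flat = [0] * len(adj)
        for u, val in enumerate(values):
            flat[self.tin[u]] = val      # place node u at index tin[u]
        self.st = MinSegTree(flat)
    def update(self, u, val):
        self.st.update(self.tin[u], val)
    def query_subtree(self, u):
        return self.st.query(self.tin[u], self.tout[u])
```

Replacing min with max throughout yields the subtree maximum variant; no other change is needed.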
19. Path Maximum Query with Range Updates
Problem Definition and Properties:
● Statement: Given a tree, support add(u, v, delta) (add delta to all nodes
on the path from u to v) and query_max(u, v) (find the maximum value on
the path from u to v).
● Properties: This is the most complex problem type so far, combining a tree
structure, path queries, and range updates. It is dynamic and online.
Algorithmic Applicability Analysis:
● Heavy-Light Decomposition (HLD): Applicable (with a Lazy Segment
Tree).
● Applicability: This is the canonical and essentially only feasible
solution.
● Sketch: This problem requires combining two powerful techniques:
HLD to handle the path structure and a Lazy Segment Tree to handle
the range updates.
1. The tree is decomposed using HLD, and a Segment Tree is built
on the linearized array. This Segment Tree must support lazy
propagation.
2. A query_max(u, v) operation works just like in Problem 17: the
path is decomposed into O(logN) segments on heavy chains,
and we perform O(logN) range max queries on the Lazy
Segment Tree. Each of these takes O(logN), for a total of
O(log²N).
3. An add(u, v, delta) operation is similar: the path is
decomposed into O(logN) segments, and we perform a
range_add update on the Lazy Segment Tree for each segment.
This also takes O(log²N).
● Reasoning: This solution is a direct composition of the solutions for
path queries (HLD) and range updates/range queries (Lazy Segment
Tree).
● Segment Tree / Fenwick Tree: Not Applicable (alone).
● Reasoning: They are necessary components of the HLD solution but
cannot handle the path structure on their own. An Euler tour does not
map arbitrary paths to contiguous segments, so it cannot be used here.
● Mo's Algorithm / Sqrt Decomposition: Not Applicable.
● Reasoning: These methods are not designed for dynamic range
updates on a tree structure.
Recommended Approach and Strategic Insights:
● The only standard, efficient solution is Heavy-Light Decomposition
combined with a Lazy Segment Tree.
● This problem represents a capstone of sorts for the techniques discussed. It
requires understanding tree decomposition (HLD), array-based range queries
(Segment Tree), and efficient dynamic updates (Lazy Propagation) and
composing them into a single, cohesive solution. Successfully solving this
problem indicates a mastery of all three concepts.
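The Lazy Segment Tree engine (range add, range max) that HLD drives can be sketched on its own in Python; the HLD layer above it is exactly the one from Problem 17, so only the array-level component is shown here (a minimal sketch, names ours):

```python
class LazyMaxSegTree:
    """Range add and range max over [l, r] (0-indexed, inclusive), with lazy propagation."""
    def __init__(self, data):
        self.n = len(data)
        self.mx = [0] * (4 * self.n)
        self.lazy = [0] * (4 * self.n)
        self._build(1, 0, self.n - 1, data)

    def _build(self, node, lo, hi, data):
        if lo == hi:
            self.mx[node] = data[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, data)
        self._build(2 * node + 1, mid + 1, hi, data)
        self.mx[node] = max(self.mx[2 * node], self.mx[2 * node + 1])

    def _push(self, node):
        """Move a pending add down to both children."""
        for child in (2 * node, 2 * node + 1):
            self.mx[child] += self.lazy[node]
            self.lazy[child] += self.lazy[node]
        self.lazy[node] = 0

    def add(self, l, r, delta, node=1, lo=0, hi=None):
        if hi is None: hi = self.n - 1
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:          # fully covered: defer to lazy tag
            self.mx[node] += delta
            self.lazy[node] += delta
            return
        self._push(node)
        mid = (lo + hi) // 2
        self.add(l, r, delta, 2 * node, lo, mid)
        self.add(l, r, delta, 2 * node + 1, mid + 1, hi)
        self.mx[node] = max(self.mx[2 * node], self.mx[2 * node + 1])

    def query(self, l, r, node=1, lo=0, hi=None):
        if hi is None: hi = self.n - 1
        if r < lo or hi < l:
            return float('-inf')
        if l <= lo and hi <= r:
            return self.mx[node]
        self._push(node)
        mid = (lo + hi) // 2
        return max(self.query(l, r, 2 * node, lo, mid),
                   self.query(l, r, 2 * node + 1, mid + 1, hi))

st = LazyMaxSegTree([1, 3, 2, 5, 4])
st.add(0, 2, 10)          # values are now [11, 13, 12, 5, 4]
print(st.query(1, 3))     # -> 13
```

In the full solution, HLD translates each add(u, v, delta) and query_max(u, v) into O(logN) calls to add and query on this structure.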
20. Path XOR Query
Problem Definition and Properties:
● Statement: Given a tree with values on its nodes, answer queries for the
XOR sum of values on the path between u and v. The tree is often static in
this variant.
● Properties: The XOR function is associative and its own inverse.
Algorithmic Applicability Analysis:
● Heavy-Light Decomposition (HLD): Applicable.
● Sketch: This works exactly like Path Sum Query (Problem 17). We use
HLD to decompose paths into O(logN) segments and use a Segment
Tree (or Fenwick Tree, since XOR is invertible) on the linearized array
to find the XOR sum of each segment. The total query time is O(log²N).
● Reasoning: A general and powerful method that also works for the
dynamic version with point updates.
● Segment Tree / Fenwick Tree: Applicable (as part of HLD or with prefix
XOR).
● Reasoning: These are used as the underlying data structure for HLD.
An alternative exists for the static case.
● Mo's Algorithm: Applicable (for static version).
● Sketch: "Mo's on a tree" can be used just as in Problem 17. The
add/remove operation for XOR is O(1), so the total complexity is
O((N+Q)√N).
● Reasoning: A viable offline solution if updates are not required.
● Square Root Decomposition: Not Applicable.
Alternative Approach for Static Case: Prefix XOR with LCA
● Applicability: For a static tree, there is a very simple and efficient O(logN)
online solution.
● Sketch:
1. Precompute the depth of each node and the LCA structure (e.g., using
binary lifting).
2. During the same DFS, compute prefix_xor[u], which is the XOR sum
of values on the path from the root to u.
3. The XOR sum of the path between u and v is given by: $$
\text{path\_xor}(u, v) = \text{prefix\_xor}[u] \oplus \text{prefix\_xor}[v]
\oplus \text{value}[\text{lca}(u, v)] $$ This works because
prefix_xor[u] ^ prefix_xor[v] computes the XOR sum of the symmetric
difference of the two root-to-node paths: every node on the common path
from the root down to lca(u, v), including the LCA itself, appears in
both prefixes and cancels out, while every other node on the path from u
to v appears exactly once. The result is therefore the XOR of the path
from u to v with the LCA excluded, so we XOR in value[lca(u, v)] to
include it.
● Reasoning: This is extremely efficient for the static case, requiring only LCA
lookup (O(logN)) per query.
Recommended Approach and Strategic Insights:
● For the static case, the Prefix XOR with LCA method is by far the best,
offering O(logN) queries after O(NlogN) preprocessing.
● For the dynamic case (with point updates), HLD + Fenwick Tree is the
standard solution, offering O(log²N) queries and O(logN) updates.
● This problem illustrates how a specific, invertible property (like XOR) can
sometimes unlock a much simpler ad-hoc solution (the prefix method) that
outperforms general-purpose machinery like HLD, but only in the static case.
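The static prefix-XOR method can be sketched as follows. This is a minimal illustration, assuming 0-indexed nodes, an iterative DFS, and a naive O(depth) LCA climb for brevity; in practice a binary-lifting LCA replaces the climb to reach O(logN) per query. The function names are illustrative:

```python
from collections import defaultdict

def preprocess(n, edges, value, root=0):
    """Compute parent, depth, and the inclusive root-to-node XOR for each node."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent, depth, px = [-1] * n, [0] * n, [0] * n
    px[root] = value[root]
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w], depth[w] = u, depth[u] + 1
                px[w] = px[u] ^ value[w]      # extend the prefix XOR downward
                stack.append(w)
    return parent, depth, px

def lca(u, v, parent, depth):
    """Naive LCA: lift the deeper node, then climb both together."""
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u

def path_xor(u, v, parent, depth, px, value):
    a = lca(u, v, parent, depth)
    # Common root-to-lca contributions cancel; XOR the LCA's value back in once.
    return px[u] ^ px[v] ^ value[a]
```

On a path such as 2–1–0–3, the shared root prefix cancels and only the four path values survive the XOR.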
21. Count Nodes > T on Path
Problem Definition and Properties:
● Statement: Given a tree with node values, answer queries (u, v, T): find
the number of nodes x on the path between u and v such that value[x] > T.
● Properties: This is a threshold query on a tree path. It can be static or
dynamic. For the static version, queries can be processed offline.
Algorithmic Applicability Analysis:
● Heavy-Light Decomposition (HLD): Applicable (with modifications).
● Online Dynamic Sketch: For the online version with updates, this is
very hard. A simple Segment Tree is not enough, because T varies per
query. Each node of the Segment Tree on the HLD array would need to
contain a data structure that can answer "count elements > T" quickly
(like a balanced BST or Fenwick tree). This leads to a complex solution
with O(log²N⋅logV) or higher complexity.
● Offline Static Sketch: An offline approach is much more practical. We
can use a sweep-line algorithm.
1. Collect all distinct node values and all query thresholds T.
2. Create "events": one for each node (value, node_id) and one for
each query (threshold, u, v, query_id).
3. Sort all events in descending order of value/threshold.
4. Process the events. Maintain a Fenwick Tree (or Segment Tree)
on the HLD-linearized node positions, initialized to all zeros.
5. When an event is a node (val, id), "activate" it by setting its
position in the BIT to 1.
6. When an event is a query (T, u, v, id), all nodes with values
greater than T have already been activated. The answer is
simply a path sum query on the BIT using HLD. This path sum
will count the number of activated nodes on the path.
● Reasoning: The offline sweep-line approach transforms the problem
into a series of path sum queries, which we know how to solve. The
total complexity is dominated by sorting and the queries, leading to
O((N+Q)log(N+Q) + Qlog²N).
● Mo's Algorithm: Applicable (for static version).
● Sketch: We can use Mo's on a tree. For each query (u, v, T), we
would use the standard tree flattening, with add/remove logic that
updates a frequency structure over values. Summing the frequencies of
all values greater than T for each query is too slow, since T varies
per query, and keeping a Fenwick tree over values adds a log factor to
every pointer move. One could instead combine Mo's with the offline
sweep idea, sorting queries by T and activating nodes as the threshold
decreases, but this adds considerable complexity. The simpler HLD
sweep is preferable.
● Reasoning: While possible, it's less natural than the HLD sweep-line
method.
Recommended Approach and Strategic Insights:
● For the static offline version, the HLD + BIT sweep-line approach is
standard and efficient.
● This problem is a great example of the power of offline processing and the
sweep-line paradigm, especially when combined with a structural
decomposition like HLD. It shows how to handle an extra query parameter (T)
by converting it into the sweep dimension.
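To isolate the sweep-line idea from the HLD machinery, the sketch below applies steps 1–6 to the array analogue (count elements > T in a subarray); substituting an HLD path-sum for the BIT range sum yields the tree version. The BIT class and event encoding are illustrative choices:

```python
class BIT:
    def __init__(self, n):
        self.n, self.t = n, [0] * (n + 1)
    def add(self, i, d):                   # 1-based point update
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def prefix(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def count_greater_offline(a, queries):
    """queries: (l, r, T) with 0-based inclusive bounds; returns answers in order."""
    n = len(a)
    events = [(val, 0, i) for i, val in enumerate(a)]                 # activations
    events += [(T, 1, qi) for qi, (l, r, T) in enumerate(queries)]    # queries
    # Sweep in descending value order; at ties, answer queries before
    # activating, so elements equal to T are excluded (strict '>').
    events.sort(key=lambda e: (-e[0], -e[1]))
    bit = BIT(n)
    ans = [0] * len(queries)
    for val, kind, idx in events:
        if kind == 0:
            bit.add(idx + 1, 1)            # activate position idx
        else:
            l, r, _ = queries[idx]
            ans[idx] = bit.prefix(r + 1) - bit.prefix(l)
    return ans
```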
22. K-th Ancestor Queries
Problem Definition and Properties:
● Statement: Given a rooted tree and queries (u, k), find the k-th ancestor of
node u. The tree is static.
● Properties: This is a fundamental tree navigation problem, not an aggregate
query.
Algorithmic Applicability Analysis:
● Segment Tree / Fenwick Tree / Mo's / Sqrt Decomposition: Not
Applicable.
● Reasoning: These are data structures for range aggregation, not for
navigating tree ancestry.
● Heavy-Light Decomposition (HLD): Possible (but not ideal).
● Sketch: One can answer k-th ancestor queries by repeatedly jumping
to the top of the current heavy path and then to its parent, keeping
track of the distance traveled. This is more complex than the standard
method.
● Reasoning: HLD is designed for path aggregate queries, not simple
ancestor jumps. It's overkill and less efficient.
Canonical Approach: Binary Lifting
● Applicability: This is the standard and most efficient technique.
● Sketch:
1. Preprocessing: Perform a DFS from the root. For each node u,
precompute up[u][j], the 2^j-th ancestor of u. This can be done with
dynamic programming: up[u][0] is the parent of u, and up[u][j] =
up[up[u][j-1]][j-1]. This preprocessing takes O(NlogN) time and space.
2. Querying: To find the k-th ancestor of u, we express k in its binary
representation. For each bit j that is set in k, we jump u up by 2^j
steps: u = up[u][j]. These jumps compose to a total of exactly k steps.
● Reasoning: Each query takes O(logN) time, which is highly efficient.
Recommended Approach and Strategic Insights:
● Binary Lifting is the definitive solution for k-th ancestor queries.
● This problem serves to highlight that not all tree queries are about
aggregation. Navigational queries often require specialized preprocessing
techniques like binary lifting, which focus on the pointer-like structure of the
tree itself.
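A minimal sketch of the table construction and bit-by-bit jump, assuming the parent array is already known; note the table is indexed up[j][u] here, transposed from the up[u][j] notation above:

```python
def build_lifting(parent):
    """parent[u] is the parent of u, with parent[root] == -1.
    Returns up, where up[j][u] is the 2^j-th ancestor of u (-1 if none)."""
    n = len(parent)
    LOG = max(1, n.bit_length())
    up = [list(parent)]                    # level 0: direct parents
    for j in range(1, LOG):
        prev = up[j - 1]
        up.append([-1 if prev[u] == -1 else prev[prev[u]] for u in range(n)])
    return up

def kth_ancestor(u, k, up):
    """Jump 2^j steps for every set bit j of k; returns -1 if it walks off the root."""
    j = 0
    while k > 0 and u != -1:
        if j >= len(up):
            return -1                      # k exceeds any possible depth
        if k & 1:
            u = up[j][u]
        k >>= 1
        j += 1
    return u
```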
23. K-th Node on Path
Problem Definition and Properties:
● Statement: For a query (u, v, k), find the k-th node on the simple path from
u to v (where the 1st node is u).
● Properties: A static tree navigation problem.
Algorithmic Applicability Analysis:
● Segment Tree / Fenwick Tree / Mo's / Sqrt Decomposition: Not
Applicable.
● Reasoning: These are not navigational tools.
● Heavy-Light Decomposition (HLD): Applicable.
● Sketch: We can decompose the path from u to v into HLD segments.
By tracking the lengths of these segments, we can determine which
segment the k-th node lies in and then find its exact position within that
segment. This is feasible but can be complex to implement correctly.
● Reasoning: A valid but often more complex alternative to binary lifting.
● Binary Lifting + LCA: Applicable (Canonical Method).
● Applicability: This is the standard and simplest solution.
● Sketch:
1. Preprocess for LCA and k-th ancestor using binary lifting
(O(NlogN)).
2. For a query (u, v, k), first find lca = LCA(u, v).
3. Calculate the length of the path from u to lca, which is depth[u]
- depth[lca]. Let this be len_u.
4. If k <= len_u + 1 (using 1-based k), the target node is on the
path from u to lca. The answer is the (k-1)-th ancestor of u.
5. Otherwise, the target node lies on the downward path from lca to v.
Counting lca as the 1st node of that downward path, the target is its
(k - len_u)-th node. This is equivalent to finding the (depth[v] -
depth[lca] - (k - len_u - 1))-th ancestor of v.
● Reasoning: Each query is broken down into an LCA query and one or
two k-th ancestor queries, all of which take O(logN) time.
Recommended Approach and Strategic Insights:
● Binary Lifting combined with LCA is the most straightforward and efficient
method.
● This problem shows how the binary lifting framework can be extended beyond
simple ancestor queries to solve more complex path navigation tasks.
24. Subtree Distinct Count
Problem Definition and Properties:
● Statement: For queries (u), return the number of distinct values in the
subtree of node u. The tree and its values are static. Queries can be
answered offline.
● Properties: A static, offlineable query on subtrees. The distinct count function
is not decomposable.
Algorithmic Applicability Analysis:
● Segment Tree / Fenwick Tree: Applicable (with offline tricks).
● Sketch: First, flatten the tree with an Euler tour so that the subtree of u
corresponds to the range [tin[u], tout[u]]. The problem is now
identical to Range Distinct Count on an array (Problem 8). We can use
the offline sweep-line with a BIT approach on the flattened array to
answer all queries in O((N+Q)logN).
● Reasoning: A standard reduction from a tree problem to an array
problem, followed by a standard offline algorithm.
● Mo's Algorithm: Applicable (on the Euler Tour).
● Sketch: Flatten the tree using the Euler tour (tin/tout ranges). The
problem becomes Range Distinct Count on the flattened array. We can
apply Mo's algorithm directly to these ranges.
● Reasoning: A direct application of Mo's after tree flattening.
Complexity is O((N+Q)√N).
● DSU on Tree (Sack): Applicable and Optimal for offline queries.
● Applicability: This is a specialized technique, also known as
"small-to-large merging," designed for exactly this type of offline
subtree query.
● Sketch: We perform a DFS traversal. For each node u, we recursively
solve the problem for all its children. To compute the answer for u, we
need to merge the sets of values from its children's subtrees. The
"small-to-large" trick is to identify the "heavy" child (the one with the
largest subtree). We pass the frequency map (or set) of the heavy child
up to u by reference (an O(1) pointer swap). Then, for all other "light"
children, we iterate through their nodes and merge their values into the
heavy child's map. Finally, we add u itself. The size of the map is the
answer for u.
● Reasoning: The amortized analysis shows that each node is part of a
"light" merge at most O(logN) times. This leads to a total time
complexity of O(NlogN) to answer the queries for all subtrees
simultaneously. It is often faster than Mo's algorithm.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: HLD is for path queries. Merging distinct sets across
segments of a path is inefficient.
Recommended Approach and Strategic Insights:
● For answering queries for all subtrees offline, DSU on Tree is the most
efficient solution with O(NlogN) complexity.
● Mo's on the Euler Tour is a more general-purpose offline tool that also works
well.
● This problem introduces DSU on Tree as a powerful, specialized alternative to
more general offline methods when dealing with subtree aggregation
problems.
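The small-to-large merge can be sketched as follows; this version uses Python sets and an iterative traversal to avoid recursion limits, and the function name is illustrative:

```python
def subtree_distinct_counts(n, edges, value, root=0):
    """ans[u] = number of distinct values in the subtree of u (small-to-large)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Iterative DFS order: parents appear before children.
    parent, order = [-1] * n, []
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w] = u
                stack.append(w)
    ans, sets = [0] * n, [None] * n
    for u in reversed(order):              # children are finished before u
        s = {value[u]}
        for w in adj[u]:
            if w != parent[u]:
                t = sets[w]
                if len(t) > len(s):        # always merge the smaller into the larger
                    s, t = t, s
                s |= t
                sets[w] = None             # drop the absorbed set
        sets[u] = s
        ans[u] = len(s)
    return ans
```

Each value moves between sets only when its current set is the smaller of the two, which bounds the total work by O(N log N) set insertions.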
25. Path Distinct Count
Problem Definition and Properties:
● Statement: For each query (u, v), return the number of distinct values on
the simple path between u and v. The tree is static.
● Properties: A static, offlineable query on paths.
Algorithmic Applicability Analysis:
● Mo's Algorithm: Applicable (on the Euler Tour, path variant).
● Applicability: This is the standard application of "Mo's on a tree" for
path queries.
● Sketch: The tree is flattened using an Euler tour that records both
entry (tin) and exit (tout) times. A path query (u, v) is transformed
into a range query on this flattened array. Let lca = LCA(u, v).
Assuming tin[u] < tin[v]:
● If lca == u, the path corresponds to nodes that appear exactly
once in the flattened array range [tin[u], tin[v]].
● If lca != u, the path corresponds to nodes that appear exactly
once in the range [tout[u], tin[v]], plus the LCA node itself,
which is handled separately. We can then apply Mo's algorithm
to these ranges. A special add/remove function is used that
toggles the "active" status of a node. A node is added to the
distinct count if it becomes active an odd number of times.
● Reasoning: This is a non-trivial but standard reduction that allows Mo's
algorithm to solve path queries. Complexity is O((N+Q)√N).
● Heavy-Light Decomposition (HLD): Not Applicable (Efficiently).
● Reasoning: Decomposing the path into O(logN) segments is
possible, but merging distinct counts from these segments is not an
O(1) operation. It would require merging sets, making it slow.
● DSU on Tree: Not Applicable.
● Reasoning: This technique is designed for subtree queries, not
arbitrary path queries.
Recommended Approach and Strategic Insights:
● Mo's Algorithm on a Tree is the standard offline solution for this problem.
● This problem demonstrates the full power and complexity of adapting Mo's
algorithm to tree structures. The transformation from a path query to a range
query on the Euler tour array is a key technique that enables the application of
this powerful offline algorithm to a new class of problems.
Part IV: Comprehensive Analysis of Advanced and Specialized
Scenarios
This final section addresses a set of advanced problems that often require a
synthesis of multiple techniques or involve more specialized algorithmic paradigms.
These problems represent the upper echelon of difficulty in typical query-processing
contexts.
26. Offline Dynamic Connectivity
Problem Definition and Properties:
● Statement: Given a graph (or tree) and a sequence of edge additions and
removals, answer queries of the form (u, v) asking if nodes u and v are
connected at a specific time (i.e., after a certain number of edge
modifications). All operations are provided offline.
● Properties: The problem is dynamic, but crucially, all operations are known in
advance (offline). The query is about connectivity, which is naturally handled
by a Disjoint Set Union (DSU) data structure. The challenge is handling edge
removals, as a standard DSU with path compression does not easily support
rollbacks.
Algorithmic Applicability Analysis:
● Segment Tree / Fenwick Tree: Applicable (as a framework over time).
● Applicability: These structures are not used for the connectivity logic
itself, but to structure the problem over the time dimension.
● Sketch: This is the canonical "offline dynamic connectivity" solution,
often called a Segment Tree over Time.
1. We consider the timeline of queries from 1 to Q. This timeline
becomes the base array for a Segment Tree.
2. For each edge, we determine its "lifetime" intervals. An edge
added at time t1 and removed at time t2 exists during the
interval [t1, t2−1].
3. We "add" each edge to the Segment Tree. An edge with lifetime
[t1, t2−1] is inserted into the O(logQ) nodes of the Segment Tree
that cover this time interval.
4. We use a DSU structure that supports rollbacks (i.e., one
without path compression, only union by size/rank). We perform
a DFS traversal on the Segment Tree.
5. When entering a node in the DFS, we perform the union
operations for all edges stored at that node.
6. We recurse into the children.
7. When returning from a node's DFS, we undo the union
operations performed at that node, restoring the DSU to its
previous state.
8. When we reach a leaf of the Segment Tree, which corresponds
to a single point in time, we answer all connectivity queries for
that time using the current state of the DSU.
● Reasoning: This elegant technique transforms a dynamic problem with
deletions into a series of semi-static problems at different levels of a
recursion. The complexity is roughly O((N+M+Q)logQ⋅α(N)), where
M is the number of edge updates.
● Mo's Algorithm: Not Applicable (in its standard form).
● Reasoning: Mo's algorithm is designed for array-like range queries,
not graph connectivity problems.
● Sqrt Decomposition / HLD: Not Applicable.
● Reasoning: These are not relevant to this type of dynamic graph
problem.
Recommended Approach and Strategic Insights:
● The Segment Tree over Time with a Rollback DSU is the standard and
definitive solution for offline dynamic connectivity.
● This problem introduces a powerful meta-technique: using a data structure
like a Segment Tree not on spatial data, but on the dimension of time. This
allows for a divide-and-conquer approach to problems with additions and
deletions by converting lifetimes into intervals.
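A compact sketch of the technique, assuming each edge is identified by its (u, v) endpoints written in the same order at add and remove time; the timeline encoding is an illustrative choice:

```python
class RollbackDSU:
    """Union by size, no path compression, so unions can be undone in LIFO order."""
    def __init__(self, n):
        self.p = list(range(n))
        self.sz = [1] * n
        self.hist = []
    def find(self, x):
        while self.p[x] != x:
            x = self.p[x]
        return x
    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            self.hist.append(None)         # record a no-op so rollback stays balanced
            return
        if self.sz[a] < self.sz[b]:
            a, b = b, a
        self.p[b] = a
        self.sz[a] += self.sz[b]
        self.hist.append(b)                # remember which root was attached
    def rollback(self):
        b = self.hist.pop()
        if b is not None:
            self.sz[self.p[b]] -= self.sz[b]
            self.p[b] = b

def offline_connectivity(n, timeline):
    """timeline[t] is ('add', u, v), ('remove', u, v) or ('query', u, v);
    returns the answers to the queries in order."""
    T = len(timeline)
    seg = [[] for _ in range(4 * T)]
    def insert(node, lo, hi, l, r, edge):  # attach edge to O(log T) covering nodes
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            seg[node].append(edge)
            return
        mid = (lo + hi) // 2
        insert(2 * node, lo, mid, l, r, edge)
        insert(2 * node + 1, mid + 1, hi, l, r, edge)
    open_at = {}                           # edge -> time it was last added
    for t, (op, u, v) in enumerate(timeline):
        if op == 'add':
            open_at[(u, v)] = t
        elif op == 'remove':
            insert(1, 0, T - 1, open_at.pop((u, v)), t - 1, (u, v))
    for e, t0 in open_at.items():          # edges never removed live to the end
        insert(1, 0, T - 1, t0, T - 1, e)
    dsu, ans = RollbackDSU(n), []
    def dfs(node, lo, hi):
        for u, v in seg[node]:
            dsu.union(u, v)
        if lo == hi:
            op, u, v = timeline[lo]
            if op == 'query':
                ans.append(dsu.find(u) == dsu.find(v))
        else:
            mid = (lo + hi) // 2
            dfs(2 * node, lo, mid)
            dfs(2 * node + 1, mid + 1, hi)
        for _ in seg[node]:                # undo this node's unions on the way out
            dsu.rollback()
    dfs(1, 0, T - 1)
    return ans
```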
27. Range Add, Range Sum with Offline Queries
Problem Definition and Properties:
● Statement: An array supports range_add(l, r, delta) operations. After all
updates are performed, answer a set of range_sum(l, r) queries. All
updates and queries are provided offline.
● Properties: Dynamic updates, but the offline nature allows for reordering and
batch processing.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (with Lazy Propagation).
● Reasoning: A Lazy Segment Tree can solve this problem online, so it
can certainly solve the offline version. It provides O(logN) for both
updates and queries. However, the offline constraint allows for a
simpler solution.
● Fenwick Tree (BIT): Applicable (with offline prefix trick).
● Applicability: The offline nature allows us to transform the problem
into a form that a BIT can handle.
● Sketch: Since range_sum(l, r) = prefix_sum(r) - prefix_sum(l-1),
we only need to answer prefix sum queries. A range_add(l, r,
delta) changes the prefix sum P(x) piecewise-linearly: it adds
(x−l+1)·delta for x∈[l,r] and (r−l+1)·delta for x>r. This
piecewise-linear effect can be simulated with two BITs, where BIT1
tracks the coefficient of the linear term x and BIT2 tracks the
constant term:
1. A range_add(l, r, delta) is implemented as: BIT1.add(l,
delta), BIT1.add(r+1, -delta), BIT2.add(l, -delta *
(l-1)), BIT2.add(r+1, delta * r).
2. A prefix_sum(x) is BIT1.query(x) * x + BIT2.query(x).
● Reasoning: This is a classic and efficient offline technique that uses
two BITs to simulate the effect of range additions on prefix sums.
Complexity is O((M+Q)logN) where M is number of updates.
● Mo's Algorithm / Sqrt Decomposition / HLD: Not Applicable.
● Reasoning: Not suited for this problem structure.
Recommended Approach and Strategic Insights:
● For the online version, Lazy Segment Tree is the solution.
● For the offline version, the two-BIT difference array technique is a very
clever and efficient method.
● This problem demonstrates that even for problems with a standard online
solution (Lazy SegTree), the offline constraint can open the door to
alternative, often simpler or faster (in terms of constant factors) solutions.
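The two-BIT scheme can be sketched as follows (1-based indices; class names are illustrative). Note this version happens to work online as well; the offline framing merely guarantees the index range is known up front:

```python
class BIT:
    def __init__(self, n):
        self.n, self.t = n, [0] * (n + 1)
    def add(self, i, d):                   # no-op when i > n, so r+1 == n+1 is safe
        while i <= self.n:
            self.t[i] += d
            i += i & -i
    def query(self, i):                    # prefix sum over [1..i]
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

class RangeAddRangeSum:
    """Two BITs: b1 holds the coefficient of x, b2 the constant term."""
    def __init__(self, n):
        self.b1, self.b2 = BIT(n), BIT(n)
    def range_add(self, l, r, delta):      # 1-based inclusive
        self.b1.add(l, delta)
        self.b1.add(r + 1, -delta)
        self.b2.add(l, -delta * (l - 1))
        self.b2.add(r + 1, delta * r)
    def prefix_sum(self, x):
        return self.b1.query(x) * x + self.b2.query(x)
    def range_sum(self, l, r):
        return self.prefix_sum(r) - self.prefix_sum(l - 1)
```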
28. Range GCD with Updates (Advanced)
Problem Definition and Properties:
● Statement: Support update(i, val) and gcd(l, r) queries on an array A.
● Properties: This is the dynamic version of Problem 3. The standard solution
is a Segment Tree. This "advanced" version refers to a specific clever
optimization.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (Standard and Advanced).
● Standard Sketch: A Segment Tree storing GCDs in each node works
perfectly fine, with O(logN⋅logV) complexity for both operations.
● Advanced Sketch (on Differences): A cleverer approach builds a
Segment Tree on a difference array, exploiting the identity
gcd(x,y)=gcd(x,y−x). Applying it repeatedly gives:
● gcd(A[l…r])=gcd(A[l],A[l+1]−A[l],A[l+2]−A[l+1],…,A[r]−A
[r−1])
● Let B[i]=∣A[i]−A[i−1]∣ be the array of adjacent differences. A
query gcd(l, r) can then be computed as
gcd(A[l],range_gcd(B[l+1…r])). A point update update(i, val)
affects A[i], and also the differences B[i] and B[i+1]. We can
maintain the array A separately and build a Segment Tree on the
difference array B. An update now becomes one point update on A and
two point updates on the Segment Tree for B. A query becomes one
point access on A and one range query on the Segment Tree for B.
● Reasoning: This transformation is sometimes used in specific
contexts, but for simple point updates, the standard Segment Tree on
original values is just as efficient and much simpler. This technique
becomes more powerful for range updates (e.g., range add delta),
where a range add on A[l..r] becomes two point updates on the
difference array B (at l and r+1).
● All other algorithms: The applicability remains the same as in Problem 3.
Recommended Approach and Strategic Insights:
● A standard Segment Tree on the original values is the most straightforward
and efficient solution for point updates.
● The Segment Tree on differences is a powerful technique to have in mind,
as it transforms range additions into point updates, which can be a significant
optimization.
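A sketch of the difference-array variant supporting range adds: a Fenwick tree over the differences recovers A[l] as a prefix sum, and an iterative segment tree over B maintains the GCDs, so a range add touches only two entries of each structure. The class name and layout are illustrative:

```python
from math import gcd

def g(x, y):
    return gcd(abs(x), abs(y))             # gcd is sign-insensitive

class DiffGCD:
    """Range-add / range-gcd via gcd(A[l..r]) = gcd(A[l], B[l+1], ..., B[r]),
    where B[i] = A[i] - A[i-1]."""
    def __init__(self, a):
        self.n = n = len(a)
        self.fen = [0] * (n + 1)           # Fenwick over diffs: A[i] = prefix sum
        self.seg = [0] * (2 * n)           # segment tree over B (leaf 0 unused)
        for i in range(n):
            d = a[i] - (a[i - 1] if i else 0)
            self._fen_add(i + 1, d)
            if i:
                self.seg[n + i] = d
        for i in range(n - 1, 0, -1):
            self.seg[i] = g(self.seg[2 * i], self.seg[2 * i + 1])
    def _fen_add(self, i, d):
        while i <= self.n:
            self.fen[i] += d
            i += i & -i
    def point(self, i):                    # current A[i], 0-based
        i, s = i + 1, 0
        while i:
            s += self.fen[i]
            i -= i & -i
        return s
    def _seg_add(self, i, d):              # B[i] += d, then fix ancestors
        i += self.n
        self.seg[i] += d
        i //= 2
        while i:
            self.seg[i] = g(self.seg[2 * i], self.seg[2 * i + 1])
            i //= 2
    def range_add(self, l, r, d):          # A[l..r] += d: two point updates on B
        self._fen_add(l + 1, d)
        if l:
            self._seg_add(l, d)
        if r + 1 < self.n:
            self._fen_add(r + 2, -d)
            self._seg_add(r + 1, -d)
    def range_gcd(self, l, r):             # gcd(A[l], B[l+1..r])
        res = abs(self.point(l))
        lo, hi = l + 1 + self.n, r + 1 + self.n
        while lo < hi:
            if lo & 1:
                res = g(res, self.seg[lo]); lo += 1
            if hi & 1:
                hi -= 1; res = g(res, self.seg[hi])
            lo //= 2; hi //= 2
        return res
```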
29. Dynamic Order Statistics (online k-th after updates)
Problem Definition and Properties:
● Statement: Support two online operations: update(i, val) and query(l, r,
k) to find the k-th smallest element in A[l..r].
● Properties: This is the fully dynamic, online version of Problem 6. It is one of
the most challenging standard query problems.
Algorithmic Applicability Analysis:
● Segment Tree: Applicable (of ordered sets/Fenwick Trees).
● Applicability: This requires a heavily augmented Segment Tree.
● Sketch: We build a Segment Tree over the array indices. Each node in
this Segment Tree, instead of storing a simple value, stores another
data structure that represents the collection of values in its range and
can answer order statistic queries. This inner data structure could be a
balanced binary search tree, a std::set or std::multiset in C++, or
even another Fenwick Tree on compressed values.
● An update(i, val) involves traversing to the leaf for index i
and updating the inner data structure in every node on that path.
This takes O(logN⋅logV) or O(log²N).
● A query(l, r, k) involves a binary search over the possible
values x. To check a value x, we count how many elements in
[l,r] are ≤x. This count is found by performing a range query on
the main Segment Tree, and at each of the O(logN) covering
nodes, querying its inner data structure for count <= x. The total
query time is O(logV⋅log²N). A more direct "walking" query on
the Segment Tree can achieve O(log²N).
● Reasoning: This is a very heavy but powerful online solution. It is often
called a "2D data structure" conceptually.
● Fenwick Tree (BIT): Applicable (of ordered sets).
● Sketch: Similar to the Segment Tree approach, we can have a
Fenwick Tree where each node contains a balanced BST. This is often
called a Fenwick-of-values or Fenwick-of-trees. The complexity is
similar, O(log²N) for operations.
● Reasoning: A valid but complex alternative to the Segment Tree of
ordered sets.
● Mo's Algorithm: Not Applicable.
● Reasoning: The problem is online and dynamic.
● Square Root Decomposition: Possible.
● Sketch: Divide the array into blocks. Each block maintains a sorted list
or balanced BST of its elements. An update takes O(√N) or
O(√N logN) to rebuild the block's structure. A query involves binary
searching on the answer, and for each guess, counting elements in
O(√N log√N) time.
● Reasoning: A simpler but slower O(Q√N logV) approach.
● Heavy-Light Decomposition (HLD): Not Applicable.
● Reasoning: Array-based problem.
Recommended Approach and Strategic Insights:
● The Segment Tree of balanced BSTs/Fenwick Trees is the standard
(though complex) online solution.
● This problem sits at the boundary of what is practical in a typical programming
contest. It requires composing data structures, leading to high complexities
and implementation effort. It serves as a good example of how problems can
be solved by adding "dimensions" to data structures (a tree over indices,
where each node is a tree over values).
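Of the options above, the square-root decomposition is the simplest to sketch: sorted blocks plus a binary search on the answer value. This version assumes integer values; the class name is illustrative:

```python
from bisect import bisect_right, insort

class SqrtKth:
    """Blocks keep their elements sorted; queries binary-search on the answer value."""
    def __init__(self, a):
        self.a = list(a)
        self.B = max(1, int(len(a) ** 0.5))
        self.blocks = [sorted(self.a[i:i + self.B])
                       for i in range(0, len(a), self.B)]
    def update(self, i, val):
        blk = self.blocks[i // self.B]
        blk.remove(self.a[i])              # O(sqrt N) removal from the sorted block
        insort(blk, val)
        self.a[i] = val
    def _count_le(self, l, r, x):
        """How many of A[l..r] are <= x: partial blocks scanned, full blocks bisected."""
        bl, br = l // self.B, r // self.B
        if bl == br:
            return sum(1 for v in self.a[l:r + 1] if v <= x)
        cnt = sum(1 for v in self.a[l:(bl + 1) * self.B] if v <= x)
        cnt += sum(1 for v in self.a[br * self.B:r + 1] if v <= x)
        for b in range(bl + 1, br):
            cnt += bisect_right(self.blocks[b], x)
        return cnt
    def kth_smallest(self, l, r, k):
        """1-based k: smallest value v with at least k elements <= v in A[l..r]."""
        lo, hi = min(self.a), max(self.a)
        while lo < hi:
            mid = (lo + hi) // 2
            if self._count_le(l, r, mid) >= k:
                hi = mid
            else:
                lo = mid + 1
        return lo
```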
30. Tree Path k-th Smallest Query
Problem Definition and Properties:
● Statement: Given a tree with values on its nodes, answer queries (u, v, k):
find the k-th smallest value on the simple path between u and v. The tree is
static.
● Properties: A static, online order statistic query on a tree path.
Algorithmic Applicability Analysis:
● Heavy-Light Decomposition (HLD): Applicable (with a Merge Sort Tree or
similar).
● Sketch: We use HLD to decompose the path u-v into O(logN)
segments. The problem is now to find the k-th smallest element in a
collection of O(logN) ranges. This is a difficult merge step. We could
build a Merge Sort Tree on the HLD-linearized array. A query would
then involve querying O(logN) ranges on this Merge Sort Tree and
combining the results to find the global k-th element, which is complex.
● Persistent Segment Tree (with LCA): Applicable (Canonical Solution).
● Applicability: This is the standard and most elegant solution.
● Sketch: This solution combines the ideas from Range Median Query
(Problem 14) and path queries on trees.
1. We build a Persistent Segment Tree on the values of the nodes,
but we build it according to the tree structure. We perform a DFS
from the root. For each node u, we create a new version of the
Segment Tree by taking its parent's version and updating it with
value[u]. So, root[u] points to the root of a Segment Tree
representing the frequency map of values on the path from the
root of the tree to u.
2. A query (u, v, k) is answered using four segment tree
versions: root[u], root[v], root[lca(u,v)], and
root[parent(lca(u,v))].
3. The multiset of values on the path u-v equals (values on path
root-u) + (values on path root-v) − (values on path root-lca) −
(values on path root-parent(lca)): the first subtraction removes
one copy of everything up to and including the lca, and the
second removes the remaining copy of the nodes strictly above it.
The count of a value x on the path is therefore count(x, root[u])
+ count(x, root[v]) - count(x, root[lca]) - count(x,
root[parent(lca)]). We can find the k-th element by walking down
the four trees simultaneously in O(logV) time.
● Reasoning: This is a beautiful combination of persistence and tree
algorithms, providing an efficient O(logV) online query after O(NlogV)
preprocessing.
● Mo's Algorithm: Applicable (on a tree).
● Sketch: We can apply Mo's on a tree (Problem 25) and use a
secondary data structure (like a BIT on values) to answer the k-th
smallest query within the active set of nodes.
● Reasoning: A viable offline solution, but with a high complexity of
O((N+Q)√N logV).
Recommended Approach and Strategic Insights:
● The Persistent Segment Tree built on the tree structure is the canonical
online solution.
● This problem is a classic example of a hard query problem that requires a
deep synthesis of ideas: persistence, segment trees, and tree traversal
algorithms (LCA). It demonstrates how persistence can be used to capture
historical or, in this case, path-based states of a data structure.
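A sketch of the persistent-tree construction and the four-version walk, using compressed values and a naive O(depth) LCA for brevity (binary lifting replaces it in practice). Class and function names are illustrative:

```python
class PST:
    """Persistent segment tree over value indices 0..m-1; nodes stored in
    parallel arrays (count, left child, right child). Node 0 is the empty tree."""
    def __init__(self, m):
        self.m = m
        self.cnt, self.lc, self.rc = [0], [0], [0]
    def insert(self, prev, pos):
        def go(node, lo, hi):
            new = len(self.cnt)            # path-copy: O(log m) new nodes
            self.cnt.append(self.cnt[node] + 1)
            self.lc.append(self.lc[node])
            self.rc.append(self.rc[node])
            if lo < hi:
                mid = (lo + hi) // 2
                if pos <= mid:
                    self.lc[new] = go(self.lc[node], lo, mid)
                else:
                    self.rc[new] = go(self.rc[node], mid + 1, hi)
            return new
        return go(prev, 0, self.m - 1)

def path_kth_smallest(n, edges, value, queries):
    """queries are (u, v, k) with 1-based k; nodes are 0-indexed, node 0 is the root."""
    vals = sorted(set(value))
    comp = {v: i for i, v in enumerate(vals)}
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    pst = PST(len(vals))
    ver, parent, depth = [0] * n, [-1] * n, [0] * n
    seen = [False] * n
    seen[0] = True
    stack = [0]
    while stack:                           # preorder: the parent's version exists first
        u = stack.pop()
        base = ver[parent[u]] if parent[u] != -1 else 0
        ver[u] = pst.insert(base, comp[value[u]])
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w], depth[w] = u, depth[u] + 1
                stack.append(w)
    def lca(u, v):
        while depth[u] > depth[v]: u = parent[u]
        while depth[v] > depth[u]: v = parent[v]
        while u != v: u, v = parent[u], parent[v]
        return u
    ans = []
    for u, v, k in queries:
        a = lca(u, v)
        tp = ver[parent[a]] if parent[a] != -1 else 0
        tu, tv, ta = ver[u], ver[v], ver[a]
        lo, hi = 0, len(vals) - 1
        while lo < hi:                     # walk the four versions in lock-step
            mid = (lo + hi) // 2
            left = (pst.cnt[pst.lc[tu]] + pst.cnt[pst.lc[tv]]
                    - pst.cnt[pst.lc[ta]] - pst.cnt[pst.lc[tp]])
            if k <= left:
                tu, tv, ta, tp = pst.lc[tu], pst.lc[tv], pst.lc[ta], pst.lc[tp]
                hi = mid
            else:
                k -= left
                tu, tv, ta, tp = pst.rc[tu], pst.rc[tv], pst.rc[ta], pst.rc[tp]
                lo = mid + 1
        ans.append(vals[lo])
    return ans
```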
Conclusion: A Decision Framework for Algorithm Selection
The preceding analysis of 30 distinct query problems demonstrates that the selection
of an optimal algorithm is a structured process, not an arbitrary choice. It is governed
by a clear set of principles related to the problem's constraints and the mathematical
properties of the query function. By internalizing these principles, one can move from
simple pattern matching to a more foundational, first-principles approach to
algorithmic problem-solving. This concluding section synthesizes these findings into
a practical decision-making framework and a comprehensive summary table to aid in
study and review.
A Decision-Making Flowchart
When confronted with a new query problem, a practitioner can systematically
determine the most promising algorithmic approaches by asking a series of targeted
questions:
1. What is the underlying data domain?
● Is it a linear array? Proceed to the next question.
● Is it a tree? The problem likely requires a structural decomposition
technique.
● Does the query involve subtrees? An Euler Tour to flatten the
tree into an array is the primary approach. The problem then
becomes an array range query problem on the [tin, tout]
intervals.
● Does the query involve paths between arbitrary nodes?
Heavy-Light Decomposition is the primary candidate for online
problems. For static offline problems, Mo's Algorithm on Trees
is an alternative. For simple navigation (e.g., k-th ancestor),
Binary Lifting is optimal.
2. Are the queries Online or Offline?
● Must queries be answered as they arrive (Online)? This rules out
Mo's Algorithm and sweep-line techniques. Segment Trees and
Fenwick Trees are the primary online tools.
● Can all queries be read in advance (Offline)? This opens up
powerful possibilities.
● Is the array static and the query involves non-decomposable
but incrementally updatable statistics (e.g., distinct count,
mode)? Mo's Algorithm is a strong candidate.
● Does the problem involve counting or summing based on a
threshold or a second dimension? A sweep-line algorithm,
often paired with a Fenwick Tree, might be more efficient.
3. Is the data Static or Dynamic?
● Is the data static (no updates)? Precomputation-heavy structures are
viable. For idempotent functions (min, max, gcd), a Sparse Table offers
unbeatable O(1) query time.
● Is the data dynamic (updates are involved)? The data structure
must be mutable.
● Are there point updates? Segment Trees and Fenwick Trees
are excellent choices. Sqrt Decomposition is a simpler, slower
alternative.
● Are there range updates? A Segment Tree with Lazy
Propagation is the standard and most powerful tool. The offline
two-BIT trick can handle range add/range sum.
4. What are the properties of the query function ⊕?
● Is it associative? This is the minimum requirement for a Segment
Tree.
● Is it invertible? If yes (and associative), a Fenwick Tree is likely the
simplest and fastest solution (e.g., sum, xor). If no (e.g., min, max, gcd),
a Fenwick Tree is not applicable.
● Is it idempotent? If yes (and associative, and the data is static), a
Sparse Table provides the fastest possible queries (O(1)).
By methodically proceeding through this flowchart, one can quickly narrow down the
field of potential algorithms and focus on the most appropriate candidates for the
problem at hand.
Master Summary Table
The following table provides a high-level summary of the applicability and complexity
of each of the five algorithms for the 30 problems analyzed in this report. This serves
as a dense reference for comparison and review.
● Applicability Legend:
● Rec. (Recommended): A standard, efficient, and often optimal
solution.
● Poss. (Possible): A valid but often less efficient or more complex
alternative.
● Poor Fit: Technically possible but highly inefficient or impractical.
● N/A (Not Applicable): The algorithm is fundamentally unsuited for the
problem's structure.
● Complexity: Stated for a single query/update operation. V denotes the range
of values in the array.
| # | Problem | Segment Tree | Fenwick Tree (BIT) | Mo's Algorithm | Sqrt Decomposition | HLD |
|---|---------|--------------|--------------------|----------------|--------------------|-----|
| A | Array-Based Queries | | | | | |
| 1 | Range Sum, Point Update | Rec. O(logN) | Rec. O(logN) | N/A | Poss. O(√N) | N/A |
| 2 | Range Min, Point Update | Rec. O(logN) | N/A (not invertible) | Poor Fit O(√N logN) | Poss. O(√N) | N/A |
| 3 | Range GCD, Point Update | Rec. O(logN logV) | N/A (not invertible) | Poor Fit | Poss. O(√N logV) | N/A |
| 4 | Range XOR, Point Update | Rec. O(logN) | Rec. O(logN) | Poss. O(√N) | Poss. O(√N) | N/A |
| 5 | Range Max, Range Add | Rec. (Lazy) O(logN) | N/A | N/A | Poss. O(√N) | N/A |
| 6 | K-th Smallest in Range | Rec. (Merge Sort) O(log²N) | N/A | Poss. O(√N logV) | Poor Fit | N/A |
| 7 | Range Frequency Query | Poss. (Merge Sort) O(log²N) | N/A | Rec. O(√N) | Poss. O(√N) | N/A |
| 8 | Distinct Count in Range | N/A | Rec. (Offline) O(logN) | Rec. O(√N) | Poor Fit | N/A |
| 9 | Path Sum on Tree | See Problem 17 | See Problem 17 | See Problem 17 | See Problem 17 | See Problem 17 |
| 10 | Subtree Sum / Update | See Problem 16 | See Problem 16 | See Problem 16 | See Problem 16 | See Problem 16 |
| 11 | Distinct Count in Range | See Problem 8 | See Problem 8 | See Problem 8 | See Problem 8 | See Problem 8 |
| 12 | Mode in Range | N/A (not decomposable) | N/A | Rec. O(√N) | Poor Fit | N/A |
| 13 | Range Majority Element | Rec. (Boyer-Moore) O(logN) | N/A | Poss. O(√N) | Poss. O(√N) | N/A |
| 14 | Range Median Query | Rec. (Persistent) O(logV) | Poss. (Offline) O(logN logV) | Poor Fit O(√N logN) | Poor Fit | N/A |
| 15 | Range Inversion Count | N/A | Rec. (in Mo's) | Rec. O(√N logV) | N/A | N/A |
| B | Tree Queries | | | | | |
| 16 | Subtree Sum, Point Update | Rec. (Euler) O(logN) | Rec. (Euler) O(logN) | N/A | Poss. O(√N) | Poss. O(logN) |
| 17 | Path Sum, Point Update | Rec. (in HLD) | Rec. (in HLD) | Poss. (Static) O(√N) | N/A | Rec. O(log²N) |
| 18 | Subtree Min/Max, Update | Rec. (Euler) O(logN) | N/A | N/A | Poss. O(√N) | Poss. O(logN) |
| 19 | Path Max, Range Add | Rec. (in HLD) | N/A | N/A | N/A | Rec. (Lazy) O(log²N) |
| 20 | Path XOR, Point Update | Rec. (in HLD) | Rec. (in HLD) | Poss. (Static) O(√N) | N/A | Rec. O(log²N) |
| 21 | Count > T on Path | Poss. (in HLD) | Rec. (in HLD Sweep) | Poss. (Static) O(√N) | N/A | Rec. (Offline) O(log²N) |
| 22 | K-th Ancestor | N/A | N/A | N/A | N/A | Poor Fit |
| 23 | K-th Node on Path | N/A | N/A | N/A | N/A | Poss. O(logN) |
| 24 | Subtree Distinct Count | Rec. (in Offline Sweep) | Rec. (in Offline Sweep) | Rec. (Euler) O(√N) | N/A | N/A |
| 25 | Path Distinct Count | N/A | N/A | Rec. (Tree) O(√N) | N/A | N/A |
| C | Advanced Scenarios | | | | | |
| 26 | Offline Dynamic Connectivity | Rec. (over Time) | N/A | N/A | N/A | N/A |
| 27 | Range Add, Range Sum (Offline) | Rec. (Lazy) O(logN) | Rec. (2-BITs) O(logN) | N/A | N/A | N/A |
| 28 | Range GCD with Updates | Rec. O(logN logV) | N/A | N/A | N/A | N/A |
| 29 | Dynamic Order Statistics | Rec. (of BSTs) O(log²N) | Poss. (of BSTs) O(log²N) | N/A | Poss. O(√N logV) | N/A |
| 30 | Tree Path k-th Smallest | Rec. (Persistent) O(logV) | N/A | Poss. (Tree) O(√N logV) | N/A | Rec. (with PST) O(log²N) |