Algo: Euler Tour Technique for Tree Queries
Algo: Euler Tour Technique for Tree Queries
Efficiently querying subtrees and paths in tree structures is a common challenge in algorithm design, from network analysis to competitive programming. Naive methods often lead to slow queries, but the Euler tour technique transforms a tree into a linear array, enabling operations using data structures like segment trees. By mastering this approach, you can handle complex tree queries with elegance and speed, making it a cornerstone for advanced graph algorithms.
Understanding Euler Tour Linearization
At its core, the Euler tour technique is a form of depth-first search (DFS) that "flattens" a tree into an array. Imagine walking along the edges of a tree, starting from the root, and recording each node every time you visit it—both when you enter and when you exit after exploring its children. This traversal produces an Euler tour array, a linear sequence where every node appears multiple times. The key insight is that for any node , all nodes in its subtree appear consecutively between its first entry and last exit in the array. This contiguous mapping turns subtree operations into range queries on an array.
Consider a simple tree with root 1 and children 2 and 3. A DFS traversal might yield an Euler tour array like , where each node's subtree corresponds to a slice: for node 2, the range (using 1-based indexing) contains its subtree. This linearization is powerful because it allows you to apply array-based data structures. You'll typically store two indices per node: for its first occurrence and for its last, defining the inclusive range in the Euler array that represents subtree .
Constructing the Euler Tour Array
Building the Euler tour array requires a straightforward DFS. You initialize an empty array and a counter . For each node , you record and append to upon entry. Then, you recursively visit all children of . After processing children, you record and optionally append again upon exit, depending on the variant. The two common variants are:
- Euler Tour (ET): Append node on entry and exit, so each edge is traversed twice.
- Euler Tour Tree (ETT): Often used for link-cut trees, but for basic queries, the entry-only variant suffices if weights are handled carefully.
Here is a step-by-step Python-style pseudocode for the entry-exit variant:
E = [] # Euler tour array
start = [0] * (n+1)
end = [0] * (n+1)
time = 0
def dfs(u, parent):
global time
time += 1
start[u] = time
E.append(u)
for v in children[u]:
if v != parent:
dfs(v, u)
end[u] = time
E.append(u) # optional exit appendIn practice, for subtree queries, you might only need the entry times, as marks the last node in the subtree. The array based on entry times has size , and to covers all nodes in 's subtree contiguously. This construction runs in time and space, laying the groundwork for efficient queries.
Handling Subtree Queries with Segment Trees
With the tree linearized, you can answer subtree queries—like sum or minimum of node values—by performing range queries on the Euler array. This is where data structures like segment trees or Fenwick trees shine, offering query and update times. First, map each node to its value in the Euler array: for subtree queries, use an array where stores the value of node . Then, build a segment tree over .
For a subtree sum query on node , you query the segment tree over the range to get the total sum of values in subtree . Similarly, for a subtree minimum query, you query for the minimum in that range. Updates, such as changing the value of a node , are equally efficient: simply update in the segment tree. This transforms tree problems into familiar array problems.
For example, suppose you have node values: . After Euler tour with entry-only, corresponding to starts . For subtree sum at root 1, query range gives . A segment tree allows answering this in after preprocessing. The process is:
- Build Euler tour to get and arrays.
- Create array of size with .
- Build a segment tree on for operations like sum or min.
- For subtree query on , call
segment_tree.query(start[u], end[u]).
This approach is versatile and can be extended to other associative operations like maximum or XOR.
Extending to Path Queries via Binary Lifting
While subtree queries are straightforward, path queries—such as finding the sum or minimum along the path between two nodes—require combining Euler tour with binary lifting for lowest common ancestor (LCA). Binary lifting precomputes ancestors at powers of two to find LCA in , which helps break paths into manageable segments.
To handle path queries, you can use the Euler tour array in conjunction with a segment tree that supports range queries. However, for path queries, you need to consider the path from node to node , which isn't contiguous in the Euler array. Instead, you can decompose the path into two parts: from to LCA(u,v) and from LCA(u,v) to . By storing node depths or using techniques like heavy-light decomposition, you can adapt the segment tree.
A common method is to use the Euler tour for subtree queries and binary lifting separately for path queries. For instance, to compute the minimum value on the path from to :
- Find LCA of and using binary lifting.
- Break the path into upward segments: to LCA and to LCA.
- Since these segments aren't contiguous in the Euler array, you might use a sparse table or segment tree on the array of nodes sorted by depth, but a simpler approach is to use binary lifting to aggregate values along the path.
Alternatively, you can store additional data in the segment tree based on depth. In practice, for pure path queries, Euler tour is often combined with other techniques. The key takeaway is that binary lifting enables efficient path navigation, while Euler tour optimizes subtree operations. Together, they provide a comprehensive toolkit for tree queries.
Common Pitfalls
When implementing the Euler tour technique, several mistakes can lead to incorrect results or inefficiencies. Here are key pitfalls and how to avoid them:
- Incorrect Range Calculation for Subtrees: Using and as the range for subtree is correct only if is properly updated during DFS. Ensure that is set after processing all children, typically as the current time value. A common error is to set before DFS calls, which truncates the range. Always update post-traversal.
- Misalignment of Node Values in the Array: When building array for segment tree, you must place node values at positions . If you use the full Euler array with duplicate entries, values might appear multiple times, skewing sums or mins. For subtree queries, use the entry-only variant and map to . For path queries, ensure consistency in indexing.
- Inefficient Query Handling for Paths: Attempting to use the Euler array directly for path queries without LCA can lead to complexity. Always combine with binary lifting or similar methods to break paths into segments. Don't assume the Euler array magically handles all query types; understand its limitations.
- Overlooking Update Operations: When node values change, you must update the segment tree at . Forgetting this can stale queries. In dynamic scenarios, use a segment tree with point updates, and always synchronize with the new value.
Summary
- The Euler tour technique linearizes a tree into an array via DFS, where each subtree maps to a contiguous range defined by and .
- This linearization enables efficient subtree queries (e.g., sum or minimum) using array data structures like segment trees, reducing query time to .
- Implementation involves constructing the Euler tour array in time, then building a segment tree over node values positioned by indices.
- For path queries between two nodes, combine Euler tour with binary lifting to find LCA and decompose paths, maintaining logarithmic complexity.
- Avoid pitfalls such as incorrect range calculation and value misalignment by carefully tracking indices during DFS and updates.