Algo: Euler Tour Technique for Tree Queries

Efficiently querying subtrees and paths in tree structures is a common challenge in algorithm design, from network analysis to competitive programming. Naive methods often lead to slow $O (N)$ queries, but the Euler tour technique transforms a tree into a linear array, enabling $O (lo g N)$ operations using data structures like segment trees. By mastering this approach, you can handle complex tree queries with elegance and speed, making it a cornerstone for advanced graph algorithms.

Understanding Euler Tour Linearization

At its core, the Euler tour technique is a form of depth-first search (DFS) that "flattens" a tree into an array. Imagine walking along the edges of a tree, starting from the root, and recording each node every time you visit it—both when you enter and when you exit after exploring its children. This traversal produces an Euler tour array, a linear sequence where every node appears multiple times. The key insight is that for any node $u$ , all nodes in its subtree appear consecutively between its first entry and last exit in the array. This contiguous mapping turns subtree operations into range queries on an array.

Consider a simple tree with root 1 and children 2 and 3. A DFS traversal might yield an Euler tour array like $[1, 2, 2, 3, 3, 1]$ , where each node's subtree corresponds to a slice: for node 2, the range $[2, 3]$ (using 1-based indexing) contains its subtree. This linearization is powerful because it allows you to apply array-based data structures. You'll typically store two indices per node: $s t a r t [u]$ for its first occurrence and $e n d [u]$ for its last, defining the inclusive range $[s t a r t [u], e n d [u]]$ in the Euler array that represents subtree $u$ .

Constructing the Euler Tour Array

Building the Euler tour array requires a straightforward DFS. You initialize an empty array $E$ and a counter $t im e = 0$ . For each node $u$ , you record $s t a r t [u] = + + t im e$ and append $u$ to $E$ upon entry. Then, you recursively visit all children of $u$ . After processing children, you record $e n d [u] = t im e$ and optionally append $u$ again upon exit, depending on the variant. The two common variants are:

Euler Tour (ET): Append node on entry and exit, so each edge is traversed twice.
Euler Tour Tree (ETT): Often used for link-cut trees, but for basic queries, the entry-only variant suffices if weights are handled carefully.

Here is a step-by-step Python-style pseudocode for the entry-exit variant:

E = []  # Euler tour array
start = [0] * (n+1)
end = [0] * (n+1)
time = 0

def dfs(u, parent):
    global time
    time += 1
    start[u] = time
    E.append(u)
    for v in children[u]:
        if v != parent:
            dfs(v, u)
    end[u] = time
    E.append(u)  # optional exit append

In practice, for subtree queries, you might only need the entry times, as $e n d [u]$ marks the last node in the subtree. The array $E$ based on entry times has size $N$ , and $s t a r t [u]$ to $e n d [u]$ covers all nodes in $u$ 's subtree contiguously. This construction runs in $O (N)$ time and space, laying the groundwork for efficient queries.

Handling Subtree Queries with Segment Trees

With the tree linearized, you can answer subtree queries—like sum or minimum of node values—by performing range queries on the Euler array. This is where data structures like segment trees or Fenwick trees shine, offering $O (lo g N)$ query and update times. First, map each node $u$ to its value in the Euler array: for subtree queries, use an array $A$ where $A [s t a r t [u]]$ stores the value of node $u$ . Then, build a segment tree over $A$ .

For a subtree sum query on node $u$ , you query the segment tree over the range $[s t a r t [u], e n d [u]]$ to get the total sum of values in subtree $u$ . Similarly, for a subtree minimum query, you query for the minimum in that range. Updates, such as changing the value of a node $u$ , are equally efficient: simply update $A [s t a r t [u]]$ in the segment tree. This transforms tree problems into familiar array problems.

For example, suppose you have node values: $v a l [1] = 5, v a l [2] = 3, v a l [3] = 7$ . After Euler tour with entry-only, $A = [5, 3, 7]$ corresponding to starts $[1, 2, 3]$ . For subtree sum at root 1, query range $[1, 3]$ gives $15$ . A segment tree allows answering this in $O (lo g N)$ after $O (N)$ preprocessing. The process is:

Build Euler tour to get $s t a r t$ and $e n d$ arrays.
Create array $A$ of size $N$ with $A [s t a r t [u]] = v a l [u]$ .
Build a segment tree on $A$ for operations like sum or min.
For subtree query on $u$ , call segment_tree.query(start[u], end[u]).

This approach is versatile and can be extended to other associative operations like maximum or XOR.

Extending to Path Queries via Binary Lifting

While subtree queries are straightforward, path queries—such as finding the sum or minimum along the path between two nodes—require combining Euler tour with binary lifting for lowest common ancestor (LCA). Binary lifting precomputes ancestors at powers of two to find LCA in $O (lo g N)$ , which helps break paths into manageable segments.

To handle path queries, you can use the Euler tour array in conjunction with a segment tree that supports range queries. However, for path queries, you need to consider the path from node $u$ to node $v$ , which isn't contiguous in the Euler array. Instead, you can decompose the path into two parts: from $u$ to LCA(u,v) and from LCA(u,v) to $v$ . By storing node depths or using techniques like heavy-light decomposition, you can adapt the segment tree.

A common method is to use the Euler tour for subtree queries and binary lifting separately for path queries. For instance, to compute the minimum value on the path from $u$ to $v$ :

Find LCA of $u$ and $v$ using binary lifting.
Break the path into upward segments: $u$ to LCA and $v$ to LCA.
Since these segments aren't contiguous in the Euler array, you might use a sparse table or segment tree on the array of nodes sorted by depth, but a simpler approach is to use binary lifting to aggregate values along the path.

Alternatively, you can store additional data in the segment tree based on depth. In practice, for pure path queries, Euler tour is often combined with other techniques. The key takeaway is that binary lifting enables efficient path navigation, while Euler tour optimizes subtree operations. Together, they provide a comprehensive toolkit for tree queries.

Common Pitfalls

When implementing the Euler tour technique, several mistakes can lead to incorrect results or inefficiencies. Here are key pitfalls and how to avoid them:

Incorrect Range Calculation for Subtrees: Using $s t a r t [u]$ and $e n d [u]$ as the range for subtree $u$ is correct only if $e n d [u]$ is properly updated during DFS. Ensure that $e n d [u]$ is set after processing all children, typically as the current time value. A common error is to set $e n d [u]$ before DFS calls, which truncates the range. Always update $e n d [u]$ post-traversal.

Misalignment of Node Values in the Array: When building array $A$ for segment tree, you must place node values at positions $s t a r t [u]$ . If you use the full Euler array with duplicate entries, values might appear multiple times, skewing sums or mins. For subtree queries, use the entry-only variant and map $v a l [u]$ to $A [s t a r t [u]]$ . For path queries, ensure consistency in indexing.

Inefficient Query Handling for Paths: Attempting to use the Euler array directly for path queries without LCA can lead to $O (N)$ complexity. Always combine with binary lifting or similar methods to break paths into $O (lo g N)$ segments. Don't assume the Euler array magically handles all query types; understand its limitations.

Overlooking Update Operations: When node values change, you must update the segment tree at $s t a r t [u]$ . Forgetting this can stale queries. In dynamic scenarios, use a segment tree with point updates, and always synchronize $A [s t a r t [u]]$ with the new value.

Summary

The Euler tour technique linearizes a tree into an array via DFS, where each subtree maps to a contiguous range defined by $s t a r t [u]$ and $e n d [u]$ .
This linearization enables efficient subtree queries (e.g., sum or minimum) using array data structures like segment trees, reducing query time to $O (lo g N)$ .
Implementation involves constructing the Euler tour array in $O (N)$ time, then building a segment tree over node values positioned by $s t a r t$ indices.
For path queries between two nodes, combine Euler tour with binary lifting to find LCA and decompose paths, maintaining logarithmic complexity.
Avoid pitfalls such as incorrect range calculation and value misalignment by carefully tracking indices during DFS and updates.

Algo: Euler Tour Technique for Tree Queries

Algo: Euler Tour Technique for Tree Queries

Understanding Euler Tour Linearization

Constructing the Euler Tour Array

Handling Subtree Queries with Segment Trees

Extending to Path Queries via Binary Lifting

Common Pitfalls

Summary

Write better notes with AI