Graph Interview Patterns

Mindli AI

Understanding how to solve problems on graphs is a non-negotiable skill for technical interviews. While you may not be asked to implement a complex algorithm from scratch, you will consistently face problems that map onto a handful of fundamental graph patterns. Mastering these patterns transforms a daunting graph problem into a structured application of a known technique, allowing you to focus on modeling the problem correctly and writing clean, efficient code.

Graph Representation and Problem Modeling

The first and often most critical step is recognizing that a problem is a graph problem and choosing the right representation. Many interview problems are not presented with obvious nodes and edges; you must identify the underlying graph structure. For example, a social network (users and friendships), a maze (cells and passages), or a set of tasks with dependencies are all graphs in disguise.

The two primary representations are the adjacency matrix and the adjacency list. For interview problems, the adjacency list is almost always the correct choice: it gives O(1) access to a node's neighbor list and uses O(V + E) space, which is efficient for the sparse graphs common in interviews. Your mental model should be: a node (or vertex) is a key, and its value is a list of the nodes it connects to. Effective visited tracking is also fundamental; use a set, array, or map to mark nodes you've already processed to avoid infinite loops, especially in graphs with cycles.

# Typical adjacency list representation for an undirected graph
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C']
}

visited = set() # Crucial for avoiding re-processing
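In practice, interview inputs usually arrive as an edge list rather than a ready-made adjacency list, so the first few lines of your solution often build one. A minimal sketch (`build_graph` is an illustrative name, not a standard API):

```python
from collections import defaultdict

def build_graph(edges, directed=False):
    """Build an adjacency list from a list of (u, v) edge pairs."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        if not directed:
            graph[v].append(u)  # Undirected: add the reverse edge too
    return graph

# The same graph as above, built from its edge list
g = build_graph([('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')])
```

Using `defaultdict(list)` also handles nodes you haven't seen yet without explicit key checks.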

Core Pattern 1: Connected Components and Union-Find

A connected component is a subgraph where any two vertices are connected by a path. Problems asking for the number of isolated groups, checking if all nodes are connected, or dynamically connecting components point to this pattern. You can solve this with either Depth-First Search (DFS) or Breadth-First Search (BFS) by counting how many times you need to start a new traversal to cover all nodes.

However, for problems involving dynamic connectivity (e.g., "given a list of connections, determine if two points are connected at any time"), the union-find (or Disjoint Set Union) data structure is optimal. It provides near-constant time operations for union (connecting two components) and find (determining which component a node belongs to). The pattern involves initializing each node as its own parent, then using find with path compression and union by rank to merge sets efficiently.

# Union-Find skeleton
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x]) # Path compression
        return self.parent[x]

    def union(self, x, y):
        rootX, rootY = self.find(x), self.find(y)
        if rootX != rootY:
            # Union by rank
            if self.rank[rootX] < self.rank[rootY]:
                self.parent[rootX] = rootY
            elif self.rank[rootX] > self.rank[rootY]:
                self.parent[rootY] = rootX
            else:
                self.parent[rootY] = rootX
                self.rank[rootX] += 1
            return True # Components were merged
        return False # Already in the same component
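As one common application of this idea, you can count connected components by starting with n singleton sets and decrementing a counter on every successful union. A self-contained sketch (using iterative path halving rather than the recursive compression above):

```python
def count_components(n, edges):
    """Count connected components among nodes 0..n-1 given undirected edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # Path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1  # Each successful union merges two groups
    return components

count_components(5, [(0, 1), (1, 2), (3, 4)])  # → 2: {0,1,2} and {3,4}
```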

Core Pattern 2: Shortest Paths and Level-Order Traversal with BFS

For unweighted graphs, Breadth-First Search (BFS) is the workhorse for finding the shortest path between two nodes. BFS explores a graph layer by layer, guaranteeing that the first time you reach a node, you've done so via the shortest number of edges. This makes it perfect for "minimum steps," "degree of separation," or "shortest transformation sequence" problems.

The standard implementation uses a queue. You start by enqueuing the source node with a distance of 0. Then, while the queue is not empty, you dequeue a node, iterate through its unvisited neighbors, mark them as visited, set their distance, and enqueue them. The distance to the target, when found, is your answer. Remember to track the level or distance for each node, often by storing it alongside the node in the queue or in a separate distance array.

from collections import deque

def bfs_shortest_path(graph, start, target):
    if start == target:
        return 0
    queue = deque([start])
    visited = {start}
    distance = {start: 0}

    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                if neighbor == target:
                    return distance[node] + 1
                visited.add(neighbor)
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return -1 # No path exists
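The same pattern applies when the graph is implicit, as in the maze example from the modeling section: grid cells are nodes and adjacent open cells are edges. A sketch under those assumptions (`shortest_path_grid` is an illustrative name; 0 marks an open cell, 1 a wall):

```python
from collections import deque

def shortest_path_grid(grid, start, target):
    """Minimum number of steps from start to target in a 2D grid, or -1."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])  # Store (cell, distance) pairs together
    visited = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == target:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-directional moves
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in visited):
                visited.add((nr, nc))  # Mark on discovery, before enqueuing
                queue.append(((nr, nc), dist + 1))
    return -1

grid = [[0, 0, 1],
        [1, 0, 1],
        [1, 0, 0]]
shortest_path_grid(grid, (0, 0), (2, 2))  # → 4
```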

Core Pattern 3: Cycle Detection and DFS Coloring

Cycle detection is crucial for many applications, from deadlock detection to validating graph properties. For undirected graphs, a simple DFS with a "visited" set and a "parent" reference (to avoid mistaking the edge back to where you came from as a cycle) is sufficient. For directed graphs, the classic approach is DFS coloring, also known as the three-state algorithm.

In this method, each node is in one of three states:

  • 0 (Unvisited): The node has not been processed.
  • 1 (Visiting): The node is in the current recursion stack. You have started exploring its descendants but have not finished.
  • 2 (Visited): The node and all its descendants have been fully processed.

A cycle exists if, during DFS, you encounter a neighbor whose state is 1 (Visiting). This indicates a back edge to an ancestor in the current path, forming a cycle. This pattern is directly applicable to problems like scheduling courses with prerequisites (if there's a cycle, it's impossible).

def has_cycle_directed(graph):
    state = {node: 0 for node in graph} # 0=unvisited, 1=visiting, 2=visited

    def dfs(node):
        state[node] = 1 # Mark as visiting
        for neighbor in graph.get(node, []):
            if state[neighbor] == 0:
                if dfs(neighbor):
                    return True
            elif state[neighbor] == 1: # Found a back edge!
                return True
        state[node] = 2 # Mark as fully processed
        return False

    for node in graph:
        if state[node] == 0:
            if dfs(node):
                return True
    return False
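The undirected variant mentioned above, with its parent reference, can be sketched as follows (note that this simple check assumes no parallel edges between the same pair of nodes):

```python
def has_cycle_undirected(graph):
    """Detect a cycle in an undirected adjacency-list graph via DFS."""
    visited = set()

    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:  # Visited, and not the edge we came in on
                return True
        return False

    # Check every component; the graph may be disconnected
    return any(dfs(node, None) for node in graph if node not in visited)

has_cycle_undirected({'A': ['B'], 'B': ['A', 'C'], 'C': ['B']})  # → False
```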

Core Pattern 4: Topological Ordering for Dependencies

Topological ordering is a linear ordering of a directed graph's vertices such that for every directed edge (u, v), vertex u comes before vertex v in the ordering. It's the natural solution for dependency resolution problems: "You must take course A before course B," or "Task X must complete before task Y."

There are two main algorithms: Kahn's Algorithm (BFS-based using in-degree counts) and a DFS-based approach. Kahn's Algorithm is often more intuitive: repeatedly remove nodes with zero in-degree (no unmet dependencies) from the graph, adding them to the order. If you cannot remove all nodes, a cycle exists. The DFS approach uses a variation of the cycle detection coloring to produce a reverse post-order traversal, which is a valid topological sort.

# Kahn's Algorithm (BFS-based)
from collections import deque

def topological_sort(num_courses, prerequisites):
    # Build graph and in-degree count
    graph = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1

    # Queue all nodes with zero in-degree
    queue = deque([i for i in range(num_courses) if in_degree[i] == 0])
    order = []

    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # If order contains all nodes, it's a valid topological sort
    return order if len(order) == num_courses else []
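The DFS alternative mentioned above reuses the three-state coloring and appends each node after all of its descendants are done; reversing that post-order gives a valid topological order. A sketch:

```python
def topological_sort_dfs(graph):
    """DFS-based topological sort; returns [] if the graph has a cycle."""
    state = {node: 0 for node in graph}  # 0=unvisited, 1=visiting, 2=visited
    order = []

    def dfs(node):
        state[node] = 1
        for neighbor in graph.get(node, []):
            if state[neighbor] == 1:  # Back edge: cycle, no valid order
                return False
            if state[neighbor] == 0 and not dfs(neighbor):
                return False
        state[node] = 2
        order.append(node)  # Post-order: added only after all descendants
        return True

    for node in graph:
        if state[node] == 0 and not dfs(node):
            return []
    order.reverse()  # Reverse post-order is a valid topological order
    return order

topological_sort_dfs({'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []})
```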

Core Pattern 5: Bipartite Graph Checking

A graph is bipartite if its vertices can be divided into two disjoint sets such that every edge connects a vertex from one set to a vertex from the other set (no edges within the same set). This is equivalent to the graph being 2-colorable. Problems about dividing a group into two conflicting teams or detecting odd-length cycles map here.

You can solve this using BFS or DFS. The algorithm is simple: start a traversal from any node, color it blue (0). Color all its neighbors the opposite color, red (1). Continue this process. If you ever find an edge connecting two nodes of the same color, the graph is not bipartite. You must perform this check for every connected component, as the graph may be disconnected.

from collections import deque

def is_bipartite(graph):
    color = {} # 0 for one color, 1 for the other
    for node in graph:
        if node not in color: # New component
            color[node] = 0
            queue = deque([node])
            while queue:
                curr = queue.popleft()
                for neighbor in graph[curr]:
                    if neighbor not in color:
                        color[neighbor] = 1 - color[curr] # Assign opposite color
                        queue.append(neighbor)
                    elif color[neighbor] == color[curr]:
                        return False # Conflict found
    return True

Common Pitfalls

  1. Incomplete Visited Tracking: Forgetting to mark a node as visited before adding it to the queue/stack can lead to duplicate processing and, in the worst case, infinite loops. The correct pattern is to mark the source node as visited, then mark each neighbor as visited the moment you discover it, before enqueuing/pushing it.
  2. Misapplying BFS for Weighted Graphs: BFS finds the shortest path in terms of number of edges, not the path with the minimum sum of edge weights. For weighted graphs, you must use algorithms like Dijkstra's or Bellman-Ford. A common trap is to see "shortest path" and immediately jump to BFS without checking for edge weights.
  3. Confusing Directed and Undirected Cycle Detection: Using the simple undirected graph method (checking for a neighbor that is visited and not the parent) on a directed graph will incorrectly flag many graphs as having cycles. Always use the three-state coloring method for directed cycle detection.
  4. Overlooking Disconnected Graphs: Many graph algorithms (like bipartite checking or connected component counting) must run on every component. Starting a single BFS/DFS from node 0 and assuming it covers the entire graph is a frequent error. Always iterate through all nodes, initiating a new traversal for any unvisited node.
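To illustrate the second pitfall, here is a minimal heap-based Dijkstra sketch for weighted graphs, assuming non-negative edge weights and an adjacency list mapping each node to (neighbor, weight) pairs:

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from start; graph maps node -> [(neighbor, weight)]."""
    dist = {start: 0}
    heap = [(0, start)]  # Min-heap of (distance, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float('inf')):
            continue  # Stale entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float('inf')):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

g = {'A': [('B', 1), ('C', 4)], 'B': [('C', 2)], 'C': []}
dijkstra(g, 'A')  # → {'A': 0, 'B': 1, 'C': 3}
```

Note how plain BFS would report the direct A → C edge (one hop) as "shortest," while Dijkstra correctly finds the cheaper two-hop route through B.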

Summary

  • Model the Problem: Your first task is to identify nodes (entities) and edges (relationships) within the problem statement. Choose an adjacency list for representation.
  • Pattern Recognition is Key: Map the problem goal to a core pattern: counting groups (Union-Find/DFS), shortest unweighted path (BFS), dependency ordering (Topological Sort), checking for contradictions or two-team divisions (Bipartite Check), or validating no circular dependencies (Cycle Detection).
  • DFS vs. BFS: Use DFS for exploring all possibilities, pathfinding, cycle detection, and topological sort. Use BFS for shortest path problems in unweighted graphs and level-order explorations.
  • State Management is Critical: Robust visited tracking and, where needed, distance tracking, coloring, or in-degree counting are what make these algorithms work correctly and efficiently.
  • Practice Implementation: The code skeletons for these patterns are highly consistent. Internalize them so you can adapt them quickly to a new problem's specific constraints and output requirements during an interview.
